
Microllm

My own implementation for running inference on local LLM models: just the bare basics to run on local hardware.

Currently working:

  • gguf.py: reads GGUF model files (see the sketch below)
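
For context, here is a minimal sketch of what parsing a GGUF file header involves. This is an independent illustration of the GGUF v2/v3 on-disk layout, not the actual code in gguf.py:

```python
import struct

def read_gguf_header(path):
    """Read the fixed GGUF header fields (GGUF v2/v3 layout)."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: {magic!r}")
        # All fields are little-endian: a uint32 format version,
        # then a uint64 tensor count and a uint64 metadata KV count.
        version, = struct.unpack("<I", f.read(4))
        n_tensors, = struct.unpack("<Q", f.read(8))
        n_kv, = struct.unpack("<Q", f.read(8))
    return version, n_tensors, n_kv
```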

Todo:

  • fix gguf_v2.py (currently buggy)
  • load the parsed tensors into a model
  • inference (see the sketch below)
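
Once tensors are loaded, inference reduces to a token-by-token decoding loop. A hypothetical greedy-decoding sketch, assuming a `model(tokens)` callable that returns next-token logits and a `tokenizer` with encode/decode methods (neither exists in this repo yet):

```python
import numpy as np

def generate(model, tokenizer, prompt, max_new_tokens=64, eos_id=2):
    """Greedy decoding: repeatedly append the highest-probability token."""
    tokens = tokenizer.encode(prompt)
    for _ in range(max_new_tokens):
        logits = model(tokens)            # hypothetical forward pass
        next_id = int(np.argmax(logits))  # greedy pick, no sampling
        if next_id == eos_id:             # eos_id depends on the model
            break
        tokens.append(next_id)
    return tokenizer.decode(tokens)
```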