
Microllm

My own implementation for running inference on local LLM models: just the bare basics to run on local hardware.

Currently working:

  • gguf.py: reads GGUF model files (see the sketch below)
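
For context, here is a minimal sketch of what parsing a GGUF file header involves. This is an independent illustration of the GGUF v2/v3 on-disk layout, not the actual code in gguf.py:

```python
import struct

def read_gguf_header(path):
    """Read the fixed GGUF header fields (GGUF v2/v3 layout)."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: {magic!r}")
        # All fields are little-endian: a uint32 format version,
        # then a uint64 tensor count and a uint64 metadata KV count.
        version, = struct.unpack("<I", f.read(4))
        n_tensors, = struct.unpack("<Q", f.read(8))
        n_kv, = struct.unpack("<Q", f.read(8))
    return version, n_tensors, n_kv
```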

Todo:

  • fix gguf_v2.py (currently buggy)
  • load the parsed tensors into a model
  • inference (see the sketch below)
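
Once tensors are loaded, inference reduces to a token-by-token decoding loop. A hypothetical greedy-decoding sketch, assuming a `model(tokens)` callable that returns next-token logits and a `tokenizer` with encode/decode methods (neither exists in this repo yet):

```python
import numpy as np

def generate(model, tokenizer, prompt, max_new_tokens=64, eos_id=2):
    """Greedy decoding: repeatedly append the highest-probability token."""
    tokens = tokenizer.encode(prompt)
    for _ in range(max_new_tokens):
        logits = model(tokens)            # hypothetical forward pass
        next_id = int(np.argmax(logits))  # greedy pick, no sampling
        if next_id == eos_id:             # eos_id depends on the model
            break
        tokens.append(next_id)
    return tokenizer.decode(tokens)
```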