
My own implementation for running inference on local LLMs

Primary language: Python · License: GNU Affero General Public License v3.0 (AGPL-3.0)

Microllm

Just the bare basics to run inference on local hardware.

Currently working:

  • gguf.py — now reads the entire GGUF file and returns the file offsets of the tensor data (see the sketch below).
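
For context, here is a minimal sketch of what that kind of GGUF header parsing involves, assuming the layout from the published GGUF v3 spec. It is not the repo's actual gguf.py; the names read_gguf, _read_value, etc. are made up for illustration.

```python
import struct

# Scalar GGUF metadata value types mapped to struct format codes
# (layout per the published GGUF v3 spec; type 8 = string, 9 = array).
_SCALARS = {0: "B", 1: "b", 2: "H", 3: "h", 4: "I", 5: "i",
            6: "f", 7: "?", 10: "Q", 11: "q", 12: "d"}

def _read(f, fmt):
    return struct.unpack(fmt, f.read(struct.calcsize(fmt)))[0]

def _read_string(f):
    n = _read(f, "<Q")  # uint64 length prefix, then UTF-8 bytes
    return f.read(n).decode("utf-8")

def _read_value(f, vtype):
    if vtype in _SCALARS:
        return _read(f, "<" + _SCALARS[vtype])
    if vtype == 8:       # string
        return _read_string(f)
    if vtype == 9:       # array: item type, count, then the items
        item_type, count = _read(f, "<I"), _read(f, "<Q")
        return [_read_value(f, item_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")

def read_gguf(path):
    """Parse a GGUF header; return (metadata, tensor table with absolute offsets)."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        version = _read(f, "<I")      # 3 for current files
        n_tensors = _read(f, "<Q")
        n_kv = _read(f, "<Q")

        metadata = {}
        for _ in range(n_kv):
            key = _read_string(f)
            metadata[key] = _read_value(f, _read(f, "<I"))

        tensors = []
        for _ in range(n_tensors):
            name = _read_string(f)
            n_dims = _read(f, "<I")
            dims = [_read(f, "<Q") for _ in range(n_dims)]
            dtype = _read(f, "<I")
            offset = _read(f, "<Q")   # relative to the tensor-data section
            tensors.append({"name": name, "dims": dims,
                            "dtype": dtype, "offset": offset})

        # The tensor-data section begins at the next aligned boundary after the header.
        alignment = metadata.get("general.alignment", 32)
        data_start = (f.tell() + alignment - 1) // alignment * alignment
        for t in tensors:
            t["file_offset"] = data_start + t["offset"]
    return metadata, tensors
```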

Todo:

  • load tensors into the model (one possible approach is sketched after this list)
  • inference
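
On the first todo item: once the parser has produced absolute file offsets, an unquantized tensor can be pulled in by mapping the right byte range. A rough sketch, assuming a float32 tensor and the tensor table from the parser above (load_f32_tensor is a hypothetical helper; quantized dtypes would need dequantization first):

```python
import numpy as np

def load_f32_tensor(path, tensor):
    """Memory-map one float32 tensor from a GGUF file.

    `tensor` is an entry from the table above, carrying "dims" and an
    absolute "file_offset". Returned flat; any reshape must respect
    GGML's fastest-varying-dimension-first ordering.
    """
    count = int(np.prod(tensor["dims"]))
    return np.memmap(path, dtype=np.float32, mode="r",
                     offset=tensor["file_offset"], shape=(count,))
```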