mini-language-model

Implements a Mamba SSM in a mini language model and trains it on the public-domain works of Sherlock Holmes. Also implements parallel adapters in a transformer, and includes code to run a quantized version of Mistral-7B.
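At the core of the Mamba SSM is a state-space recurrence scanned over the sequence. The sketch below is not this repository's code; it is a minimal NumPy illustration of a diagonal SSM recurrence with fixed parameters (real Mamba makes A, B, C input-dependent, i.e. "selective", and uses a parallel scan).

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Sequential scan of a diagonal state-space model.

    h_t = A * h_{t-1} + B * x_t   (elementwise; A is diagonal)
    y_t = C . h_t

    Mamba's selective SSM makes A, B, C functions of the input;
    they are fixed here to keep the sketch minimal.
    """
    h = np.zeros(A.shape[0])
    ys = []
    for x_t in x:
        h = A * h + B * x_t   # state update
        ys.append(C @ h)      # readout
    return np.array(ys)

# Toy run: scalar input sequence, 4-dim hidden state.
x = np.array([1.0, 0.0, 0.0])
A = np.full(4, 0.5)     # per-channel decay
B = np.ones(4)
C = np.ones(4) / 4
y = ssm_scan(x, A, B, C)  # impulse decays: [1.0, 0.5, 0.25]
```

An impulse input makes the decay visible: each step halves the state, so the output halves too.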

Primary language: Jupyter Notebook
