/mini-language-model

Implementation of a Mamba SSM-based mini language model, trained on the public-domain Sherlock Holmes stories. Also includes an implementation of parallel adapters in a transformer, and code to run a quantized version of Mistral-7B. Illustrative sketches of each component are shown below.
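
A minimal sketch of the Mamba-based mini language model, assuming the official `mamba-ssm` package (`pip install mamba-ssm`, CUDA required). The layer count, model dimension, and vocabulary size here are illustrative, not the repository's actual configuration; training on the Sherlock Holmes corpus would simply swap the random token ids for tokenized text.

```python
# Sketch of a mini LM built from Mamba selective-SSM blocks (assumed mamba-ssm package).
import torch
import torch.nn as nn
from mamba_ssm import Mamba


class MiniMambaLM(nn.Module):
    def __init__(self, vocab_size: int, d_model: int = 256, n_layers: int = 4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Pre-norm residual blocks, each wrapping one Mamba mixer.
        self.norms = nn.ModuleList(nn.LayerNorm(d_model) for _ in range(n_layers))
        self.blocks = nn.ModuleList(
            Mamba(d_model=d_model, d_state=16, d_conv=4, expand=2) for _ in range(n_layers)
        )
        self.norm_out = nn.LayerNorm(d_model)
        self.lm_head = nn.Linear(d_model, vocab_size, bias=False)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        x = self.embed(token_ids)                      # (batch, seq_len, d_model)
        for norm, block in zip(self.norms, self.blocks):
            x = x + block(norm(x))                     # residual around each Mamba block
        return self.lm_head(self.norm_out(x))          # (batch, seq_len, vocab_size)


# Example: next-token cross-entropy on a batch of token ids.
model = MiniMambaLM(vocab_size=96).to("cuda")
ids = torch.randint(0, 96, (8, 128), device="cuda")
logits = model(ids[:, :-1])
loss = nn.functional.cross_entropy(
    logits.reshape(-1, logits.size(-1)), ids[:, 1:].reshape(-1)
)
loss.backward()
```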
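
For the parallel adapters, the sketch below shows the general pattern: a small bottleneck branch runs in parallel with a frozen transformer sub-layer (here a feed-forward block), and its output is added to the sub-layer's output. Module names, the bottleneck size, and the choice of wrapped sub-layer are assumptions for illustration, not the repository's exact wiring.

```python
# Sketch of a parallel adapter wrapped around a frozen transformer sub-layer.
import torch
import torch.nn as nn


class ParallelAdapter(nn.Module):
    """Adds a small bottleneck branch in parallel with a frozen base sub-layer."""

    def __init__(self, base_layer: nn.Module, d_model: int, bottleneck: int = 64, scaling: float = 1.0):
        super().__init__()
        self.base_layer = base_layer
        for p in self.base_layer.parameters():     # freeze the pretrained weights
            p.requires_grad = False
        self.down = nn.Linear(d_model, bottleneck)  # down-projection
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, d_model)    # up-projection
        self.scaling = scaling
        nn.init.zeros_(self.up.weight)              # adapter starts as a no-op branch
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base path and adapter path both read the same input; outputs are summed.
        return self.base_layer(x) + self.scaling * self.up(self.act(self.down(x)))


# Example: wrap a frozen feed-forward block of a transformer layer.
d_model = 256
ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model))
adapted_ffn = ParallelAdapter(ffn, d_model=d_model, bottleneck=64)
x = torch.randn(2, 16, d_model)
print(adapted_ffn(x).shape)    # torch.Size([2, 16, 256])
```

Only the adapter's down/up projections are trainable, which is what makes this a parameter-efficient fine-tuning method.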
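
For quantized Mistral-7B inference, one illustrative path is 4-bit loading through Hugging Face `transformers` with `bitsandbytes`; the checkpoint name and generation settings below are assumptions, and the repository may instead use GPTQ or GGUF weights.

```python
# Sketch: load Mistral-7B in 4-bit (NF4) and generate a short completion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mistral-7B-Instruct-v0.1"   # assumed checkpoint name

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                   # 4-bit NF4 weight quantization
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",                   # spread layers across available devices
)

prompt = "Sherlock Holmes deduced that"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```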

Primary language: Jupyter Notebook
