Primary language: Jupyter Notebook. License: MIT.

Nucleus: Small Language Models

Nucleus is a small language model (SLM) built on Mistral's architecture. Yes, you read that right: it follows the Mistral design but has only about 1 billion parameters (1.13 billion by our count). Good SLMs already exist, such as RWKV (which, I believe, even has a roughly 300-million-parameter model that performs well), StableLM, TinyLlama and Phi, so it was a good time to bring a Mistral-based SLM to the table as well.
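To make a parameter count like the one above easier to check, here is a minimal sketch of how the parameters of a Mistral-style decoder can be tallied from its configuration. The `DecoderConfig` class and `count_parameters` function are hypothetical helpers written for this illustration, not part of this repository. The sketch assumes the usual Mistral layout (grouped-query attention without biases, a gated MLP with three projections, and RMSNorm weights). The Nucleus configuration is not listed here, so the example uses the publicly documented Mistral 7B values as a sanity check; for Nucleus, substitute the values from the `config.json` on the model card.

```python
from dataclasses import dataclass


@dataclass
class DecoderConfig:
    """Hypothetical container for the sizes that determine the parameter count."""
    vocab_size: int
    hidden_size: int
    intermediate_size: int
    num_layers: int
    num_heads: int
    num_kv_heads: int
    tie_embeddings: bool = False  # assumption: separate input and output embeddings


def count_parameters(cfg: DecoderConfig) -> int:
    """Count weights of a Mistral-style decoder (no biases assumed)."""
    head_dim = cfg.hidden_size // cfg.num_heads
    kv_dim = head_dim * cfg.num_kv_heads  # grouped-query attention shrinks K and V

    # Attention: Q and O are hidden x hidden, K and V are hidden x kv_dim.
    attention = 2 * cfg.hidden_size * cfg.hidden_size + 2 * cfg.hidden_size * kv_dim
    # Gated MLP: gate, up and down projections.
    mlp = 3 * cfg.hidden_size * cfg.intermediate_size
    # Two RMSNorm weight vectors per layer.
    norms = 2 * cfg.hidden_size
    per_layer = attention + mlp + norms

    embeddings = cfg.vocab_size * cfg.hidden_size
    lm_head = 0 if cfg.tie_embeddings else embeddings
    final_norm = cfg.hidden_size

    return embeddings + cfg.num_layers * per_layer + final_norm + lm_head


# Sanity check against the published Mistral 7B configuration.
MISTRAL_7B = DecoderConfig(
    vocab_size=32000,
    hidden_size=4096,
    intermediate_size=14336,
    num_layers=32,
    num_heads=32,
    num_kv_heads=8,
)

if __name__ == "__main__":
    total = count_parameters(MISTRAL_7B)
    print(f"{total:,} parameters ({total / 1e9:.2f}B)")
```

Running the same function with the Nucleus configuration should land close to the 1.13 billion figure if the assumptions above hold for that model.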

Of course, Nucleus is still in its alpha stage, but it has very good potential. If you have any questions or ideas, feel free to open an issue in this repository or in our repository on HuggingFace.

Notebook

Model: Nucleus 1B Alpha1
Notebook: Open In Colab

Reproducibility

All weights are available on HuggingFace, and the model can be reproduced with a proper Mistral training setup. We also suggest reading our model card.

Donations

We train and build these models with our own money. We need investment and funding to continue this work, but you know what is cooler than an angry investor? A good community. As residents of Iran, we are not able to provide PayPal links, but here are crypto wallets if you would like to show us your generosity.

USDT (TRC20)

TUPW5xX4NJeoBrvX6J9ipifXUyYcXQHGBr

USDT (ERC20)

0xf6c4e2929d25e652299102630e4E7A75dEc9aa5b

TRX (Tron)

TUPW5xX4NJeoBrvX6J9ipifXUyYcXQHGBr

ETH

0xf6c4e2929d25e652299102630e4E7A75dEc9aa5b

BTC

bc1qkwwslu9ses2m05cuhysdtlp2fa5l6r30zzv7jw