Extremely hacky implementation of Mixtral 8x7B
New: API access
Try a much faster implementation of this model at https://app.fireworks.ai/
What is it?
Mistral dropped the new MoE model this morning: https://twitter.com/MistralAI/status/1733150512395038967
This is an attempt to hack the original Llama codebase to load it. The implementation is very naive and slow.
You need 2 x 80Gb or 4 x 40Gb cards to load it.
Implementation:
- remove model parallelism for simplicity
- shard experts in the specified number of GPUs
- reverse engineer MoE implementation based on https://arxiv.org/pdf/2211.15841.pdf (seems vanilla MoE)
- ambiguity on the ordering of top_k and softmax in MoE. See https://twitter.com/dzhulgakov/status/1733330954348085439 for the discussion.
WARNING: There's no official reference model code. This implementation might be wrong. At least the generation looks coherent which is a good sign :)
Usage
Download the weights for Mixtral from HF or Torrent. HF is the easiest: https://huggingface.co/someone13574/mixtral-8x7b-32kseqlen/tree/main . Make sure you consolidate the weights
Run with 2 GPUs (~45GB required in each):
python example_text_completion.py path/to/mixtral/ path/to/mixtral/tokenizer.model
To run with 4 GPUs pass --num-gpus 4
.
Edit prompt in the example if needed.
Sample output
Mistral hallucinates about Mistral:
Mistral.ai is a company that
> provides a platform for building, training, and deploying AI models.
The platform offers a variety of tools and services that can help developers and data scientists build and train AI models.
Some of the key features of Mistral.ai's platform include:
- A drag-and-drop
==================================
Simply put, the theory of relativity states that
> 1) the laws of physics are the same for all observers in uniform motion relative to one another, and 2) the speed of light in a vacuum is the same for all observers, regardless of their relative motion or of the motion of the light source.
The first postulate, the principle of
==================================
A brief message congratulating the team on the launch:
Hi everyone,
I just
> wanted to say a big congratulations on the launch of your new website.
I think it looks fantastic and I am sure it will be a great success.
Well done everyone and keep up the good work.
Best wishes,
XXXX
==================================
Translate English to French:
sea otter => loutre de mer
peppermint => menthe poivrée
plush girafe => girafe peluche
cheese =>
> fromage
teddy bear => ourson en peluche
polar bear => ours polaire
cuddly panda => panda câlin
fluffy sheep => mouton fluffy
furry kitten => chaton poilu
fuzzy
==================================