A model fine-tuned from normistral-7b (warm) on a variety of Norwegian instruction datasets, with a context length of 2048. The name is derived from Mistral.
Trained on a single RTX 4090 for approximately 10 hours (2913 steps).
Uses the ChatML format, with training samples structured as:

```python
"messages": [
    {"role": "system", "content": system_message},
    {"role": "user", "content": sample["INSTRUCTION"]},
    {"role": "assistant", "content": sample["RESPONSE"]},
]
```
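For illustration, here is a minimal sketch of how such a message list is typically rendered into a ChatML prompt string before generation. This assumes the standard `<|im_start|>`/`<|im_end|>` ChatML markers; the helper function and the example messages are hypothetical, and the tokenizer's own chat template (e.g. via `tokenizer.apply_chat_template`) should be treated as authoritative.

```python
def to_chatml(messages):
    # Wrap each turn in the ChatML markers: <|im_start|>role\ncontent<|im_end|>
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )

messages = [
    {"role": "system", "content": "Du er en hjelpsom assistent."},  # "You are a helpful assistant."
    {"role": "user", "content": "Hva er hovedstaden i Norge?"},      # "What is the capital of Norway?"
]

# Append an opening assistant marker so the model continues as the assistant.
prompt = to_chatml(messages) + "<|im_start|>assistant\n"
```

The resulting `prompt` string is what would be tokenized and fed to the model for generation.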
See the simplified inference example in src/nordavind_inference.ipynb.