code-llama-32k

Run Code Llama with a 50k-token context using FlashAttention and BetterTransformer.

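A minimal sketch of this kind of setup, assuming the Hugging Face `transformers` library with the `flash-attn` package installed; the checkpoint name `codellama/CodeLlama-7b-hf` and the generation settings are assumptions for illustration, not taken from this repository's notebook:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint; the repo may target a different Code Llama variant.
model_id = "codellama/CodeLlama-7b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    # Dispatch attention to FlashAttention 2 kernels (requires flash-attn).
    attn_implementation="flash_attention_2",
    device_map="auto",
)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

FlashAttention and BetterTransformer (PyTorch's scaled-dot-product-attention fast path, exposed via `optimum`) are alternative attention backends; which one `transformers` uses here depends on the `attn_implementation` argument and the installed packages.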
Primary language: Jupyter Notebook
License: MIT
