Run Code Llama with a 50k-token context using FlashAttention and BetterTransformer
Primary language: Jupyter Notebook · License: MIT
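
Below is a minimal sketch (not the notebook's exact code) of one way to load Code Llama for long-context generation with FlashAttention 2 through Hugging Face `transformers`, with BetterTransformer noted as an alternative attention path. The checkpoint id, dtype, and generation settings are assumptions, not the repository's actual configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-7b-hf"  # assumed checkpoint; swap for the variant you use

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,                # half precision to fit a long context in GPU memory
    attn_implementation="flash_attention_2",  # requires the flash-attn package and a supported GPU
    device_map="auto",
)

# Alternative path: BetterTransformer (PyTorch SDPA kernels) via Optimum,
# used instead of flash-attn when that package is not installed:
# model = model.to_bettertransformer()

prompt = "def fibonacci(n):"  # in practice this would be a prompt of up to ~50k tokens
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Code Llama's RoPE base frequency is tuned for long inputs, so prompts well beyond the 16k training length can work with the setup above, provided the GPU has enough memory for the key-value cache.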