facebookresearch/lingua
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Python · BSD-3-Clause
Issues
How is data sampled?
#55 opened by macabdul9 - 8
mmlu evaluation not working
#28 opened by zhengyang-wang - 4
Val loss log
#29 opened by zhengyang-wang - 15
lm-eval-harness WikiText bug
#46 opened by akhauriyash - 1
Exporting trained models to vLLM?
#54 opened by ryoungj - 3
Grad-Norm spike on transformer depth change
#52 opened by akhauriyash - 3
Distributed Shampoo
#37 opened by Ryu1845 - 1
CLI metrics viz script
#50 opened by tginart - 6
mmlu benchmark scores
#5 opened by SeunghyunSEO - 2
Better Documentation on Resuming
#53 opened by Hprairie - 2
global depth init std factor seems incorrect
#39 opened by SeunghyunSEO - 2
Multi-Node Distributed Issues
#49 opened by Hprairie - 3
Loading from consolidated checkpoint
#36 opened by SpirinEgor - 1
Potential bug in main generate.py
#44 opened by akhauriyash - 1
Mamba config has extra argument
#45 opened by Hprairie - 5
Potential data-processing issue
#42 opened by akhauriyash - 2
Support for HuggingFace tokenizer
#32 opened by zhengyang-wang - 2
Initialize from pretrained checkpoints
#38 opened by ryoungj - 1
Overview of seeds used?
#30 opened by 152334H - 2
Got error while trying `float8`
#34 opened by tiendung - 14
Unable to run debug code
#6 opened by distributedstatemachine - 3
Where can I find an MTP generate block (self-speculative decoding) for learning purposes?
#17 opened by manmay-nakhashi - 5
lingua or torchtune or torchtitan?
#13 opened by youngsheen - 4
can't download tokenizer
#23 opened by ath3great - 2
Does this library include context parallelism to train long-context models?
#15 opened by ZetangForward - 2
A somewhat philosophical question: why this instead of the HF ecosystem around Trainer?
#7 opened by ViktorooReps - 2
License
#1 opened by fakerybakery - 2
More Config Examples
#3 opened by BeeGass - 1
THANK YOU FOR OPEN-SOURCING
#8 opened by iBibek