facebookresearch/lingua

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python · BSD-3-Clause

Issues

act checkpointing OOM, float8 causes CUDA memory allocation retries
#56 opened 2 months ago by Niccolo-Ajroldi
13
How is data sampled?
#55 opened 2 months ago by macabdul9
5
mmlu evaluation not working
#28 opened 2 months ago by zhengyang-wang
8
Val loss log
#29 opened 2 months ago by zhengyang-wang
4
lm-eval-harness WikiText bug
#46 opened 2 months ago by akhauriyash
15
Exporting trained models to vLLM?
#54 opened 2 months ago by ryoungj
1
Grad-Norm spike on transformer depth change
#52 opened 2 months ago by akhauriyash
3
Distributed Shampoo
#37 opened 2 months ago by Ryu1845
3
CLI metrics viz script
#50 opened 2 months ago by tginart
1
mmlu benchmark scores
#5 opened 3 months ago by SeunghyunSEO
6
Better Documentation on Resuming
#53 opened 2 months ago by Hprairie
2
global depth init std factor seems incorrect
#39 opened 2 months ago by SeunghyunSEO
2
Multi-Node Distributed Issues
#49 opened 2 months ago by Hprairie
2
Loading from consolidated checkpoint
#36 opened 2 months ago by SpirinEgor
3
Potential bug in main generate.py
#44 opened 2 months ago by akhauriyash
1
Mamba config has extra argument
#45 opened 2 months ago by Hprairie
1
Potential data-processing issue
#42 opened 2 months ago by akhauriyash
5
Support for HuggingFace tokenizer
#32 opened 2 months ago by zhengyang-wang
2
Initialize from pretrained checkpoints
#38 opened 2 months ago by ryoungj
2
Overview of seeds used?
#30 opened 3 months ago by 152334H
1
Got error while trying `float8`
#34 opened 3 months ago by tiendung
2
Unable to run debug code
#6 opened 3 months ago by distributedstatemachine
14
train.log is rank0 only but actual stdout is all ranks
#31 opened 3 months ago by 152334H
3
Where can I find an MTP generate block (self-speculative decoding) for learning purposes?
#17 opened 3 months ago by manmay-nakhashi
1
lingua or torchtune or torchtitan?
#13 opened 3 months ago by youngsheen
5
Failed to Build Wheels for xformers and Compatibility
#19 opened 3 months ago by nhtlongcs
4
can't download tokenizer
#23 opened 3 months ago by ath3great
1
Does this library include context parallelism for training long-context models?
#15 opened 3 months ago by ZetangForward
2
A somewhat philosophical question: Why this instead of the HF ecosystem around Trainer?
#7 opened 3 months ago by ViktorooReps
2
stool not working due to sinfo schema compatibility?
#10 opened 3 months ago by zkx06111
2
https://github.com/facebookresearch/lingua.git
#20 opened 3 months ago by Ivanlinsousa
1
License
#1 opened 3 months ago by fakerybakery
1
More Config Examples
#3 opened 3 months ago by BeeGass
2
Are there plans to open source the model weights of llama squared relu?
#11 opened 3 months ago by YixinSong-e
1
THANK YOU FOR OPEN-SOURCING
#8 opened 3 months ago by iBibek
1