bigscience-workshop/bigscience
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
Shell
NOASSERTION
Issues
Zero_Stage=1 results in higher TFLOPS?
#48 opened by lhb8125 - 0
How to get train-splits.txt and valid-splits.txt before training tr11-176B-ml
#66 opened by robinfang7 - 0
Files for bias evaluation
#68 opened by yu202147657 - 0
eval opt-175B
#63 opened by henan991201 - 5
Running Bloom
#52 opened by kamalkraj - 0
mC4 sampling & pre-processing
#61 opened by sbmaruf - 0
About training data for 1B3 models
#59 opened by misska1 - 0
Sharing the 1.3B-Pile@300B model
#46 opened by BlinkDL - 1
Wrong tokenizer path in big model config
#32 opened by RomanCast - 1
Is the 13B - unmodified Megatron gpt2 - baseline available? ( tr1-13B-base)
#21 opened by ViktorThink - 3
can you share the slurm.conf you are using?
#37 opened by OhadRubin - 1
make a back up for final training data
#34 opened by stas00 - 0