bigscience-workshop/bigscience

Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.

Shell · NOASSERTION

Issues

Why is deepspeed enabled in the Bloom training script?
#71 opened 7 months ago by robertLiuLinFeng
0
Zero_Stage=1 results in higher TFLOPS?
#48 opened 2 years ago by lhb8125
1
Why is 384 (12*2*16) the first point at which all pipeline stages are filled?
#70 opened a year ago by clinjie
0
Where can I download the training script for bloom-7b1?
#69 opened a year ago by robertLiuLinFeng
0
How to get train-splits.txt and valid-splits.txt before training tr11-176B-ml?
#66 opened a year ago by robinfang7
1
Files for bias evaluation
#68 opened a year ago by yu202147657
0
Where can we get a bloomz-7b1 finetuned checkpoint?
#67 opened a year ago by zhangyipin
0
eval opt-175B
#63 opened 2 years ago by henan991201
1
Running Bloom
#52 opened 2 years ago by kamalkraj
5
Requirements to perform inference over the BigScience Bloom model
#62 opened 2 years ago by celsofranssa
0
mC4 sampling & pre-processing
#61 opened 2 years ago by sbmaruf
1
What is the number of epochs of the final training?
#60 opened 2 years ago by cmsflash
0
About training data for 1B3 models
#59 opened 2 years ago by misska1
0
Sharing the 1.3B-Pile@300B model
#46 opened 2 years ago by BlinkDL
0
Wrong tokenizer path in big model config
#32 opened 2 years ago by RomanCast
1
Is the 13B (unmodified Megatron gpt2) baseline available? (tr1-13B-base)
#21 opened 3 years ago by ViktorThink
1
Can you share the slurm.conf you are using?
#37 opened 2 years ago by OhadRubin
3
Make a backup of the final training data
#34 opened 2 years ago by stas00
1
Fill in request for the second half of compute
#6 opened 3 years ago by sashavor
0