Examples of training models with hybrid parallelism using ColossalAI
connection failure
#207 opened by lhj-git - 0
cannot import name 'OPTForCausalLM'
#206 opened by upwindflys - 0
Outdated OPT example
#203 opened by larry-fuy - 2
It seems the pipeline parallel document is out of date (https://www.colossalai.org/docs/features/pipeline_parallel)
#195 opened by lambda7xx - 1
detr-debug pipelinable.py
#197 opened by lsx66 - 5
There may be a bug in train_gpt.py (https://github.com/hpcaitech/ColossalAI-Examples/blob/main/language/gpt/train_gpt.py)
#196 opened by lambda7xx - 1
Cannot find the gradient handler example
#171 opened by DarrenYing - 1
Too large training loss
#155 opened by qyc-98 - 1
ImportError running detr
#186 opened by LSC527 - 2
question about import model_zoo.gpt.gpt as col_gpt
#193 opened by lambda7xx - 1
ImportError: cannot import name 'colo_state_dict' from 'colossalai.utils.model.colo_init_context'
#175 opened by fuhengwu2021 - 3
BERT Data Preprocessing
#172 opened by JizeZhangCS - 5
[Compatibility] Running OPT using PyTorch 1.12 and Gemini placement_policy = 'cuda' failed
#166 opened by feifeibear - 3
wikiextractor raises BdbQuit
#108 opened by RenyunLi0116 - 2
Problem with saving model state dict
#156 opened by ouyangliqi - 2
no kernel image
#152 opened by qyc-98 - 7
Vision Transformer cifar10 bug
#134 opened by gaow0007 - 5
ZeRO without using shard_param
#133 opened by powermano - 3
Failed to run gpt2_3d example
#99 opened by FJRFrancio - 2
#126 opened by 480284856 - 0
Possible example: text to image or image to text
#107 opened by binmakeswell - 0
Provide relatively small model for ViT
#98 opened by Gy-Lu - 1
'RuntimeError: CUDA error: an illegal memory access was encountered' with large batch size of GPT2-example
#70 opened by Gy-Lu - 2
Provide an Example for Inference
#31 opened by FrankLeeeee - 0
Broken Link to Doc in ViT DP example
#30 opened by FrankLeeeee - 0
Python Exception when running BERT Examples
#46 opened by Wesley-Jzy - 1
Any plan for Swin Transformer?
#76 opened by bityangke - 1
Memory leakage in BERT example
#50 opened by ExtremeViscent - 1
ZeRO 2 configuration example
#71 opened by CHN-ChenYi - 2
failed to run gpt2 zero3 example
#69 opened by CHN-ChenYi - 1
#44 opened by Wesley-Jzy - 1
Overflow in GPT examples
#37 opened by feifeibear - 3
failed to run gpt example
#36 opened by feifeibear - 2
Invalid Import in GPT Example
#25 opened by FrankLeeeee - 2
Personal Dataset Preprocessing
#24 opened by Lobskodax - 2
PyTorch or TensorFlow
#21 opened by Aadyant12