Issues
MNIST single GPU example: GradScaler AssertionError
#168 opened by 152334H - 2
Optimizer compilation fails with PyTorch 2.2
#158 opened by rosario-purple - 4
MS-AMP crashes with DeepSpeed ZeRO 3
#130 opened by rationalism - 1
V0.4 Release Plan
#123 opened by cp5555 - 10
Does this actually work?
#178 opened by tsengalb99 - 4
Optimizer datatype
#170 opened by brianchmiel - 2
Questions about error reporting
#127 opened by Mrzhang-dada - 2
question about the paper
#125 opened by WeiSQ-zju - 3
Support for MS-AMP in FSDP
#122 opened by naveenkumarmarri - 2
Integration with PyTorch Lightning
#179 opened by schopra8 - 0
Why does using MS-AMP decrease throughput?
#175 opened by forevergj - 0
Clarification: do we need 20 or 16 bytes per parameter when training with Adam + Mixed precision
#173 opened by rodrigo-f-nogueira - 7
Please update obsolete dependencies
#129 opened by rosario-purple - 3
Optimized model seems slower than original
#172 opened by BitCircuit - 1
How can I export the model from PyTorch to ONNX?
#171 opened by 221588 - 7
Question: FP8 Allreduce
#111 opened by MARD1NO - 1
add topic tag mixed-precision
#164 opened by Beliavsky - 1
[Question] Is MS-AMP going to support ZeRO-2 + PP?
#154 opened by ohwi - 2
MS-AMP install from source
#133 opened by wpf19911118 - 2
Huggingface Accelerate Support
#128 opened by muellerzr - 2
Is MS-AMP reproducing the FP8-LM paper's results?
#147 opened by xrsrke - 2
Question about FP8 matmul coverage in FP8-LM
#146 opened by stakahashy - 7
Question: Is FP8-LM only supported on H100?
#116 opened by LSC527 - 3
NCCL building failed without specifying NVCC_GENCODE
#44 opened by tocean - 0
V0.3.0 Test Plan
#107 opened by tocean - 0
V0.3 Release Plan
#92 opened by cp5555 - 4
FP8 in tensor parallel region question
#119 opened by afcruzs - 2
FP8 in linear layer question
#118 opened by afcruzs - 2
Automatic Scaling in the code
#117 opened by afcruzs - 1
Question: Difficulty of FP8 + ZeRO
#108 opened by awgu - 1
Training curve datapoints or smoothing
#115 opened by afcruzs - 2
Question: does it work with Apple MPS?
#110 opened by edmondja - 0
Replace dist_op with fp8_op
#86 opened by tocean - 0
V0.2 Release Plan
#67 opened by cp5555 - 0
V0.2.0 Test Plan
#87 opened by tocean - 4
unit-test for multi-process training
#61 opened by wkcn - 0
Support FP8 ProcessGroup in pytorch
#50 opened by tocean - 0
[Bug] `LBOptimizer.all_reduce_grads` reduces gradients of only a model, even if training several models
#62 opened by wkcn - 0
V0.1.0 Test Plan
#51 opened by tocean - 2
Cannot run mnist_ddp.py when using PyTorch 1.14
#49 opened by tocean