V0.7.0 Test Plan
yukirora opened this issue · 0 comments
Test Cases
single-node test
Machine Type | #Node * #GPU * GPU Type | PyTorch Version | Accelerated Computing Toolkit | Status |
---|---|---|---|---|
ND A100 v4 | 1 * 8 * A100 40GB SXM | PyTorch 1.8 | CUDA 11.1 | Done |
NDm A100 v4 | 1 * 8 * A100 80GB SXM | PyTorch 1.8 | CUDA 11.1 | Done |
Hopper | 1 * 8 * H100 | PyTorch 1.x | CUDA 11.8 | Done |
single-node Micro-benchmark Test
- tensorrt-inference
  - Fix Transformers version to avoid Tensorrt-inference failure (#441)
- cublas-function/cudnn-function
- mem-bw
  - Add wait time option to resolve mem-bw unstable issue (#438)
SuperBench Improvement
Hopper GPU and FP8 related benchmarks
- docker building
  - Add CUDA11.8 Docker image for Nvidia arch90 GPUs (#449)
- micro-benchmark
- model-benchmark
  - Support FP8 in Bert model training (#446)
New in bug bash
- [x]
- [x]
multiple-node test
Test Table
Machine Type | #Node * #GPU * GPU Type | PyTorch Version | Accelerated Computing Toolkit | Status |
---|---|---|---|---|
NDm A100 v4 | 32 * 8 * A100 80GB SXM | PyTorch 1.8 | CUDA 11.1 | Done |
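The `#Node * #GPU * GPU Type` column also tells you how many GPUs each run covers. The snippet below is a minimal sketch of that arithmetic, not part of SuperBench: the helper name `total_gpus` is hypothetical, and the cell strings are copied from the tables in this plan.

```python
import re


def total_gpus(topology: str) -> int:
    """Return #Node * #GPU for a cell such as '32 * 8 * A100 80GB SXM'.

    Hypothetical helper for illustration only; it reads the first two
    integers separated by '*' and ignores the GPU type that follows.
    """
    match = re.match(r"\s*(\d+)\s*\*\s*(\d+)\s*\*", topology)
    if match is None:
        raise ValueError(f"unrecognized topology cell: {topology!r}")
    nodes, gpus_per_node = map(int, match.groups())
    return nodes * gpus_per_node


# Cells copied from the single-node and multiple-node test tables.
cells = [
    "1 * 8 * A100 40GB SXM",
    "1 * 8 * A100 80GB SXM",
    "1 * 8 * H100",
    "32 * 8 * A100 80GB SXM",
]

for cell in cells:
    print(f"{cell}: {total_gpus(cell)} GPUs")
```

Under this reading, the multiple-node run on NDm A100 v4 spans 256 GPUs, while each single-node run spans 8.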
distributed Micro-benchmark test
- ib-traffic
- nccl-bw
New in bug bash
- [x]
- [x]