# Fine-grained benchmark suites for DNN parallelization techniques
- Model: GPT2
- Dataset: wikitext-2-raw-v1
- GPUs: 2
- Per-device batch size: 4
- Epochs: 3
- Preprocessing workers: 8
- Additional params:
  - fp16: True
  - seed: 42
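The launch scripts themselves are not shown. As a minimal sketch, a `scripts/train.sh` matching the configuration above might wrap the Hugging Face `transformers` `run_clm.py` example with `torchrun`; the script name and flags below are assumptions, not taken from this repository:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of scripts/train.sh: GPT-2 causal-LM fine-tuning
# on 2 GPUs with plain DDP via torchrun. run_clm.py and its flags come
# from the Hugging Face transformers examples, assumed here.
torchrun --nproc_per_node=2 run_clm.py \
  --model_name_or_path gpt2 \
  --dataset_name wikitext \
  --dataset_config_name wikitext-2-raw-v1 \
  --per_device_train_batch_size 4 \
  --num_train_epochs 3 \
  --preprocessing_num_workers 8 \
  --fp16 \
  --seed 42 \
  --do_train \
  --output_dir ./out
```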
## `scripts/train.sh`

| Criterion | Value |
|---|---|
| Memory usage | 13575 MiB / 11988 MiB |
| Throughput | 3.00 it/s |
## `scripts/accel.sh`

| Criterion | Value |
|---|---|
| Memory usage | 15620 MiB / 15620 MiB |
| Throughput | 3.34 it/s |
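The per-device iteration rates above can be compared as end-to-end throughput by folding in the run configuration (2 GPUs × per-device batch size 4, so 8 samples per iteration); a small sketch of that arithmetic:

```python
# Convert benchmarked iteration rates (it/s) into global throughput
# (samples/s) using the run configuration from the header above.
GPUS = 2
PER_DEVICE_BATCH = 4
GLOBAL_BATCH = GPUS * PER_DEVICE_BATCH  # 8 samples processed per iteration

train_its = 3.00  # it/s measured for scripts/train.sh
accel_its = 3.34  # it/s measured for scripts/accel.sh

train_sps = train_its * GLOBAL_BATCH
accel_sps = accel_its * GLOBAL_BATCH
speedup = accel_its / train_its

print(f"train.sh: {train_sps:.1f} samples/s")
print(f"accel.sh: {accel_sps:.1f} samples/s")
print(f"speedup:  {speedup:.2f}x")
```

By this measure, `scripts/accel.sh` delivers roughly an 11% throughput improvement over `scripts/train.sh`.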