Azure/MS-AMP

Cannot run mnist_ddp.py when using PyTorch 1.14

tocean opened this issue · 2 comments

tocean commented

What's the issue, what's expected?:
Running mnist_ddp.py with PyTorch 1.14 fails. It is expected to run to completion.

How to reproduce it?:
Start a docker container

docker run -it -d --name=torch_test --privileged --net=host --ipc=host --gpus=all -v /:/data superbench/dev:cuda11.8 bash
docker exec -it torch_test bash

Make sure the PyTorch version inside the container is 1.14.
Install MS-AMP following the README.md and run the mnist_ddp.py example.
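
To confirm the version before running the example, something like the following could be used inside the container. This is only a minimal sketch: `matches_release` is a hypothetical helper written for this issue, not part of MS-AMP or PyTorch, and it assumes the version string starts with `major.minor` (pre-release or local suffixes are ignored).

```python
import re


def matches_release(version: str, major: int, minor: int) -> bool:
    """Return True if `version` starts with the given major.minor numbers.

    Hypothetical helper; anything after major.minor (patch number,
    pre-release tag, local build suffix) is ignored.
    """
    m = re.match(r"(\d+)\.(\d+)", version)
    if m is None:
        return False
    return (int(m.group(1)), int(m.group(2))) == (major, minor)


if __name__ == "__main__":
    try:
        import torch  # only needed for the actual check
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        # torch.__version__ is the installed version string
        print(torch.__version__, matches_release(torch.__version__, 1, 14))
```

If the second value printed is `False`, the container is not running the 1.14 release this report is about.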

Log message or snapshot?:
[image: screenshot of the failure]

Additional information:

wkcn commented

Related PR: #38

tocean commented

I verified that this issue is fixed in #38.