yang-song/score_sde_pytorch

Accelerate the Training Process

JiYuanFeng opened this issue · 2 comments

Hi Yang Song, thanks for your nice work.

I tried to reproduce the experiment "configs/subvp/cifar10_ncsnpp_continuous.py", running it on a single V100 with a batch size of 128. However, I found the training too slow: so far, 100K iterations have taken around 23 hours.

I want to ask whether an experiment with a larger batch size running on multiple GPUs can produce the same performance.
At your convenience, would you share the config for the multi-GPU CIFAR-10 experiment?

Sincere thanks for your help.

Yes, the performance shouldn't depend on the number of GPUs used for training. With a batch size of 128, the training should finish in about 3 days on 4 V100s.
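For anyone else estimating runtimes, here is a quick back-of-envelope check. It assumes a total training length of roughly 1.3M iterations (my recollection of the default `n_iters` in the released CIFAR-10 configs, not a number stated in this thread), and it lines up with the 3-day figure:

```python
# Back-of-envelope runtime check. The 1.3M total is an ASSUMPTION
# (my recollection of the default n_iters in the CIFAR-10 configs),
# not a number stated in this thread.
iters_per_hour = 100_000 / 23           # observed speed on a single V100
total_iters = 1_300_000                 # assumed total training length
single_gpu_days = total_iters / iters_per_hour / 24
four_gpu_days = single_gpu_days / 4     # assuming near-linear scaling
print(f"~{single_gpu_days:.1f} days on 1 V100, ~{four_gpu_days:.1f} days on 4 V100s")
# -> ~12.5 days on 1 V100, ~3.1 days on 4 V100s
```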

The original config was actually already designed for training on multiple GPUs.
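If it helps to see why the configured batch size is independent of the device count: as far as I can tell, the PyTorch code wraps the score model in `torch.nn.DataParallel`, so the batch of 128 is a single global batch scattered across all visible GPUs. Below is a minimal sketch of that pattern, with a toy model standing in for NCSN++ (this is not the repository's actual code):

```python
import torch
import torch.nn as nn

class TinyScoreModel(nn.Module):
    """Toy stand-in for the NCSN++ score network (not the repo's model)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(16, 3, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

model = TinyScoreModel()
if torch.cuda.is_available():
    # DataParallel scatters each incoming batch across all visible GPUs,
    # so the config's batch size of 128 stays one global batch no matter
    # how many devices share the work.
    model = nn.DataParallel(model).cuda()

batch = torch.randn(128, 3, 32, 32)      # one CIFAR-10-sized global batch
if torch.cuda.is_available():
    batch = batch.cuda()

out = model(batch)
print(out.shape)                          # torch.Size([128, 3, 32, 32])
```

In other words, adding GPUs only shortens the wall-clock time per iteration; the optimization trajectory (global batch size, number of iterations) is unchanged, which is why the final performance should not depend on the GPU count.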

Thank you for the information!