NVIDIA/mellotron

Training requires a large amount of memory

aijianiula0601 opened this issue · 3 comments

Thanks for your work!

My environment:
8x V100 GPUs, 32 GB of memory per GPU.

I train with 4 GPUs using multiproc, and it takes up almost 32 GB on each GPU. The batch_size is set to 32, not 4*32=128. Does it really need that much memory? Why not switch to DataParallel? Thank you!
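To make the arithmetic explicit, here is a minimal sketch (the n_gpus and batch_size values are just the ones from this report, not read from the repo's config): with the multiproc launcher each GPU runs its own process with its own batch, so the effective global batch is the per-GPU batch_size times the number of GPUs.

```python
# Minimal sketch with assumed values, not code from the repo.
n_gpus = 4          # hypothetical: number of GPUs used for training
batch_size = 32     # hypothetical: per-process batch size from hparams
effective_batch = batch_size * n_gpus
print(effective_batch)  # 128 samples per optimizer step across all GPUs
```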

@aijianiula0601
VRAM (GPGPUs) or System Memory (RAM)?


https://github.com/NVIDIA/mellotron/blob/master/distributed.py#L51

The multiproc is just a slightly modified version of DataParallel; memory usage should be the same as (or very close to) PyTorch's built-in version.
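For context, a minimal sketch of the gradient all-reduce idea that distributed wrappers like this commonly build on (an illustration only, not the repo's exact code; it assumes torch.distributed has already been initialized with init_process_group):

```python
import torch.distributed as dist

def allreduce_gradients(model, world_size):
    """Illustrative sketch: after backward(), sum each parameter's gradient
    across all processes and divide by the world size, so every replica
    steps with the same averaged gradient."""
    for param in model.parameters():
        if param.grad is not None:
            dist.all_reduce(param.grad.data, op=dist.ReduceOp.SUM)
            param.grad.data /= world_size
```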


I changed it to DataParallel. The batch size can be set to 128, but training is slow.
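For reference, a minimal sketch of the DataParallel change being described (the model here is a hypothetical stand-in, not Mellotron's Tacotron2): nn.DataParallel keeps a single process that scatters each 128-sample batch across the GPUs and gathers the outputs back every step, which is a likely source of the slowdown compared with one process per GPU.

```python
import torch
from torch import nn

model = nn.Linear(80, 80)  # hypothetical stand-in for the actual model
if torch.cuda.is_available() and torch.cuda.device_count() > 1:
    # Single process; inputs are scattered across GPUs, outputs gathered back.
    model = nn.DataParallel(model).cuda()
```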

With our implementation, try decreasing the batch size.
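A minimal sketch of what that might look like, assuming batch_size is defined in hparams.py as in Tacotron2-style repos (check the actual file for the real field name):

```python
# Hypothetical hparams fragment: lowering the per-GPU batch size reduces
# activation memory roughly linearly.
batch_size = 16  # down from 32
```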