airsplay/R2R-EnvDrop

Hardware requirements?

Closed this issue · 6 comments

Hi, thanks for your awesome paper and code.
I'm preparing to build on your work, and I'm wondering what hardware it requires and how long training takes?
Thanks!

Hmm, I have not tested the minimum hardware requirements, but I believe a GTX 1080 would be enough.

How about the training time?

Training the Speaker and the Agent finishes in around 6 hours. Agent + Speaker + EnvDrop converges more slowly and takes around 1 day.

Thank you very much :)

I train the Speaker and the Agent on two RTX 2080 Ti GPUs in parallel, and it takes 1 day for 80000 iterations. GPU utilization is only about 30-40%, since sampling data in RL takes time. Half of my CPU cores are idle and half of the memory is free during training.

Do you know how to improve training efficiency, and do you have any suggestions?
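For reference, the figures quoted above (80000 iterations in 1 day) can be turned into a per-iteration wall-clock time. This is only a back-of-the-envelope sketch: it assumes "1 day" means exactly 24 hours, and the variable names are illustrative.

```python
# Rough throughput from the numbers reported above.
# Assumption: "1 day" is taken as exactly 24 hours of wall-clock time.
SECONDS_PER_DAY = 24 * 60 * 60
iterations = 80_000

seconds_per_iteration = SECONDS_PER_DAY / iterations
iterations_per_hour = iterations / 24

print(f"{seconds_per_iteration:.2f} s per iteration")    # 1.08 s per iteration
print(f"{iterations_per_hour:.0f} iterations per hour")  # 3333 iterations per hour
```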

Yep, I am aware that the GPUs are not fully utilized, but I am sorry to say that I do not know the answer.

Previously, the speed bottleneck was CPU-GPU data transfer (it is quite slow from pageable memory), and I have already optimized that a lot. So I am not sure whether CPU-based sampling would speed it up.
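To illustrate the pageable-memory point above, here is a minimal PyTorch sketch of copying a batch through pinned (page-locked) host memory. It is not taken from this repository: the helper name `to_device` and the batch shape are hypothetical, and whether this resembles the optimization already made in the code is an assumption.

```python
import torch


def to_device(batch: torch.Tensor, device: torch.device) -> torch.Tensor:
    """Copy a CPU tensor to `device`, staging through pinned memory on CUDA."""
    if device.type == "cuda":
        # pin_memory() returns a copy in page-locked host memory;
        # non_blocking=True allows the host-to-device copy to be asynchronous
        # with respect to the host.
        return batch.pin_memory().to(device, non_blocking=True)
    # Without CUDA there is nothing to pin; fall back to a plain copy.
    return batch.to(device)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
features = torch.randn(64, 36, 2048)  # hypothetical batch shape
on_device = to_device(features, device)
print(on_device.device.type, tuple(on_device.shape))
```

Pinning has a cost of its own, so in a real training loop it may be preferable to reuse a pinned buffer or to let a `DataLoader` with `pin_memory=True` handle it, rather than pinning on every transfer.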