Questions about memory consumption.
sayez opened this issue · 5 comments
Thank you for your work.
I have a few questions about your paper (and also about DiffI2I) concerning the memory consumption of Stage 1 of your solution(s).
- What kind of GPUs, and how many, did you use in your experiments?
- How many GB of GPU memory do you use at each step (per GPU, if you distribute batches across several)?
- In the paper, you mention that the input of DIRformer for super-resolution is 64x64 (and thus as many tokens, thanks to OverlapPatchEmbed, which seems reasonable), whereas the input is 256x256 for inpainting. Doesn't the memory consumption "explode" with that many transformer tokens? (A rough token count is sketched below.)
Thanks in advance.
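For context, a minimal back-of-the-envelope sketch of the token counts in question, assuming OverlapPatchEmbed preserves spatial resolution via a stride-1 convolution (as in Restormer); `token_count` is an illustrative helper, not a function from the DiffIR codebase:

```python
# Back-of-the-envelope token count, assuming the patch embedding is a
# stride-1 3x3 convolution (as in Restormer), so every pixel is a token.

def token_count(height: int, width: int) -> int:
    """Spatial tokens after a stride-1 patch embedding: one per pixel."""
    return height * width

for size, task in [(64, "super-resolution"), (256, "inpainting")]:
    print(f"{task}: {size}x{size} input -> {token_count(size, size):,} tokens")

# super-resolution: 64x64 input -> 4,096 tokens
# inpainting: 256x256 input -> 65,536 tokens
#
# With vanilla spatial self-attention, the attention map is O(tokens^2):
# 65,536^2 is roughly 4.3e9 entries at 256x256, which would be prohibitive.
# If DIRformer follows Restormer-style transposed (channel-wise) attention,
# the attention map is channels x channels instead, so memory grows only
# linearly with the number of pixels.
```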
We use 8×V100 (32 GB). You can directly try to train DiffIR on inpainting for testing. DiffI2I has a lighter transformer structure, which consumes even less memory than DiffIR.
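For anyone who wants to verify the per-GPU footprint themselves, a minimal, self-contained sketch using standard PyTorch peak-memory counters; the toy model and dummy batch below are placeholders for the real DiffIR training step:

```python
import torch
import torch.nn as nn

# Placeholder model and batch; substitute the actual DiffIR model,
# data loader, and loss to measure the real per-GPU peak memory.
device = "cuda"
model = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1),
    nn.Conv2d(64, 3, 3, padding=1),
).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Reset the peak-memory counter, run one training step, then read the peak.
torch.cuda.reset_peak_memory_stats(device)

batch = torch.randn(8, 3, 256, 256, device=device)  # stand-in training batch
loss = model(batch).abs().mean()                    # stand-in loss
loss.backward()
optimizer.step()

peak_gb = torch.cuda.max_memory_allocated(device) / 1024**3
print(f"Peak allocated GPU memory this step: {peak_gb:.2f} GB")
```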
May I ask how long the full training of Stage 1 takes, in hours/days? (One million training steps, according to the papers.)
About a week.
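For a rough sense of what that implies (editor's arithmetic only, assuming one million steps over seven days on the 8×V100 setup above):

```python
# Rough throughput implied by ~1,000,000 steps in ~7 days.
# These are the numbers quoted above, not additional measurements.
steps = 1_000_000
seconds = 7 * 24 * 3600
print(f"~{steps / seconds:.2f} steps/s, i.e. ~{seconds / steps:.2f} s per step")
# -> ~1.65 steps/s, ~0.60 s per step
```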
Thank you!
Thank you for your excellent work. I would like to know where the code of DiffI2I is.