luissen/ESRT

About calculating FLOPs and GPU memory cost

Opened this issue · 1 comment

Thanks for your excellent work. I have trained with your released code and want to calculate the FLOPs of your model. Following the paper, I used 1280x720 to calculate FLOPs, but I ran out of memory. I also find that your model seems to consume a lot of GPU memory as the image size grows: for x4 SR I can only run inference on 256x256 images on a single Titan RTX GPU with 24 GB of memory. Counterintuitively, the model has only a single Efficient Transformer block but still costs this much memory. Why can't this model run inference on larger images?
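For reference, a minimal sketch of how one might profile FLOPs with `thop` (the issue doesn't name a tool, so `thop` and the stand-in model below are assumptions; swap in the real ESRT model from the released code). Note that SR papers usually report FLOPs for a 1280x720 *output*, which for x4 SR means a 320x180 LR *input*; feeding a full 1280x720 tensor as input is much heavier:

```python
import torch
import torch.nn as nn
from thop import profile  # pip install thop

# Stand-in model for illustration only; replace with the ESRT model
# built from the released code.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)

# LR input for a 1280x720 x4-SR output: 320x180 (tensor layout is N, C, H, W).
x = torch.randn(1, 3, 180, 320)

flops, params = profile(model, inputs=(x,))
print(f"FLOPs: {flops / 1e9:.2f} G, Params: {params / 1e6:.2f} M")
```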


Hey bro, in the EMHA module you use Q and K to compute the attention map `attn`.
When your image gets bigger, say (256, 256), the shape of `attn` gets bigger too: (batch_size, num_heads, N//4, N//4), where N is the number of patches. Since `attn` is quadratic in N, and N grows with H*W, it costs a lot of GPU memory on large images.
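To make that concrete, here is a back-of-envelope sketch of how the attention map alone scales with image size (per-pixel tokens N = H*W, a feature split s=4 matching the N//4 above, num_heads=8, and fp32 storage are illustrative assumptions, not necessarily the exact ESRT configuration):

```python
# Rough memory of the attention map attn with shape
# (batch, num_heads, N//s, N//s) -- quadratic in N = H*W.
def attn_map_gib(h, w, s=4, num_heads=8, bytes_per_elem=4):
    n = h * w            # number of patch tokens (one per pixel position)
    m = n // s           # tokens per split
    return num_heads * m * m * bytes_per_elem / 1024**3

for h, w in [(64, 64), (256, 256), (1280, 720)]:
    print(f"{h}x{w}: attn map ~{attn_map_gib(h, w):,.2f} GiB (batch=1, fp32)")
```

Under these assumptions, a 256x256 input already needs roughly 8 GiB just for `attn`, and a 1280x720 input would need on the order of a terabyte, which would explain why profiling at 1280x720 runs out of memory on a 24 GB card.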