Is inference possible with just a 10GB RTX 3080?
davidmartinrius opened this issue · 3 comments
Hello,
I know it is very little memory, but it is what I have for now.
By default, the demo code fails to run inference with a CUDA out-of-memory error. I tried reducing the inference batch size to just 1, but that is not enough.
Do you know a way to reduce memory consumption when running inference? (A sketch of the kind of thing I mean is below.)
I know the best solution is to upgrade the GPU to an RTX 3090/4090/A6000, but before that I would like to try another approach if possible.
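For context, a rough sketch of the generic PyTorch memory-saving knobs I am referring to; the `build_model` loader and the `generate` call are placeholders for whatever the demo code actually uses, not this repo's API:

```python
import torch

# Placeholders: substitute however the demo code builds the model and inputs.
model = build_model()  # hypothetical loader, not this repo's API
prompt = "a sample prompt"

model = model.to("cuda").eval()

# Disable gradient tracking so no activations are kept for backprop;
# this is usually the biggest single memory saving during inference.
with torch.inference_mode():
    # Run most ops in float16 via autocast, roughly halving activation
    # memory; the quality impact is untested here.
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        output = model.generate(prompt)  # hypothetical API, batch size 1

# Free cached allocator blocks between runs if memory fragments.
torch.cuda.empty_cache()
```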
Thank you!
David Martin Rius
The required VRAM is around 13GB for full-precision inference with a batch size of 1.
You can also try Colaboratory for inference: #10
@deepanwayx I suppose full inference precision means 32-bit, correct? If so, did you run any tests to check whether 16-bit would still deliver acceptable results?
Yes, the full inference precision is 32-bit. We did not test with 16-bit inference.
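For anyone who wants to experiment, a minimal sketch of 16-bit inference, assuming a standard PyTorch model; the loader and `generate` call are placeholders rather than this repo's actual API, and output quality at float16 is untested, as noted above:

```python
import torch

# Placeholder for however the demo code constructs the full-precision model.
model = build_model()  # hypothetical loader, not this repo's API

# Cast every parameter and buffer to float16, which roughly halves the
# VRAM needed for the weights and activations.
model = model.half().to("cuda").eval()

with torch.inference_mode():
    # Tensors fed to the model must also be float16 to match the cast weights.
    output = model.generate("a sample prompt")  # hypothetical API
```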