declare-lab/tango

Is inference possible with just a 10GB RTX 3080?

davidmartinrius opened this issue · 3 comments

Hello,

I know it is very little memory, but it is what I have for now.

By default, the demo code won't run inference because of a CUDA out-of-memory error. I tried reducing the inference batch size to just 1, but that is not enough.

Do you know a way to reduce memory consumption when running inference?

I know the best solution is to upgrade to an RTX 3090/4090/A6000, but before doing that I would like to try another way if possible.

Thank you!

David Martin Rius

The required VRAM is around 13GB for full-precision inference with a batch size of 1.
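
For reference, a minimal sketch of a batch-size-1 generation call that also reports peak VRAM, using the `Tango` wrapper shown in the repo's README (treat the exact `generate` arguments as assumptions):

```python
import torch
import soundfile as sf
from tango import Tango  # top-level wrapper from the repo's README

tango = Tango("declare-lab/tango")  # loads the full-precision checkpoint

torch.cuda.reset_peak_memory_stats()
prompt = "An audience cheering and clapping"
audio = tango.generate(prompt)  # a single prompt is effectively batch size 1
sf.write("output.wav", audio, samplerate=16000)

# Peak VRAM in GB; ~13 GB is expected at fp32 with batch size 1
print(f"Peak VRAM: {torch.cuda.max_memory_allocated() / 1024**3:.1f} GB")
```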

You can also try Colaboratory for inference: #10

@deepanwayx I suppose full inference precision means 32-bit, correct? If so, did you run any tests to check whether 16-bit would still deliver acceptable results?

Yes, the full inference precision is 32-bit. We did not test with 16-bit inference.
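
For anyone who wants to experiment with 16-bit inference on a 10GB card anyway, here is a minimal, untested sketch using PyTorch autocast. Whether the Tango pipeline tolerates fp16, and how much quality degrades, is exactly the open question above; the `Tango` wrapper and `generate` call are assumed from the README:

```python
import torch
import soundfile as sf
from tango import Tango  # top-level wrapper from the repo's README

tango = Tango("declare-lab/tango")

prompt = "An audience cheering and clapping"
# autocast runs matmuls/convolutions in fp16, which mainly reduces
# activation memory; the weights stay fp32 unless you additionally cast
# the underlying modules with .half(), which Tango may not expose directly
with torch.inference_mode(), torch.autocast(device_type="cuda", dtype=torch.float16):
    audio = tango.generate(prompt)

sf.write("output.wav", audio, samplerate=16000)
```

If activation savings alone are not enough to fit in 10GB, casting the model weights themselves to fp16 would be the next thing to try, at the risk of the quality loss discussed above.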