evo-design/evo

T4 GPUs support? Any recommendation

ggkngktrk opened this issue · 1 comment

Hello,

First and foremost, thank you for your commendable work and paper. I've been attempting to run Evo locally on T4 GPUs, but I ran into a problem: FlashAttn 2.0 does not support them yet. I have a few questions regarding this:

Do you have any plans to support T4 GPUs in the near future?
Will a single 16GB T4 GPU be sufficient for inference? If not, can we apply some optimizations (e.g. with DeepSpeed) to the Hugging Face models?
Is there a way to use FlashAttn 1.x versions, or can we disable Flash-Attn usage directly?
Is it possible to use float16 rather than bfloat16?

Thank you,

exnx commented

Unfortunately, there are no T4 implementations of Flash Attention 2 that we're aware of :(
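
For anyone checking their own hardware, here is a minimal sketch of the capability test behind this limitation. It assumes the publicly documented thresholds: FlashAttention-2 and native bfloat16 both target Ampere or newer GPUs (CUDA compute capability 8.0+), while a T4 is Turing (7.5). The helper name `pick_settings` and the returned keys are hypothetical, not part of Evo or flash-attn, and nothing in this thread confirms whether Evo runs correctly in float16 or without FlashAttention.

```python
# Sketch only: map a CUDA compute capability to a plausible attention/dtype choice.
# `pick_settings` and its dictionary keys are hypothetical names for illustration.

def pick_settings(major: int, minor: int) -> dict:
    """Return suggested settings for a GPU with compute capability major.minor."""
    # Ampere (8.0) and newer: FlashAttention-2 kernels and native bfloat16.
    ampere_or_newer = (major, minor) >= (8, 0)
    return {
        "flash_attn_2": ampere_or_newer,
        # Assumption: fall back to float16 on older GPUs such as the T4 (7.5).
        "dtype": "bfloat16" if ampere_or_newer else "float16",
    }


if __name__ == "__main__":
    try:
        import torch  # optional; only needed to query a real device
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        # get_device_capability returns a (major, minor) tuple
        print(pick_settings(*torch.cuda.get_device_capability(0)))
    else:
        print(pick_settings(7, 5))  # values for a T4
```

On a T4 this reports that FlashAttention-2 is unavailable and suggests float16; whether the model tolerates that dtype would still need to be tested.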