nshepperd/flash_attn_jax

Supporting varlen?

Opened this issue · 0 comments

Are there any plans to support a varlen (variable-length sequence) interface? It seems useful for batched generation, and pairing it with a KV cache prefetch scheme could further improve performance.
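
For reference, "varlen" here usually means packing sequences of different lengths into one buffer without padding and marking their boundaries with cumulative offsets, in the style of the `cu_seqlens` arguments to `flash_attn_varlen_func` in the upstream flash-attn package. The following is only a rough sketch of the semantics in plain JAX, not a fused kernel and not part of flash_attn_jax. The names `cu_seqlens_from_lengths` and `varlen_attention_reference` are hypothetical, and the `[total_tokens, heads, head_dim]` layout and non-causal masking are assumptions.

```python
import jax
import jax.numpy as jnp


def cu_seqlens_from_lengths(lengths):
    """Cumulative offsets [0, l0, l0+l1, ...] marking sequence boundaries
    in the packed buffer (hypothetical helper)."""
    lengths = jnp.asarray(lengths, dtype=jnp.int32)
    zero = jnp.zeros((1,), dtype=jnp.int32)
    return jnp.concatenate([zero, jnp.cumsum(lengths, dtype=jnp.int32)])


def varlen_attention_reference(q, k, v, cu_seqlens):
    """Unfused reference for packed (varlen) attention.

    Assumes q, k, v are packed as [total_tokens, heads, head_dim] and that
    tokens only attend to tokens of the same sequence (non-causal).
    """
    total = q.shape[0]
    # Map each packed token position to the index of its sequence.
    seg_ids = jnp.searchsorted(cu_seqlens, jnp.arange(total), side="right") - 1
    # True where query and key tokens belong to the same sequence.
    same_seq = seg_ids[:, None] == seg_ids[None, :]

    scale = q.shape[-1] ** -0.5
    scores = jnp.einsum("qhd,khd->hqk", q, k) * scale
    # Block attention across sequence boundaries.
    scores = jnp.where(same_seq[None, :, :], scores, -jnp.inf)
    probs = jax.nn.softmax(scores, axis=-1)
    return jnp.einsum("hqk,khd->qhd", probs, v)
```

With lengths `[3, 2]`, the packed buffer holds 5 tokens and the output for each sequence should match running ordinary attention on that sequence alone. A real varlen kernel would skip the masked blocks entirely rather than materializing the full score matrix as this sketch does.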