Is comet-llm capable of supporting Flash Attention 2
rajveer43 opened this issue · 1 comment
rajveer43 commented
Description
Flash Attention 2 is a library that provides attention operation kernels for faster and more memory-efficient inference and training.
References
jverre commented
Hi @rajveer43
The Comet LLM SDK is used to log prompts, responses and chains to the Comet platform so that users can easily keep track and review the performance of their LLM models.
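For context, a minimal sketch of what the SDK does: it logs a prompt/response pair (plus optional metadata) to Comet via `comet_llm.log_prompt`. The prompt, output, and metadata values below are illustrative placeholders, and the call is guarded so it only runs when a `COMET_API_KEY` is configured.

```python
import os

# Illustrative prompt/response pair that the Comet LLM SDK would log.
record = {
    "prompt": "Summarize Flash Attention 2 in one sentence.",
    "output": "Flash Attention 2 is a fast, memory-efficient attention kernel.",
    "metadata": {"model": "example-model"},  # hypothetical metadata
}

# Only contact Comet when credentials are configured; comet_llm.log_prompt
# records a single prompt/response pair on the Comet platform.
if os.environ.get("COMET_API_KEY"):
    import comet_llm

    comet_llm.log_prompt(
        prompt=record["prompt"],
        output=record["output"],
        metadata=record["metadata"],
    )
```

The point being that the SDK operates at the logging/observability layer, not inside the model's attention computation.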
I'm not sure I understand how we could integrate with Flash Attention 2. Can you provide a bit more context in terms of what the integration would entail?