[Feature] Correctness test for Triton kernels
ByronHsu commented
Motivation
The current tests for the Triton kernels are not ideal.
For extend attention, the test sits in `__main__`, which has two problems:
- It compares against prefill attention, which is itself a Triton kernel, so a bug shared by both kernels would go undetected
- It does not run in CI
For decode attention, there is no test at all.
Ideally, we should implement a PyTorch reference version under the test folder and compare the kernel output against it. For example: https://github.com/linkedin/Liger-Kernel/blob/63dd41b15e9f1c2957c817b771536d4ab7119322/test/transformers/test_rms_norm.py#L72
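A minimal sketch of what such a test could look like. The function name `decode_attention_ref` and the tensor layout (`q: [batch, heads, head_dim]`, KV cache: `[batch, heads, seq_len, head_dim]`) are assumptions for illustration, not the project's actual API; in the real test, `out` would come from the Triton decode kernel rather than from PyTorch's built-in SDPA:

```python
import torch


def decode_attention_ref(q, k_cache, v_cache):
    """Naive PyTorch reference for single-token (decode) attention.

    Hypothetical shapes, assumed for this sketch:
      q:       [batch, heads, head_dim]          (one new query token)
      k_cache: [batch, heads, seq_len, head_dim]
      v_cache: [batch, heads, seq_len, head_dim]
    """
    scale = q.shape[-1] ** -0.5
    # scores: [batch, heads, seq_len]
    scores = torch.einsum("bhd,bhsd->bhs", q, k_cache) * scale
    probs = torch.softmax(scores, dim=-1)
    # weighted sum of values: [batch, heads, head_dim]
    return torch.einsum("bhs,bhsd->bhd", probs, v_cache)


def test_decode_attention():
    torch.manual_seed(0)
    b, h, s, d = 2, 4, 16, 32
    q = torch.randn(b, h, d)
    k = torch.randn(b, h, s, d)
    v = torch.randn(b, h, s, d)
    # In the real test this would be the Triton decode kernel's output;
    # here we compare the reference against PyTorch's built-in SDPA.
    expected = torch.nn.functional.scaled_dot_product_attention(
        q.unsqueeze(2), k, v
    ).squeeze(2)
    out = decode_attention_ref(q, k, v)
    torch.testing.assert_close(out, expected, rtol=1e-4, atol=1e-4)
```

The point is that the reference is plain PyTorch, so it cannot share a bug with the Triton implementation, and a `test_*` function under the test folder is picked up by pytest in CI automatically.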