[Feature] Correctness test for Triton kernels
ByronHsu commented
Motivation
The current tests for the Triton kernels are not ideal.
For extend attention, the test sits in `__main__`, which has two problems:
- It compares against prefill attention, which is itself a Triton kernel, so a bug shared by both kernels would go undetected
- It does not run in CI
For decode attention, there is no test at all.
Ideally, we should implement a PyTorch reference version under the test folder and compare the kernel output against it. For example: https://github.com/linkedin/Liger-Kernel/blob/63dd41b15e9f1c2957c817b771536d4ab7119322/test/transformers/test_rms_norm.py#L72
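A minimal sketch of what such a test could look like. The function name `decode_attention_ref` and the tensor layout (`q: [batch, heads, head_dim]`, KV cache: `[batch, heads, seq_len, head_dim]`) are assumptions for illustration, not the project's actual API; in the real test, `out` would come from the Triton decode kernel rather than from PyTorch's built-in SDPA:

```python
import torch


def decode_attention_ref(q, k_cache, v_cache):
    """Naive PyTorch reference for single-token (decode) attention.

    Hypothetical shapes, assumed for this sketch:
      q:       [batch, heads, head_dim]          (one new query token)
      k_cache: [batch, heads, seq_len, head_dim]
      v_cache: [batch, heads, seq_len, head_dim]
    """
    scale = q.shape[-1] ** -0.5
    # scores: [batch, heads, seq_len]
    scores = torch.einsum("bhd,bhsd->bhs", q, k_cache) * scale
    probs = torch.softmax(scores, dim=-1)
    # weighted sum of values: [batch, heads, head_dim]
    return torch.einsum("bhs,bhsd->bhd", probs, v_cache)


def test_decode_attention():
    torch.manual_seed(0)
    b, h, s, d = 2, 4, 16, 32
    q = torch.randn(b, h, d)
    k = torch.randn(b, h, s, d)
    v = torch.randn(b, h, s, d)
    # In the real test this would be the Triton decode kernel's output;
    # here we compare the reference against PyTorch's built-in SDPA.
    expected = torch.nn.functional.scaled_dot_product_attention(
        q.unsqueeze(2), k, v
    ).squeeze(2)
    out = decode_attention_ref(q, k, v)
    torch.testing.assert_close(out, expected, rtol=1e-4, atol=1e-4)
```

The point is that the reference is plain PyTorch, so it cannot share a bug with the Triton implementation, and a `test_*` function under the test folder is picked up by pytest in CI automatically.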