sgl-project/sglang

[Feature] Correctness test for Triton kernels


Motivation

The current tests for the Triton kernels are not ideal.

For extend attention, the test sits under `__main__`, and there are two problems:

  1. It compares against the prefill attention kernel, which is also a Triton kernel, so a bug shared by both implementations would go undetected.
  2. It is not run in CI.

For decode attention, there is no test at all.

Ideally, we should implement a PyTorch reference version and compare the kernel's output against it in a test under the test folder. For example: https://github.com/linkedin/Liger-Kernel/blob/63dd41b15e9f1c2957c817b771536d4ab7119322/test/transformers/test_rms_norm.py#L72
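As a rough illustration of the proposed pattern (not the actual sglang kernels or test harness): write a naive, obviously-correct reference implementation, then assert that the optimized implementation matches it within a tolerance. The sketch below uses NumPy in place of Triton/PyTorch, and the online-softmax function stands in for the fused attention kernel under test; all names here are hypothetical.

```python
import numpy as np


def ref_attention(q, k, v):
    """Naive reference: softmax(q @ k.T / sqrt(d)) @ v, computed directly."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize softmax
    p = np.exp(scores)
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v


def online_softmax_attention(q, k, v, block=4):
    """Flash-attention-style streaming pass over K/V blocks.

    Stands in for the optimized kernel under test: it never materializes
    the full score matrix, keeping running max (m) and normalizer (l).
    """
    n_q, d = q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.zeros((n_q, v.shape[-1]))
    m = np.full((n_q, 1), -np.inf)  # running row max
    l = np.zeros((n_q, 1))          # running softmax denominator
    for s in range(0, k.shape[0], block):
        kb, vb = k[s:s + block], v[s:s + block]
        scores = q @ kb.T * scale
        m_new = np.maximum(m, scores.max(axis=-1, keepdims=True))
        alpha = np.exp(m - m_new)   # rescale previous accumulators
        p = np.exp(scores - m_new)
        out = out * alpha + p @ vb
        l = l * alpha + p.sum(axis=-1, keepdims=True)
        m = m_new
    return out / l


# Correctness test: optimized path must match the naive reference.
rng = np.random.default_rng(0)
q = rng.standard_normal((5, 8))
k = rng.standard_normal((13, 8))
v = rng.standard_normal((13, 8))
assert np.allclose(ref_attention(q, k, v),
                   online_softmax_attention(q, k, v), atol=1e-6)
```

Because the reference is independent of the optimized code path, a bug in the kernel cannot cancel out, unlike comparing the extend kernel against the prefill kernel. In a real test this would be a pytest function parameterized over shapes and dtypes, with the reference in PyTorch and the candidate being the Triton kernel.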

Related resources

No response