Tests and Performance Tests
Closed · 1 comment
michaelfeil commented
Some ideas for testing:
- It would be great to add some functional tests, e.g. generating random ternary inputs (values in {-1, 0, 1}) over a range of shapes and verifying the result. I'd assume that, due to the packing, the range of supported shapes might have some limitations (e.g. a dimension must be a multiple of 4 before packing).
- It would be great to check whether it is actually faster in all of these cases than
torch.tensor(A) @ B
-- that might require some additional options for @triton.autotune
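To make the functional-test idea above concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration, not the project's actual code: the 2-bit layout (four ternary values per byte, which is one way the "multiple of 4" constraint could arise), the choice to pack along K, and the names `pack_ternary`, `unpack_ternary`, `stand_in_kernel` and `check_random_shapes` are all hypothetical. The real kernel would be passed in place of `stand_in_kernel`.

```python
import random

VALUES_PER_BYTE = 4  # assumption: 2 bits per ternary value, 4 values per byte


def pack_ternary(row):
    """Pack values from {-1, 0, 1} into bytes (hypothetical 2-bit layout)."""
    if len(row) % VALUES_PER_BYTE != 0:
        raise ValueError("row length must be a multiple of 4 before packing")
    packed = bytearray()
    for i in range(0, len(row), VALUES_PER_BYTE):
        byte = 0
        for j, v in enumerate(row[i:i + VALUES_PER_BYTE]):
            byte |= (v + 1) << (2 * j)  # map -1, 0, 1 to 0, 1, 2
        packed.append(byte)
    return bytes(packed)


def unpack_ternary(packed):
    """Inverse of pack_ternary."""
    return [((byte >> (2 * j)) & 0b11) - 1
            for byte in packed
            for j in range(VALUES_PER_BYTE)]


def stand_in_kernel(x, packed_w_t):
    """Hypothetical stand-in for the kernel under test: unpack, then multiply.

    x is an M x K list of lists; packed_w_t holds N packed rows of length K.
    """
    w_t = [unpack_ternary(r) for r in packed_w_t]
    return [[sum(a * b for a, b in zip(row, col)) for col in w_t] for row in x]


def check_random_shapes(kernel, trials=20, seed=0):
    """Compare `kernel` against a plain reference on random shapes."""
    rng = random.Random(seed)
    for _ in range(trials):
        m, n = rng.randint(1, 5), rng.randint(1, 5)
        k = VALUES_PER_BYTE * rng.randint(1, 4)  # keep K a multiple of 4
        x = [[rng.randint(-8, 8) for _ in range(k)] for _ in range(m)]
        w_t = [[rng.choice((-1, 0, 1)) for _ in range(k)] for _ in range(n)]
        expected = [[sum(x[i][p] * w_t[j][p] for p in range(k))
                     for j in range(n)] for i in range(m)]
        got = kernel(x, [pack_ternary(r) for r in w_t])
        assert got == expected, f"mismatch for M={m}, K={k}, N={n}"
    return trials


if __name__ == "__main__":
    print(check_random_shapes(stand_in_kernel), "random shapes passed")
```

Integer inputs keep the comparison exact; with floating-point tensors a tolerance-based check would be needed instead. For the speed comparison against the dense matmul, `triton.testing.do_bench` is probably a better timer on GPU than wall-clock timing, since kernel launches are asynchronous.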
Bonus question: I assume the L2 cache optimizations work like in this example?
https://triton-lang.org/main/getting-started/tutorials/03-matrix-multiplication.html#l2-cache-optimizations
mlinmg commented
We have a test file we're planning to release pretty soon.
At the moment we're still in a beta phase, and we're running benchmarks during development to find the best configurations.
As for the L2 hit-rate optimization, you're right!
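For readers who have not seen the linked tutorial, the grouped program ordering it describes can be sketched in plain Python. This is an illustration of the tutorial's index arithmetic only, not this project's kernel; the function names are hypothetical, and `blocks_loaded` is a simplified counting model that assumes every distinct row or column of output tiles costs one pass over K.

```python
def row_major_tile(pid, num_pid_n):
    """Plain row-major mapping from program id to an output tile (pid_m, pid_n)."""
    return pid // num_pid_n, pid % num_pid_n


def grouped_tile(pid, num_pid_m, num_pid_n, group_size_m):
    """Grouped mapping: programs walk down `group_size_m` rows before moving
    to the next column, so consecutive programs reuse the same input blocks."""
    num_pid_in_group = group_size_m * num_pid_n
    group_id = pid // num_pid_in_group
    first_pid_m = group_id * group_size_m
    # The last group may be shorter if num_pid_m is not a multiple of the group size.
    rows_in_group = min(num_pid_m - first_pid_m, group_size_m)
    pid_in_group = pid % num_pid_in_group
    pid_m = first_pid_m + pid_in_group % rows_in_group
    pid_n = pid_in_group // rows_in_group
    return pid_m, pid_n


def blocks_loaded(tiles, num_k_blocks):
    """Count distinct input blocks touched by a set of output tiles (simplified model)."""
    rows = {m for m, _ in tiles}
    cols = {n for _, n in tiles}
    return (len(rows) + len(cols)) * num_k_blocks


if __name__ == "__main__":
    # 9 x 9 grid of output tiles, 9 blocks along K, first 9 programs.
    rm = [row_major_tile(p, 9) for p in range(9)]
    gr = [grouped_tile(p, 9, 9, 3) for p in range(9)]
    print(blocks_loaded(rm, 9), blocks_loaded(gr, 9))  # prints "90 54"
```

In this model the first nine programs touch 90 input blocks in row-major order but only 54 with a group size of 3, which matches the figures given in the tutorial. The best group size is hardware dependent, so it is a natural candidate for the autotune search.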