HazyResearch/ThunderKittens

Tile primitives for speedy kernels

Cuda · MIT

Issues

import thunderkittens error
#74 opened 17 days ago by klxy0304
0
ERR_NVGPUCTRPERM error when trying to profile
#73 opened 17 days ago by alxndrTL
1
Docs alarm
#68 opened a month ago by alexdremov
1
[Question] May I ask if there is a performance comparison for flash_attention 3?
#61 opened a month ago by gzy19990617
1
[Feature Request] GEMM benchmarks and FP8 Support
#23 opened a month ago by jwfromm
8
ThunderKittens/tests/python/README.md incomplete
#71 opened a month ago by lucifer1004
0
Load with ldmatrix
#27 opened a month ago by liyanc
3
Could you provide a gemm kernel?
#55 opened 2 months ago by ziyuhuang123
1
c++20 does not work?
#45 opened 5 months ago by ziyuhuang123
2
Could you provide a valid mirror?
#57 opened 3 months ago by ziyuhuang123
1
cannot find -lcuda: No such file or directory
#56 opened 3 months ago by ziyuhuang123
0
h100.cu(97): error: "wait" is ambiguous
#54 opened 3 months ago by ziyuhuang123
1
When will ThunderKittens support AMD GPUs, specifically the W7900?
#50 opened 4 months ago by lahmuller
0
Confusing Comment in rt.cuh
#48 opened 4 months ago by KAOZUOI
0
[bug report][4090 attn] cudaCheckError(): too many resources requested for launch
#37 opened 7 months ago by kexve
1
Support for global load/store padding
#44 opened 5 months ago by Hprairie
0
Template error
#43 opened 5 months ago by Hprairie
1
Cross-GPU portability
#42 opened 6 months ago by janEbert
0
Is it possible to support non-contiguous input tensor ?
#41 opened 6 months ago by ProHuper
0
Support `softmax_scale` and `dropout` options for fwd_attend_ker_dim ?
#39 opened 6 months ago by ProHuper
0
Error running make
#40 opened 6 months ago by BurhanUlTayyab
0
Support `softmax_scale` and `dropout` options for fwd_attend_ker_dim ?
#38 opened 6 months ago by ProHuper
0
add support for a100 atten
#31 opened 7 months ago by MichoChan
0
attn_bias rel-pos support to the FAv2 example
#32 opened 7 months ago by vadimkantorov
1
[Question] Supported compute capabilities?
#21 opened 7 months ago by bayley
3
Support for TPUs?
#35 opened 7 months ago by jaanli
0
[bug report] h100 attn_causal kernel
#33 opened 7 months ago by xiayuqing0622
3
Add support for head dimension 128
#26 opened 7 months ago by perklet
4
why there is no zero(attn) before compute q@k.t in h100 example?
#30 opened 7 months ago by xiayuqing0622
2
Two questions
#25 opened 7 months ago by dongrixinyu
1
unable to reproduce attn_causal speeds
#22 opened 7 months ago by 152334H
3