Aaryan0404/CUDA
Accelerating inference and training for transformer-based models by building CUDA kernels that saturate the memory bandwidth and arithmetic throughput of Hopper H100 GPUs.
Cuda