All about CUDA programming, profiling, and optimization.
This repo is inspired by and built upon nvidia-performance-tools, but I will continue to expand from there.
- CUDA. My version is 11.7, but any recent version should work.
- Nsight Compute and Nsight Systems. Both should ship with the CUDA installation.
- argparse: it seems to require gcc 8.1+, since my gcc 7.5 lacks the `<charconv>` header at compile time.
- CPU baseline
- boilerplate for checking correctness and argparse
- GPU baseline
- GPU shared memory baseline
- GPU thread coarsening
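
As a rough illustration of the first two items, here is a minimal host-side sketch of a CPU baseline and a correctness check. It assumes, purely for illustration, that the workload is a row-major single-precision matrix multiplication; the list above does not name the kernel. The names `cpu_matmul` and `all_close` and the default tolerances are hypothetical, not taken from this repo.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (assumed workload): C = A * B with row-major
// storage, where A is m x k, B is k x n, and C is m x n.
std::vector<float> cpu_matmul(const std::vector<float>& a,
                              const std::vector<float>& b,
                              std::size_t m, std::size_t k, std::size_t n) {
    assert(a.size() == m * k && b.size() == k * n);
    std::vector<float> c(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t p = 0; p < k; ++p) {
            const float a_ip = a[i * k + p];
            // Innermost loop walks a row of B and a row of C contiguously.
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += a_ip * b[p * n + j];
            }
        }
    }
    return c;
}

// Hypothetical correctness check: returns true when every element of `got`
// is within atol + rtol * |want| of the reference. The default tolerances
// are illustrative assumptions, not values used by this repo.
bool all_close(const std::vector<float>& got, const std::vector<float>& want,
               float rtol = 1e-4f, float atol = 1e-5f) {
    if (got.size() != want.size()) return false;
    for (std::size_t i = 0; i < got.size(); ++i) {
        const float diff = std::fabs(got[i] - want[i]);
        // Written as a negated <= so that NaN values fail the check.
        if (!(diff <= atol + rtol * std::fabs(want[i]))) return false;
    }
    return true;
}
```

In this sketch, each GPU variant would copy its result back to the host and pass it to `all_close` together with the output of `cpu_matmul`. A tolerance-based comparison is used instead of exact equality because a GPU kernel may accumulate the sums in a different order than the CPU loop.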