DND is a simple tool for profiling and tracing PyTorch programs on NVIDIA GPUs. It uses NVIDIA's Nsight Systems and Nsight Compute alongside PyTorch 2.0's Dynamo graph-capture system to provide breakdowns of a deep learning application at both the framework-operator level and the CUDA-kernel level. Check out the Sample App for an example of an instrumented application.
$ pip install dndlprof
$ dnd -- python -m dnd.sample_app
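
For readers unfamiliar with Dynamo graph capture, the sketch below shows the kind of framework-operator information it exposes. It is not DND's implementation and does not use DND's API: it only relies on the public `torch.compile` custom-backend hook, and the names `recording_backend` and `captured_ops` are hypothetical, chosen for illustration. It runs on CPU, so no GPU or Nsight installation is assumed.

```python
import torch
import torch.nn as nn

# Hypothetical container for the operator names seen during graph capture.
captured_ops = []


def recording_backend(gm: torch.fx.GraphModule, example_inputs):
    """Illustrative Dynamo backend (an assumption, not part of DND).

    Dynamo hands each captured FX graph to the backend. Here we record the
    target of every call node and then return the graph's own forward, so
    the model executes unmodified.
    """
    for node in gm.graph.nodes:
        if node.op in ("call_function", "call_method", "call_module"):
            captured_ops.append(str(node.target))
    return gm.forward


# A tiny model, just large enough to produce a few operators.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
compiled = torch.compile(model, backend=recording_backend)

# Graph capture happens lazily, on the first call.
out = compiled(torch.randn(2, 8))

print(out.shape)
print(captured_ops)
```

The exact operator names printed depend on the installed PyTorch version, so none are listed here. A profiler built on this mechanism could correlate such operator records with the CUDA kernels they launch, which is the kind of two-level breakdown described above.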