How to set ai_threshold
Closed this issue · 2 comments
aravindhank11 commented
I am trying to figure out what value is appropriate for ai_threshold. Each kernel has a different knee point, so I wanted to understand what value should be set for the model as a whole.
fotstrt commented
Hi! In our profiled models, for a fixed GPU type, the knee point of each kernel, as reported by the roofline model in the Nsight Compute tool (https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html#roofline), is very similar across all the kernels we examined, so we simply used the average value. Also, kernels that are clearly compute- or memory-intensive have an arithmetic intensity far from the knee point, so classifying them is straightforward. I hope that helps!
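
As a rough illustration of the approach described above, here is a minimal sketch that averages per-kernel knee points into a single ai_threshold and then labels kernels by comparing their arithmetic intensity against it. The function names, the sample numbers, the label strings, and the `>=` comparison are all assumptions made for this example; they are not taken from the project's code or from real profiling output.

```python
from statistics import mean


def average_knee_point(knee_points):
    """Average the per-kernel knee points (FLOP/byte) read off the roofline chart."""
    return mean(knee_points)


def classify_kernel(arithmetic_intensity, ai_threshold):
    """Label a kernel by comparing its arithmetic intensity with the threshold.

    Treating intensity >= threshold as compute-bound is an assumption of
    this sketch, not a documented rule.
    """
    if arithmetic_intensity >= ai_threshold:
        return "compute-bound"
    return "memory-bound"


# Hypothetical knee points for three kernels profiled on the same GPU type.
knee_points = [9.5, 10.0, 10.5]
ai_threshold = average_knee_point(knee_points)

# Hypothetical arithmetic intensities (FLOP/byte) for two kernels.
kernels = {"gemm_like": 85.0, "elementwise_like": 0.25}
labels = {name: classify_kernel(ai, ai_threshold) for name, ai in kernels.items()}

for name, label in labels.items():
    print(f"{name}: {label}")
```

Because both sample intensities sit far from the averaged threshold, small differences between the per-kernel knee points would not change either label, which is the point made in the comment above.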
aravindhank11 commented
Thank you! This helps :)