Dictionary lookups fail when runtime kernels are absent in the trace
anupambhatnagar opened this issue · 2 comments
Describe the bug
The dictionary lookups fail when the expected kernels are not present in the trace.
HolisticTraceAnalysis/hta/analyzers/trace_counters.py
Lines 25 to 27 in 0c2146a
HolisticTraceAnalysis/hta/analyzers/cuda_kernel_analysis.py
Lines 323 to 325 in 0c2146a
Steps to reproduce
Use a trace without the cudaMemsetAsync kernel.
Expected behavior
The lookups should work in all cases. The fix is to use .get with a default value of None.
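To illustrate the proposed fix, here is a minimal sketch of the difference between indexing a dictionary and calling .get on it. The mapping and the helper names below are hypothetical and are not the actual tables or functions in HTA.

```python
# Hypothetical mapping from runtime call name to an index.
# This is an illustration only, not the table HTA builds from a trace.
runtime_name_to_index = {"cudaLaunchKernel": 0, "cudaMemcpyAsync": 1}


def lookup_strict(name):
    # Plain indexing raises KeyError when the name never appeared in the trace.
    return runtime_name_to_index[name]


def lookup_safe(name):
    # .get returns None (its default) instead of raising for a missing name.
    return runtime_name_to_index.get(name)


try:
    lookup_strict("cudaMemsetAsync")
    strict_failed = False
except KeyError:
    strict_failed = True
```

With .get, the caller has to handle a possible None before using the result, so code downstream of the lookup would likely need a matching check.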
Environment
Fails on both macOS and Linux with HTA 0.1.2 and Python >= 3.8.
Additional Info
No response
+1 on this issue. I am not sure whether the profiler has to be configured a certain way.
For reference I am trying to analyze a trace for inference with no ranks. I manually added:
"distributedInfo": {"rank": 0},
to the JSON trace. It would be nice to have a mode that enables single-file analysis.
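The manual edit described above could be scripted roughly as follows. This is a sketch that assumes the trace is an uncompressed JSON file whose top level is an object; the function name add_rank is hypothetical.

```python
import json


def add_rank(trace_path, rank=0):
    """Add "distributedInfo": {"rank": rank} to a JSON trace if it is missing."""
    with open(trace_path) as f:
        trace = json.load(f)

    # Keep any existing distributedInfo or rank; only fill in what is absent.
    info = trace.setdefault("distributedInfo", {})
    info.setdefault("rank", rank)

    with open(trace_path, "w") as f:
        json.dump(trace, f)
    return trace
```

The file is rewritten in place, so it may be worth keeping a copy of the original trace.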
I found that I can't run get_queue_length_time_series or get_queue_length_summary.
Also, after installing from pip, calling get_cuda_launch_kernel_info fails with:
AttributeError: 'TraceAnalysis' object has no attribute 'get_cuda_launch_kernel_info'
Hi @drisspg, thanks for the feedback.
- There is already a mode which allows the user to specify a single file. See the trace_files option in the API. You will need to pass a dictionary to the trace_files argument whose key is the rank and whose value is the full path to the trace file.
- The README will be updated soon with the corrected version. Please use get_cuda_kernel_launch_stats instead.
- With respect to get_queue_length_*, please check that the trace file has rank 0. If it has a different rank, then use the ranks argument to pass the value. If the error still persists, please open a bug issue and provide us the trace file, if possible.
Hope this helps!
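As a rough sketch of the single-file mode described in this reply: the file name trace.json and the helper build_trace_files are hypothetical, and the import path hta.trace_analysis is an assumption about the installed package layout. The HTA calls only run if the package is importable and the file exists.

```python
import os


def build_trace_files(path, rank=0):
    """Return a {rank: full_path} dictionary for a single trace file."""
    return {rank: os.path.abspath(path)}


# Hypothetical file name; replace with the path to your own trace.
trace_files = build_trace_files("trace.json")

try:
    # Assumed import path for the installed HTA package.
    from hta.trace_analysis import TraceAnalysis
except ImportError:
    TraceAnalysis = None

if TraceAnalysis is not None and os.path.exists(trace_files[0]):
    # The key of the dictionary is the rank, the value is the full path.
    analyzer = TraceAnalysis(trace_files=trace_files)
    stats = analyzer.get_cuda_kernel_launch_stats()
```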