How to profile the timing of function calls inside an OpenMP loop
yaoyi92 opened this issue · 0 comments
Dear Caliper developers,
I am working on profiling an MPI+OpenMP code. In the OpenMP part, we have function calls inside the OpenMP threads. What is the correct way to get the "runtime-report" for function calls made inside the threads?
Here are my attempts.
If I run the default command CALI_CONFIG=runtime-report srun $SISSOPP, I get
Path                      Min time/rank  Max time/rank  Avg time/rank  Time %
main                      0.306833       0.316301       0.313905       23.891259
sis                       0.014496       0.019343       0.018768       1.428431
make_shared_featurespace  0.000201       0.002033       0.000828       0.063019
generate_feature_space    0.004758       0.010833       0.008719       0.663625
generate_feats            0.000660       0.006001       0.002509       0.190990
generate_non_param_feats  0.002194       0.003940       0.003267       0.248628
generate_non_param_feats  0.002997       0.006621       0.004622       0.351750
Here, generate_non_param_feats is called inside the OpenMP loop. In the code, it is always called from generate_feats. However, I see two generate_non_param_feats entries in the output: one in the correct position, and another that is placed outside the main program.
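For reference, here is a minimal sketch of the call structure described above. It is not the actual code: the function bodies are placeholders, and only the function names are taken from the report. It assumes Caliper's `CALI_CXX_MARK_FUNCTION` macro is used for the annotations; the `USE_CALIPER` guard is a hypothetical build switch added so the sketch also compiles without Caliper installed.

```cpp
#include <vector>

// Assumption: compile with -DUSE_CALIPER and link against Caliper to get
// real annotations. Without it, the macro is a no-op so the sketch still builds.
#ifdef USE_CALIPER
#include <caliper/cali.h>
#else
#define CALI_CXX_MARK_FUNCTION do {} while (0)
#endif

// Called from inside the OpenMP loop, so the annotation is entered on
// whichever thread executes the iteration (placeholder body).
double generate_non_param_feats(int i)
{
    CALI_CXX_MARK_FUNCTION;
    return 0.5 * i;
}

// The annotation here is entered once, on the thread that calls
// generate_feats, before the parallel loop starts.
double generate_feats(int n)
{
    CALI_CXX_MARK_FUNCTION;
    std::vector<double> feats(n);

    // Without -fopenmp this pragma is ignored and the loop runs serially.
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        feats[i] = generate_non_param_feats(i);

    double sum = 0.0;
    for (double f : feats)
        sum += f;
    return sum;
}
```

In this sketch, generate_feats would be called from an annotated main. If I read the multi-threading notes correctly, region annotations are tracked per thread by default, so a worker thread that enters generate_non_param_feats does not have main or generate_feats on its own region stack. That might be why the second entry shows up outside main.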
Following the notes at https://software.llnl.gov/Caliper/CaliperBasics.html#notes-on-multi-threading, I also tried CALI_CALIPER_ATTRIBUTE_DEFAULT_SCOPE=process CALI_CONFIG=runtime-report srun $SISSOPP. However, this time it gave me the following result.
Path                      Min time/rank  Max time/rank  Avg time/rank  Time %
main                      0.302818       0.339568       0.327741       24.049391
sis                       0.014559       0.018591       0.018061       1.192767
make_shared_featurespace  0.000074       0.002291       0.000732       0.053706
generate_feature_space    0.006720       0.011636       0.008599       0.630996
generate_feats            0.000473       0.329705       0.100472       7.372592
generate_non_param_feats  0.002945       0.670026       0.270965       19.883256
generate_non_param_feats  0.001565       0.663913       0.366057       26.860994
generate_non_param_feats  0.000926       0.345460       0.269563       19.780335
generate_non_param_feats  0.000713       0.003053       0.001774       0.117179
It seems Caliper is using the same tag for calls from different threads.
Are there any best practices for getting the timing of different regions inside OpenMP threads?
Best wishes,
Yi