master_june24: hack in counters.cc hides a real issue in auto_dsig1.f
Closed this issue · 5 comments
Another issue introduced in #830 and being reviewed in #882.
There was a HACK introduced that was not necessary in master. This should be understood and a proper fix introduced.
Before master_june24 there was no need to implement this 'hack'. Are counters being properly taken care of or this hack hiding underlying issues?
git show fe467b559053fd6caece5494a3c2a2eccf2ebaf8 CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/counters.cc
commit fe467b559053fd6caece5494a3c2a2eccf2ebaf8
Author: Stefan Roiser <stefan.roiser@cern.ch>
Date: Wed May 29 10:35:12 2024 +0200
HACK, avoid division by 0
diff --git a/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/counters.cc b/epochX/cudacpp/CODEGEN/PLUGIN/CUDACP>
index 742575a6a..a445f0a02 100644
--- a/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/counters.cc
+++ b/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/counters.cc
@@ -92,7 +92,7 @@ extern "C"
iimplC + 1,
smatrix1multi_totaltime[iimplC],
smatrix1multi_counter[iimplC],
- smatrix1multi_counter[iimplC] / smatrix1multi_totaltime[iimplC] );
+ smatrix1multi_counter[iimplC] / ( smatrix1multi_totaltime[iimplC] + .1 ));
return;
}
}
Hi,
@valassi, I would say that this is low priority.
Obviously a deeper understanding of such issue is always good and clear our mind, so I would say that this is not a good reason to block the PR. But if you have time to have a look (keeping in mind that this is issue might be hardware specific and therefore might actually also be needed in master/...)
So up to you.
Olivier
Hi,
@valassi, I would say that this is low priority. Obviously a deeper understanding of such issue is always good and clear our mind, so I would say that this is not a good reason to block the PR. But if you have time to have a look (keeping in mind that this is issue might be hardware specific and therefore might actually also be needed in master/...)
So up to you.
Olivier
Hi Olivier thanks :-)
Yes I agree this I will keep for after the merge of PR #882.
I just tried to remov ethe hack and it crashes again, so there is something to understand, see valassi@2b57ee5
Keeping this open
I guess this is related
- [XSECTION] Cross section = 7.667e-07 [7.6668139178203571E-007] fbridge_mode=1
- [UNWEIGHT] Wrote 1700 events (found 1705 events)
- [COUNTERS] PROGRAM TOTAL : 2.6549s
- [COUNTERS] Fortran Overhead ( 0 ) : 1.2980s
- [COUNTERS] CudaCpp MEs ( 2 ) : 1.3569s for 90112 events => throughput is 6.64E+04 events/s
+ [XSECTION] Cross section = 7.616e-07 [7.6161383186590445E-007] fbridge_mode=1
+ [UNWEIGHT] Wrote 1833 events (found 1838 events)
+ [COUNTERS] PROGRAM TOTAL : 2.6924s
+ [COUNTERS] Fortran Overhead ( 0 ) : 1.3124s
+ [COUNTERS] Fortran MEs ( 1 ) : 0.0000s for 90112 events => throughput is 9.01E+05 events/s
+ [COUNTERS] CudaCpp MEs ( 2 ) : 1.3799s for 90112 events => throughput is 6.09E+04 events/s
The Fortran MEs line with 0s should not appear...
This is obviously related. I guess the generated code was not cross-checked, there are two calls to start counters
[avalassi@itscrd90 gcc11/usr] /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx> git log --oneline -n1
23585b908 (HEAD -> june24, origin/june24) [june24] regenerate all processes - this propagates new reference logs to all .sa
[avalassi@itscrd90 gcc11/usr] /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx> git diff --no-ext-diff upstream/master auto_dsig1.f
...
-
- IF( FBRIDGE_MODE .LE. 0 ) THEN ! (FortranOnly=0 or BothQuiet=-1 or BothDebug=-2)
+#else
+ INTEGER FBRIDGE_MODE
#endif
- call counters_smatrix1multi_start( -1, VECSIZE_USED ) ! fortran=-1
-!$OMP PARALLEL
-!$OMP DO
+ CALL COUNTERS_SMATRIX1MULTI_START( -1, VECSIZE_USED ) ! fortran=-1
+
+
+
+ IF( FBRIDGE_MODE .LE. 0 ) THEN ! (FortranOnly=0 or BothQuiet=-1 or BothDebug=-2)
+ CALL COUNTERS_SMATRIX1MULTI_START( -1, VECSIZE_USED ) ! fortran=-1
+
...
Since I noticed the wrong printout, it did not take me long to fix the real issue.
The problem was in auto_dsig1.f, there were two calls to starting counters. I fixed this.
This was most likely realted to the attempts to simplify patch.P1. I have restructured a bit the generated code and output.py, now auto_dsig1.f has disappeared from patch.P1 (which I guess is what was the intention of introducing CODEGEN smatrix_multi.f as a templated).
Will be fixed in #882. Closing