LHE file mismatch between fortran and cpp in heft_gg_bb for FPTYPE=f
valassi opened this issue · 3 comments
I have finally run the first tmad test for a HEFT process, heft_gg_bb. This is in WIP PR #832.
All tests succeed in double and mixed precision, but the LHE files differ in float precision:
*** (2-none) Compare MADEVENT_CPP x1 events.lhe to MADEVENT_FORTRAN events.lhe reference (including colors and helicities) ***
ERROR! events.lhe.cpp.1 and events.lhe.ref.1 differ!
diff /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/heft_gg_bb.mad/SubProcesses/P1_gg_bbx/events.lhe.cpp.1 /data/avalassi/GPU2023/madgraph4gpuX/epo>
6206,6207c6206,6207
< 21 -1 0 0 502 503 -0.00000000000E+00 -0.00000000000E+00 -0.59936081260E+01 0.59936081260E+01 0.00000000000E+00 0. -1.
< 5 1 1 2 501 0 0.45273385612E+02 -0.31131305296E+02 0.47763304676E+03 0.48080583916E+03 0.47000000000E+01 0. 1.
---
> 21 -1 0 0 502 503 -0.00000000000E+00 -0.00000000000E+00 -0.59936081260E+01 0.59936081260E+01 0.00000000000E+00 0. 1.
> 5 1 1 2 501 0 0.45273385612E+02 -0.31131305296E+02 0.47763304676E+03 0.48080583916E+03 0.47000000000E+01 0. -1.
8306,8307c8306,8307
< 21 -1 0 0 502 503 -0.00000000000E+00 -0.00000000000E+00 -0.23857997239E+02 0.23857997239E+02 0.00000000000E+00 0. 1.
< 5 1 1 2 501 0 -0.34843521722E+02 0.35239303629E+02 0.13219496682E+02 0.51504607743E+02 0.47000000000E+01 0. -1.
---
> 21 -1 0 0 502 503 -0.00000000000E+00 -0.00000000000E+00 -0.23857997239E+02 0.23857997239E+02 0.00000000000E+00 0. -1.
> 5 1 1 2 501 0 -0.34843521722E+02 0.35239303629E+02 0.13219496682E+02 0.51504607743E+02 0.47000000000E+01 0. 1.
9606,9619d9605
< 4 1 1E-03 0.1250139E+03 0.7546771E-02 0.1235066E+00
< 21 -1 0 0 503 502 0.00000000000E+00 0.00000000000E+00 0.94948250004E+03 0.94948250004E+03 0.00000000000E+00 0. 1.
< 21 -1 0 0 502 503 -0.00000000000E+00 -0.00000000000E+00 -0.41149990002E+01 0.41149990002E+01 0.00000000000E+00 0. -1.
< 5 1 1 2 501 0 -0.96459450317E+01 -0.34409175043E+02 0.83136584965E+02 0.90613560477E+02 0.47000000000E+01 0. -1.
< -5 1 1 2 0 501 0.96459450317E+01 0.34409175043E+02 0.86223091608E+03 0.86298393857E+03 0.47000000000E+01 0. 1.
< <mgrwt>
< <rscale> 0 0.12501391E+03</rscale>
This is strange. It is not a systematic problem: most events are identical. There are just a few events where the helicities are mismatched (the sign in the last column is swapped between the gluon and the b quark), and a few events that are accepted in C++ but do not exist in Fortran.
Just to be sure, I commented out the hack that flushes small jamp values to zero (#831). I get the same issues (and no FPEs).
A quick idea about how to investigate this: while it is difficult to debug the Fortran in double precision, I know that cudacpp in double precision agrees with it. It may be best to add debug printouts to cudacpp and then run the same test in double and single precision to see where the results start to differ.
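To illustrate the kind of double versus float comparison I have in mind, here is a minimal standalone sketch. It is not cudacpp code: `sumTerms` and `relDiffFloatVsDouble` are hypothetical names, and the plain sum is only a stand-in for whatever intermediate quantity (for example a jamp) one would actually print from the real code in the two builds.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for a precision-dependent computation:
// accumulate the same input terms in the floating point type FP.
template <typename FP>
FP sumTerms(const std::vector<double>& terms) {
  FP acc = 0;
  for (double t : terms) acc += static_cast<FP>(t);
  return acc;
}

// Evaluate the same quantity in double and in float, print both,
// and return the relative difference (absolute if the double result is 0).
inline double relDiffFloatVsDouble(const std::vector<double>& terms) {
  const double d = sumTerms<double>(terms);
  const double f = static_cast<double>(sumTerms<float>(terms));
  const double absDiff = std::fabs(f - d);
  const double rel = (d != 0) ? absDiff / std::fabs(d) : absDiff;
  std::printf("DEBUG double=%.17g float=%.9g reldiff=%.3e\n", d, f, rel);
  return rel;
}
```

For instance, `relDiffFloatVsDouble({1e8, 1.0, -1e8})` reports a relative difference of order one, because the float accumulator cannot hold the 1.0 next to 1e8, while `relDiffFloatVsDouble({1.0, 2.0, 3.0})` reports zero. Printing the same pair of values at several points of the real calculation should show where float first departs from double.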
Note also: from debugging #831 it is obvious that there are huge cancellations all over the place in this HEFT gg_bb MIW<=1 process. So it is perhaps not so surprising that float gives different results.
Maybe we should just declare that for some processes like this one, double precision (not even mixed) is required. But it would be best to understand how to decide which processes need double precision...
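One possible way to quantify "huge cancellations" is the condition number of a sum, i.e. sum|t_i| / |sum t_i|: multiplied by the machine epsilon of a given type, it gives a rough estimate of the relative error to expect. The sketch below is only an assumption about how such a check could look. `cancellationFactor` and `precisionLikelySufficient` are hypothetical helpers, not part of cudacpp, and the threshold is a rule of thumb rather than a validated criterion.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Condition number of a sum: sum|t_i| / |sum t_i|.
// A value of 1 means no cancellation; large values mean heavy cancellation.
inline double cancellationFactor(const std::vector<double>& terms) {
  double sum = 0, sumAbs = 0;
  for (double t : terms) {
    sum += t;
    sumAbs += std::fabs(t);
  }
  if (sumAbs == 0) return 1.0; // all terms are zero: nothing cancels
  if (sum == 0) return std::numeric_limits<double>::infinity();
  return sumAbs / std::fabs(sum);
}

// Hypothetical rule of thumb: type FP is likely sufficient if the
// cancellation factor times its machine epsilon stays below the target.
template <typename FP>
bool precisionLikelySufficient(const std::vector<double>& terms, double targetRelErr) {
  assert(targetRelErr > 0);
  const double eps = std::numeric_limits<FP>::epsilon();
  return cancellationFactor(terms) * eps <= targetRelErr;
}
```

With terms `{1e4, -9999.0}` and a target relative error of `1e-4`, this check rejects float and accepts double. Evaluating such a factor on the contributions to each amplitude, over a sample of phase space points, might be one way to flag processes that need double precision.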