svalinn/DAGMC

dagmc tet mesh tallies 5.5x higher when run with MPI version of dag-mcnp6.2

bohmt opened this issue · 6 comments

bohmt commented

Describe the Bug

DAGMC based tet mesh tallies produce values about 5.5x higher with the MPI version of dag-mcnp6.2 as compared to the serial version. The factor of ~5.5 happens independently of how many nodes/cores are used.

To Reproduce

Run a dag-mcnp6.2 calculation that includes the dagmc tet mesh tallies using the MPI version of dag-mcnp6.2, then run the same calculation with the serial version of dag-mcnp6.2. Compare the dagmc tet mesh tally output (recall you will need to convert from h5m to vtk format with mbconvert to easily display or read the output).
A simple example with input files and the diff of the vtk dagmc tet mesh tally output file is provided at the UW Box file sharing site: https://uwmadison.box.com/s/zcui19tsc0wepq80icsrva8ampwwvwjr
Note the left panel of the diff output is the MPI case and the right panel is the serial case.
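To compare the two outputs programmatically rather than eyeballing a diff, one option is to pull the TALLY_TAG scalar array out of each converted vtk file. A minimal sketch, assuming the files are in the legacy ASCII VTK format that mbconvert emits (the `sample` string below is a tiny hypothetical stand-in, not real output):

```python
def read_tally_tag(vtk_text):
    """Extract the TALLY_TAG scalar values from legacy ASCII VTK text."""
    values = []
    in_tag = False
    for line in vtk_text.splitlines():
        if line.startswith("SCALARS TALLY_TAG"):
            in_tag = True
            continue
        if in_tag:
            if line.startswith("LOOKUP_TABLE"):
                continue  # skip the lookup-table declaration line
            if not line.strip() or line[0].isalpha():
                break  # end of the scalar block
            values.extend(float(v) for v in line.split())
    return values

# Tiny stand-in for a converted tally file (hypothetical values):
sample = """SCALARS TALLY_TAG double 1
LOOKUP_TABLE default
0.006674852907
0.001211056133
"""
print(read_tally_tag(sample))  # -> [0.006674852907, 0.001211056133]
```

Running this on the MPI and serial vtk files and dividing element-wise would give the per-element ratios directly.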

Expected Behavior

We expect that the dagmc tet mesh tally values are the same regardless of whether we run in serial or MPI parallel.

Screenshots or Code Snippets

None, I have not looked at the source code yet.

Please complete the following information regarding your system:

  • OS: Kubuntu Linux 18.04 LTS
  • MOAB Version: 5.1.0
  • Physics codes versions installed with: MCNP 6.2

Additional Context

All other tallies (including mcnp fmesh structured mesh) produce the same value regardless of whether run in serial or MPI.

bohmt commented

Note I believe the dagmc tetmesh tally values calculated with the serial version are correct as they are consistent with values calculated using other tally types in the same calculation.

Is it off by a consistent factor everywhere? Is it always 5.5? Is the volume of your elements about 5.5 cm3?

bohmt commented

In the test calculation, the mesh is created on a 1x1x1 cm cube and has 12 elements with an average size of 0.0833 cm3:

mbsize peakfluxmesh.h5m
File peakfluxmesh.h5m:

type      count   total    minimum   average   rms       maximum   std.dev.
Edge         18   20       1         1.1381    1.1547    1.4142    0.19526
Tri          12   6        0.5       0.5       0.5       0.5       0
Tet          12   1        0.079293  0.083333  0.083387  0.087373  0.0029907
1D Side     108   1.1e+02  0.83696   1.0475    1.0673    1.4142    0.20476
Vertex        9

The tally values are consistently about 5.5x higher regardless of mesh and model geometry. I've tested with a variety of models (e.g. ITER and FNSF) and various meshes and they always seem to be about 5.5x higher.

bohmt commented

Some added information:
In the test model, the tally value ratios are slightly different for each mesh element. The columns below are the TALLY_TAG scalar values (SCALARS TALLY_TAG double 1, LOOKUP_TABLE default) from the MPI run, the serial run, and their ratio:
0.006674852907 | 0.001211056133 ratio is: 5.51159663464
0.006587786212 | 0.001166212084 ratio is: 5.64887493654
0.006290317227 | 0.001144181731 ratio is: 5.49765571025
0.006226484829 | 0.001143657472 ratio is: 5.44436160428
0.006278976747 | 0.001150050025 ratio is: 5.45974228121
0.006428723465 | 0.001159444724 ratio is: 5.54465713796
0.00665024733 | 0.001147529048 ratio is: 5.79527580726
0.006576258655 | 0.001167835865 ratio is: 5.6311497635
0.006282803883 | 0.001130000312 ratio is: 5.56000190113
0.006544632233 | 0.001194325651 ratio is: 5.47977197636
0.005964523802 | 0.001109639542 ratio is: 5.37519038953
0.005953604226 | 0.001092817195 ratio is: 5.44794157087

You could try creating a geometry where the flux is coming from a mono-directional plane source, with void material everywhere. Using that mesh (assuming it looks like this):

[image: sketch of the expected tet mesh]

Then we would expect the flux to be (basically) identical in each element, with the value given by the average chord length for the element. I can't think of a good reason why this would happen; to be honest, I'm stumped.
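For reference, the mean chord length of a convex element can be computed with Cauchy's formula, mean chord = 4V/S. A sketch for one hypothetical tet from a 12-tet decomposition of the unit cube (a face triangle joined to the cube centre, chosen only because its volume matches the ~0.0833 cm3 average reported by mbsize; it is not necessarily the mesh used in the test):

```python
import math

def tet_volume(a, b, c, d):
    # Unsigned volume via the scalar triple product, |det|/6.
    m = [[b[i] - a[i] for i in range(3)],
         [c[i] - a[i] for i in range(3)],
         [d[i] - a[i] for i in range(3)]]
    det = (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
         - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
         + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    return abs(det) / 6.0

def tri_area(a, b, c):
    # Half the magnitude of the cross product of two edge vectors.
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    return 0.5 * math.sqrt(sum(x * x for x in n))

# Hypothetical element: bottom-face triangle of the unit cube plus the centre.
a, b, c, d = (0, 0, 0), (1, 0, 0), (1, 1, 0), (0.5, 0.5, 0.5)
V = tet_volume(a, b, c, d)                                   # 1/12 cm^3
S = sum(tri_area(*f) for f in [(a, b, c), (a, b, d), (a, c, d), (b, c, d)])
mean_chord = 4.0 * V / S  # Cauchy's mean-chord formula for a convex body
print(V, mean_chord)
```

For this element the mean chord comes out around 0.21 cm, so neither the chord length nor the volume is anywhere near 5.5, consistent with the factor not being a simple geometric quantity.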

I would suggest we add an MPI tally test, but I'm not sure it would replicate the same conditions we would see in MCNP. Do you see the same behaviour with regular cell-based tallies or with Cartesian mesh tallies?

bohmt commented

This ~5.5x higher behavior does not happen with any other tallies in MCNP, such as the surface or cell-based tallies or the fmesh structured mesh tallies. (I put this information in the original issue post under the "Additional Context" heading, so you may have missed it.)