[bug] rare issue with planar surface binding potential in Mesh2D case
drobnyjt opened this issue · 2 comments
Description
A handful of NaNs were produced using the following mesh with large numbers of computational ions when ions initially at the top of the mesh hit the bottom of the first trench:
simulation_boundary_points = np.array([
[-0.1, -0.1],
[4.1, -0.1],
[4.1, 10.1],
[-0.1, 10.1],
[-0.1, -0.1]
])
vertices = np.array([
[0.0, 0.0],
[1.0, 0.0],
[2.0, 0.0],
[3.0, 0.0],
[4.0, 0.0],
[4.0, 1.0],
[3.0, 1.0],
[3.0, 10.0],
[2.0, 10.0],
[2.0, 1.0],
[1.0, 1.0],
[1.0, 10.0],
[0.0, 10.0],
[0.0, 1.0]
])
triangles = [
[13, 12, 11],
[13, 11, 10],
[0, 13, 10],
[0, 10, 1],
[1, 10, 9],
[1, 9, 2],
[9, 8, 7],
[9, 7, 6],
[2, 9, 6],
[2, 6, 3],
[3, 6, 5],
[3, 5, 4],
]
boundary = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
I suspect there is a divide by zero error in a trig function produced by an edge case of recoil sampling. The error is not present for an isotropic surface binding potential.
I believe I understand the cause of this issue.
When using a planar surface binding potential, surface refraction in RustBCA is calculated when a particle passes through the energy barrier of a surface (typically located 1 mean free path above the surface). Whether this happens is determined by querying if the current position is inside or outside the energy barrier, and if the previous position was inside or outside the energy barrier. In the case of (true, false), the particle is entering the energy barrier. In the case of (false, true) the particle is leaving the energy barrier. In any other case, surface refraction does not happen.
In the calculation of surface refraction, RustBCA makes a call to the geometry module to query the closest point on the material boundary to the particle's current position. This is used to calculate the direction of the normal vector of the nearest edge or face of the material, since the closest point of a line to a point not on that line will lie along the normal vector that intersects with that point. That normal vector is +/- (x_closest - x, y_closest - y).
When using the geometry routines based on geo and when closest_point is queried for points inside of the mesh, instead of returning the nearest point on the edge of the mesh, geo instead returns the current point as an Intersection. This is unexpected.
Fortunately, under perfectly controlled circumstances, this should never happen because the energy barrier thickness is typically 1 mean free path, so particles will never jump completely over it; this at least partially explains how this bug has existed for so long without being noticed. However, in cases where the energy barrier thickness is less than one mean free path (in particular, Mesh2D requires specification of an energy barrier thickness that can have any value), a particle can skip over the energy barrier and enter the material, causing the aforementioned issue with closest_point and creating a normal vector of (0, 0). There are also older versions of the BCA where the energy barrier and the material surface are collocated for which this issue would also appear. When this happens, the normal vector becomes (0, 0). When the normal vector has zero-magnitude, the calculation of surface refraction introduces NaNs into the particle's new direction, which are not caught at that point and instead propagate throughout the code until either 1) it crashes somewhere that checks for NaNs or 2) it produces nonsense output values, including NaNs or worse.
A temporary solution would be to use the particle's previous location as the query point for closest_point, but that is not necessarily guaranteed to be out of the material either - for example, in the case of a high energy ion passing through extremely thin fins of low-density material that might happen quite often, especially if a gaseous mean free path model is used. Another would be to simply ensure that energy barrier thicknesses < 1 mfp are never used, but again there are edge cases where this is not sufficient to prevent the issue.
There is another issue, #222, involving switching from geo to parry2D for Mesh2D, but I do not know if that will solve the problem - parry may make the same choice about closest points and 2D shapes as geo.
A more correct solution would be to calculate the normal vector used for surface refraction more carefully; this would probably require adding a nearest normal vector calculation to the Geometry trait and implementing it for all currently available geometry types.
The parry2d switch is working well on the branch linked to the above pull request.
Pending changes on that branch:
- for small meshes, performance is slightly worse than my hand-coded trimesh+geo; performance should be quantified as a number of triangles to make sure it has better than O(N) performance as expected
- switch to nearest_normal_vector should be finalized