JuliaPhysics/SolidStateDetectors.jl

Allocations in `pt in CSG` for large CSG's

lmh91 opened this issue · 1 comments

lmh91 commented

When improving the application of boundary conditions (#206) I recognized that
in(pt::AbstractCoordinatePoint, g::AbstractGeometry) is not allocation free anymore
if the geometry consists out of too many (>4) primitives.

E.g., in case of the geometry defining the mantle contact of the IVC example detector:

using SolidStateDetectors, BenchmarkTools
sim = Simulation(SSD_examples[:InvertedCoax]);
g = sim.detector.contacts[2].geometry; # consists out of 6 primitives
pt = zero(CartesianPoint{Float32});

@btime in($pt, $g); # -> Not allocation free
  148.439 ns (6 allocations: 608 bytes)

For geometries with 4 or less primitives no allocations are caused:

gAllocFree = g.a.a; # consists out of 4 primitives
@btime in($pt, $gAllocFree); 
  58.282 ns (0 allocations: 0 bytes)

As one can see this also worsens the performance.
This causes also some major amount of allocations in the application of boundary conditions (#85).
It will also be an issue in the charge drift where this function is also used quite a lot of times.

I am not sure whether 4 -> 5 is always the edge. It might also depend on the types of the primitives.

In addition, the number of allocations increases for increasing number of primitives of the geometry:

@btime in($pt, $(g.a)); # -> Only 3 allocations
105.006 ns (3 allocations: 272 bytes)

The excluded individual primitives (g -> gAllocFree) are allocations free:

@btime in($pt, $(g.a.b));
  11.071 ns (0 allocations: 0 bytes)

and

@btime in($pt, $(g.b));
  11.371 ns (0 allocations: 0 bytes)

Julia has a limit of complexity for Union optimizations ... makes sense that it'll fall back to heap allocation of the individual objects if the number of different primitives is high - with a large mem-alloc/performance impact.

Not sure what we can do about this ... maybe we could try to have even fewer and more general/flexible primitives, or build unions manually (that would be kinda of a dirty trick).

Another thing we could think about is separating the CSG operations, since there's a very finite number of them, store all primitives in a flat list and store the CSG tree in flattened form in a list as well, referencing the primitives via integer indices.