Performance improvements 🚀

Question

juliohm opened this issue a year ago · 15 comments

This issue tracks our efforts to improve the performance of different algorithms and data structures implemented in the project. The public API is stabilizing and there is only one major breaking change planned before a possible v1.0.

Given the solid design that separates user-facing functions from internals, we can easily refactor the code without breaking code downstream. We kindly ask the community to add comments here whenever they encounter a performance problem.

Answer 1 · 2023-07-03T12:28:15.000Z

Copying from Zulip:

One slightly breaking change I would consider is changing the vararg constructors of geometries to use static arrays (possibly mutable). This would make it much easier to write performant code that handles a lot of small geometries (e.g. segments or triangles) and avoid code such as Segment(view(v, [i, i + 1])) that currently exists (which in this case allocates the index vector plus possibly the view, afaik). I think I also have a general expectation that the arguments of a constructor are stored within the struct and not allocated, so having Triangle(a, b, c) allocate [a, b, c] is rather surprising to me and having expressions such as centroid(Segment(a, b)) and area(Triangle(a, b, c)) perform well would be very desirable imo. The only drawback that I see is that the number of points becomes fixed, which is arguably a good thing for Ngon but maybe not in the case of Chain/Bezier (though constructing them with a regular vector would still be possible of course).

comment by @mfsch
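
To illustrate the point about constructors above, here is a minimal sketch in plain Julia. The types `P2`, `VecTriangle` and `TupTriangle` and the function `trianglearea` are hypothetical stand-ins invented for this example and are not the Meshes.jl types. It only contrasts a constructor that packs its arguments into a `Vector` with one that stores them in a fixed-size tuple.

```julia
# Hypothetical stand-ins for illustration only; these are NOT the Meshes.jl types.
struct P2
    x::Float64
    y::Float64
end

# Vertices kept in a Vector: the constructor has to build [a, b, c] on the heap.
struct VecTriangle
    verts::Vector{P2}
end
VecTriangle(a::P2, b::P2, c::P2) = VecTriangle([a, b, c])

# Vertices kept in a fixed-size tuple: the arguments are stored inline in the struct.
struct TupTriangle
    verts::NTuple{3,P2}
end
TupTriangle(a::P2, b::P2, c::P2) = TupTriangle((a, b, c))

# Works for both representations, since vectors and tuples both destructure.
function trianglearea(t)
    a, b, c = t.verts
    return abs((b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y)) / 2
end

# Measure allocations inside a function so the arguments have concrete types.
function allocated_bytes(T, a, b, c)
    trianglearea(T(a, b, c))                   # warm-up call to trigger compilation
    return @allocated trianglearea(T(a, b, c))
end

a, b, c = P2(0, 0), P2(4, 0), P2(0, 3)
area_vec = trianglearea(VecTriangle(a, b, c))
area_tup = trianglearea(TupTriangle(a, b, c))
println("Vector-backed: ", allocated_bytes(VecTriangle, a, b, c), " bytes")
println("Tuple-backed:  ", allocated_bytes(TupTriangle, a, b, c), " bytes")
```

Both versions compute the same area. The vector-backed one is expected to report a nonzero byte count on typical Julia versions, while the tuple-backed one usually reports none, though the exact numbers depend on the Julia version and compiler optimizations.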

Answer 2 · 2023-07-03T12:29:46.000Z

That is something we have in mind. In the past we considered StaticVector for the list of vertices inside some of the Polytope types, but later refactored the code and removed it. We may need to reintroduce this optimization and use NTuple directly whenever possible. Some of our Polytope types, like PolyArea and Chain's subtypes, require dynamic vectors because of their use cases.

Answer 3 · 2023-07-07T17:48:20.000Z

Related to this, I found another potential bottleneck: the segments() function for chains is about 6 times slower than a "manual" version with StaticArrays.
This small change reduced the runtime from 30 to 5 seconds in my case:

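using StaticArrays  # provides the SA[...] literal

# verts holds the vertices of the chain and n = length(verts)
# @views for seg in segments(chain)   # original, slower loop
@views for (p1, p2) in zip(verts[1:n-1], verts[2:n])
    seg = Segment(SA[p1, p2])
    # ... do stuff ...
end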
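
The same pairwise pattern can be sketched in plain Julia without any package. The helper `consecutivepairs` and the function `seglength` below are hypothetical names made up for this example, and points are plain `(x, y)` tuples rather than Meshes.jl points. The idea is only to show that consecutive vertices can be paired lazily, without building an index vector such as `[i, i + 1]` for each segment.

```julia
# Hypothetical helper, not part of Meshes.jl: lazily yields consecutive
# vertex pairs as tuples, so no per-segment index vector is created.
consecutivepairs(v::AbstractVector) =
    ((v[i], v[i+1]) for i in firstindex(v):lastindex(v)-1)

# Length of one segment given as a pair of (x, y) tuples.
seglength((p, q)) = hypot(q[1] - p[1], q[2] - p[2])

# An open chain with four vertices, hence three segments.
verts = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

nsegments = length(collect(consecutivepairs(verts)))
total = sum(seglength, consecutivepairs(verts))
println("segments: ", nsegments, ", total length: ", total)
```

Each pair is a small tuple of immutable values, which the compiler can usually keep off the heap. That is the same property the SA[p1, p2] version relies on.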

Answer 4 · 2023-07-07T18:21:52.000Z

Thanks, we are in the process of rewriting some internals in terms of tuples and static vectors. We are also adding more threads to speed things up. If you have performance suggestions, please submit PRs for the places where you already have working code.


Answer 5 · 2023-07-07T19:01:58.000Z

Switching Segment(view(v, [i, i + 1])) to Segment(SVector(v[i], v[i+1])) in the segments function should correspond to what @BloodWorkXGaming did. As I wrote above, though, I think it would be more useful to make Segment(a, b) expand to Segment(SVector(a, b)) (and similarly for all n-gons) instead of changing individual uses of Segment. If you prefer not to change the varargs constructor of Ngon, then it might make sense to start improving the performance of individual use cases, but I think it would be helpful to make that more general decision first.

Answer 6 · 2023-07-07T19:14:08.000Z

Another candidate for performance improvement is the boundingbox implementations.
They perform many allocations, which shouldn't be necessary for any bounding box calculation.
For example, for this shape a single call allocates over 17 KiB of memory:

julia> poly
2 GeometrySet{2,Float64}
  └─PolyArea(50-Ring)
  └─PolyArea(356-Ring, 302-Ring)

julia> @time boundingbox(poly)
  0.000021 seconds (21 allocations: 17.172 KiB)
Box{2, Float64}(Point(0.604086399, -13.3619730947), Point(149.4063610679, 13.8699076501))

This isn't really a priority as it is still fast, but could be a problem when running in a tight loop.
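
For reference, a bounding box can in principle be computed in a single pass that only keeps running minima and maxima. The sketch below is plain Julia with a hypothetical `bbox` helper and points given as `(x, y)` tuples. It is not the Meshes.jl implementation, only an illustration of a reduction that avoids collecting intermediate vectors.

```julia
# Hypothetical helper, not the Meshes.jl boundingbox: a single pass over the
# points that only tracks four scalars, so no intermediate vector is built.
function bbox(points)
    xmin = ymin = Inf
    xmax = ymax = -Inf
    for (x, y) in points
        xmin = min(xmin, x); xmax = max(xmax, x)
        ymin = min(ymin, y); ymax = max(ymax, y)
    end
    return (xmin, ymin), (xmax, ymax)
end

# For nested geometries (e.g. several rings), iterate over all points lazily
# instead of concatenating them into one vector first.
rings = [[(0.0, 0.0), (2.0, 0.0), (2.0, 1.0)], [(0.5, -1.0), (3.0, 0.5)]]
lo, hi = bbox(Iterators.flatten(rings))
println("lower corner: ", lo, ", upper corner: ", hi)
```

`Iterators.flatten` walks the nested rings without copying them, which is one way to avoid `collect`-style intermediates in a tight loop.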

Answer 7 · 2023-07-07T19:22:37.000Z

Fully agree with all comments raised. We will be working on these internal improvements soon, as we are using the code for some agriculture applications that demand more performance.


Answer 8 · 2023-07-07T21:37:04.000Z

@BloodWorkXGaming regarding the boundingbox allocations, did you identify the harmful lines? Feel free to submit PRs for review. If it is also related to the current internal representation of vertices in Polytope types, we will fix it in a more systematic way.

Answer 9 · 2023-07-08T07:31:02.000Z

Yes, I found some harmful lines. They are mostly related to the MVectors and collect operations in the functions. I will try to open a PR with some improvements after the weekend :)

Answer 10 · 2023-07-17T16:25:30.000Z

Update: we have implemented a series of changes to the representation of vertices inside Polytope geometries. The vertices are now stored as an NTuple of Point, which enables more compile-time optimizations.

Answer 11 · 2023-07-21T12:15:12.000Z

Update: refactored boundingbox and hull methods to allocate zero memory whenever possible.

@BloodWorkXGaming all cases covered now from your previous PR.

Answer 12 · 2023-07-21T22:57:38.000Z

Thanks! Sorry for not refactoring the PR; I was very short on time these past weeks.

Answer 13 · 2023-07-26T00:14:35.000Z

Update: pointwise transforms no longer allocate intermediate vectors. Rotate, Translate and Stretch are examples.

Answer 14 · 2024-02-01T17:06:50.000Z

We need to improve the performance of the FIST triangulation method. I think it is already type stable, so what we need now are algorithmic improvements and multithreading.