scverse/napari-spatialdata

Speeding up napari-spatialdata

Opened this issue · 4 comments

Hey everyone! I wanted to open an issue to discuss ways we can improve the performance of napari-spatialdata
for large data. I've listed a few ideas/suggestions off the top of my head below to get things started.

  • [easy] Profile data loading. I think we need to do some profiling to figure out exactly where the bottlenecks are.
  • [easy] The initial drawing of Shapes is slow because they need to be meshed (i.e., turned into triangles). Last year there was a fix that improved triangulation performance but requires the triangle library (see "Faster 2D shape layer creation", napari/napari#3867: ~100x speedup for large numbers of shapes). We can include triangle as a dependency and it should "just work".
  • [medium] I am guessing generating the colormaps for the points and shapes is somewhat slow because we have to loop over the values (e.g., cluster, expression) to set the color. We could try numba to speed that up.
  • [medium] Switch back to using Points for circular Shapes: "Use napari Points instead of napari Ellipses when SpatialData Shapes are actually circles" (#37). We should reach out to Lorenzo Gaifas; he was working on point sizing recently.
  • [hard] Async is coming along in napari. We can start transitioning to async and tiled rendering (e.g., for large 2D images), which should speed some things up. Joel Lüthi is probably the best person to talk to about this.
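
As a starting point for the profiling item, here is a minimal sketch using the standard-library `cProfile` and `pstats` modules. The function `assign_colors_loop` is a hypothetical stand-in for a per-value color lookup, not actual napari-spatialdata code; the idea is only to show how a suspected hotspot could be wrapped and measured before deciding whether something like numba is worth it.

```python
import cProfile
import io
import pstats


def assign_colors_loop(values, palette):
    """Hypothetical stand-in for a per-value color lookup (not plugin code)."""
    return [palette[v % len(palette)] for v in values]


def profile_call(func, *args, top=5):
    """Run func under cProfile; return its result and a text report
    of the top entries sorted by cumulative time."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


# Three RGBA colors cycled over 200,000 made-up category codes.
palette = [(1.0, 0.0, 0.0, 1.0), (0.0, 1.0, 0.0, 1.0), (0.0, 0.0, 1.0, 1.0)]
colors, report = profile_call(assign_colors_loop, list(range(200_000)), palette)
print(report)
```

In practice the same `profile_call` wrapper could be pointed at whichever loading or coloring function we suspect, and the report compared before and after a change.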

Points:

  • [medium] Stop using AnnData to store the information needed to show the annotations. It's conceptually easy but quite some work, because it requires a big refactoring. #53
  • [medium] Load the data for annotations on demand, when the user requests it. As above, easy to do but requires some preliminary refactoring. #54
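
To make the on-demand idea concrete, here is a rough sketch of a lazy cache that only loads an annotation column the first time it is requested. `LazyAnnotations` and `fake_loader` are hypothetical names for illustration; how the real loader would read from the SpatialData object is left open.

```python
class LazyAnnotations:
    """Load annotation columns on first access and cache them afterwards."""

    def __init__(self, loader):
        self._loader = loader  # callable: column name -> values
        self._cache = {}
        self.load_count = 0    # number of real loads, for illustration

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = self._loader(key)
            self.load_count += 1
        return self._cache[key]


def fake_loader(key):
    """Hypothetical loader; a real one would read from disk or the table."""
    return [f"{key}_{i}" for i in range(3)]


annotations = LazyAnnotations(fake_loader)
first = annotations.get("cluster")
second = annotations.get("cluster")  # served from the cache, no second load
```

Nothing is read until the user picks a column, and repeated selections of the same column cost only a dictionary lookup.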

Polygons:

  • [medium] better extraction of data from the GeoDataFrame and better transfer of this data to self._viewer.add_shapes() #56
  • [medium] #57

The point about the initial drawing being slow should be fixed once #5555 is merged. It avoids drawing the fill, so the triangles do not have to be calculated on the fly.

Tomorrow I will discuss exposing the rdp epsilon parameter at the community meeting. It should be ready to merge this week.
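
For anyone unfamiliar with the parameter: assuming "rdp" here refers to the Ramer-Douglas-Peucker polyline simplification algorithm, epsilon is the distance tolerance below which intermediate vertices are dropped. The sketch below is a plain-Python illustration of that algorithm, not the napari implementation, and the function names are my own.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from point to the line through start and end
    (falls back to point-to-point distance if start == end)."""
    (px, py), (sx, sy), (ex, ey) = point, start, end
    dx, dy = ex - sx, ey - sy
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(px - sx, py - sy)
    return abs(dy * (px - sx) - dx * (py - sy)) / length


def rdp(points, epsilon):
    """Simplify a polyline, dropping vertices closer than epsilon to the chord."""
    if len(points) < 3:
        return list(points)
    start, end = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        dist = perpendicular_distance(points[i], start, end)
        if dist > max_dist:
            max_dist, index = dist, i
    if max_dist <= epsilon:
        return [start, end]
    # Keep the farthest vertex and simplify both halves recursively.
    left = rdp(points[: index + 1], epsilon)
    right = rdp(points[index:], epsilon)
    return left[:-1] + right


outline = [(0, 0), (1, 0.05), (2, -0.05), (3, 0), (3, 3), (0, 3)]
simplified = rdp(outline, epsilon=0.1)
unchanged = rdp(outline, epsilon=0.0)
```

A larger epsilon removes more vertices (fewer points to draw, coarser outlines), while epsilon of 0 keeps every vertex that deviates at all from the chord.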

Cool, thanks @melonora !