SimpleTable: performance enhancement
TCL735 opened this issue · 3 comments
On a large data set, Simple Table still performs poorly. Rendering eventually completes, but only after a long pause (longer than for a line graph, by comparison). Yet the rendered output can be very small, just a handful of rows.
It should not take that long to render a subset of the data, as SimpleTable has built-in pagination. Something is clearly amiss with its internal implementation that causes a performance bottleneck.
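To make the expectation concrete, here is a minimal sketch of page slicing. The helper name `getPageRows` and the row shape are hypothetical and not taken from SimpleTable's source; the point is only that the work for one page should be proportional to the page size, not to the size of the full data set.

```typescript
// Hypothetical helper, not SimpleTable's actual code: return only the rows
// that belong to the requested page.
function getPageRows<T>(rows: readonly T[], pageIndex: number, pageSize: number): T[] {
  if (pageSize <= 0) {
    return [];
  }
  const start = Math.max(0, pageIndex) * pageSize;
  // slice copies at most pageSize elements, regardless of rows.length
  return rows.slice(start, start + pageSize);
}

// Made-up data: 100000 rows, but only 10 are needed for the third page.
const allRows = Array.from({length: 100000}, (_, i) => ({id: i}));
const visible = getPageRows(allRows, 2, 10);
console.log(visible.length, visible[0].id); // 10 20
```

If rendering one page takes much longer than a slice like this, the extra time is presumably being spent somewhere before pagination is applied.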
Possible explanations:
- Sorting is happening on the entire data set, which can be O(n^2) or worse
- Unnecessary traversals through the entire data set are happening
Sorting on the entire data set does not appear to be happening. I scrambled a large data set and the rendering came back in the same order as the scrambled data.
However, there may still be many smaller sorts happening.
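The scramble check described above can be sketched roughly as follows. This is an illustration only: `shuffle`, `sameOrder`, and the seeded generator are hypothetical helpers, and `rendered` is a placeholder for whatever rows the table actually displayed.

```typescript
// Small linear congruential generator so the shuffle is reproducible.
function makeRand(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// Fisher-Yates shuffle on a copy of the input.
function shuffle<T>(items: readonly T[], rand: () => number): T[] {
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// True when both arrays hold the same values in the same positions.
function sameOrder<T>(a: readonly T[], b: readonly T[]): boolean {
  return a.length === b.length && a.every((value, i) => value === b[i]);
}

const input = Array.from({length: 1000}, (_, i) => i);
const scrambled = shuffle(input, makeRand(42));

// Placeholder: in a real check this would be the row order read back from the table.
const rendered = scrambled.slice();

// If the table sorted the whole data set, the rendered order would not match the scrambled input.
console.log(sameOrder(scrambled, rendered));
```

A matching order rules out a global sort, but it says nothing about smaller sorts (for example per page or per column) that leave the overall order intact.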
While benchmarking SimpleTable in the UI against SimpleTable in Giraffe, I noticed certain use cases where the Giraffe version simply fell apart, even though the injected code was a straight copy of what existed within the UI. There are some simple improvements that can be made to optimize SimpleTable, but it appears that an infrastructural layer within Giraffe adds latency to data visualization processing that was not previously a factor in rendering SimpleTable. This latency only became clear once I linked the giraffe library to local development and could run performance analysis on SimpleTable's render cycle and compare it with the render cycle from when SimpleTable lived within the UI.
Root Cause
The root cause of the latency in SimpleTable is the usePlotEnv hook, which did not exist when SimpleTable was in the UI. Benchmark runs on larger and larger datasets indicate that the hook doesn't scale well: the latency it adds grows exponentially with dataset size. I've identified this as the contributing factor to the drastic performance differences @garylfowler has been seeing with the SimpleTable port from the UI to Giraffe.
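For anyone who wants to reproduce this kind of scaling measurement, a rough sketch follows. It does not call usePlotEnv or any Giraffe code: `Work`, `timeMs`, `growthTable`, and `sumAll` are hypothetical names, and `sumAll` is a stand-in for whatever step is being measured.

```typescript
// Hypothetical stand-in type for the step under test (not the real hook).
type Work = (rows: number[]) => unknown;

// Best-of-N wall-clock time in milliseconds for one input size.
function timeMs(work: Work, size: number, repeats = 5): number {
  const rows = Array.from({length: size}, (_, i) => i);
  let best = Infinity;
  for (let r = 0; r < repeats; r++) {
    const start = Date.now();
    work(rows);
    best = Math.min(best, Date.now() - start);
  }
  return best;
}

// Measure the same work at several sizes so the growth can be compared.
function growthTable(work: Work, sizes: number[]): {size: number; ms: number}[] {
  return sizes.map(size => ({size, ms: timeMs(work, size)}));
}

// Example stand-in: a single linear pass over the data.
const sumAll: Work = rows => rows.reduce((a, b) => a + b, 0);
const results = growthTable(sumAll, [10000, 20000, 40000, 80000]);
console.table(results);
```

Because each size is double the previous one, the ratio between consecutive timings hints at the growth rate: roughly 2x suggests linear cost, roughly 4x suggests quadratic, and ratios that keep increasing suggest something worse. `Date.now()` has millisecond resolution, so small inputs may report 0 and larger sizes are needed for a meaningful comparison.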
Suggestions
While there are several ways to resolve this issue, the most straightforward one seems to be removing this hook from SimpleTable's rendering (since it wasn't originally a factor in the UI code), so that the Giraffe version accurately reflects the state of the visualization in the UI. A longer-term solution might be to identify why the hook is so slow and improve its performance, which would shore up performance across the board for all visualization types.
For the most part, Simple Table performance is about as good as it gets. We have addressed the usePlotEnv hook (it has been removed from Simple Table's code path), and also various major bugs.
Since this issue is about performance, I am closing it as done.
Additional work on Simple Table can be addressed in separate tickets.