As a user looking at a single collection session, I would like to see the appropriate number of data points for the zoom level I've selected
kwonangela7 opened this issue · 6 comments
Description
The user would like to see more of the individual data points when they zoom into the map, and fewer of the individual data points when they zoom out.
Strategy
Problem
The number of data points may be overwhelming for the user to look at. We are interested in showing only the pollutant reading with the highest value within every n points. So, if there are 3000 data points, we would split them into buckets of 100 or so data points. The buckets are based on the timestamps when the readings were taken: readings 0-99 go into the first bucket, 100-199 into the next, and so on. However, the buckets would hold 100 points only at the most zoomed-out level. At the next zoom level in, there would be 50 readings in each bucket, and in every case we would display only the highest value from each bucket.
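The bucketing described above could be sketched roughly as follows. This is only an illustration, not the project's implementation: the `Reading` shape and the `maxPerBucket` name are assumptions, and the real API fields may differ.

```typescript
// Hypothetical shape of one reading; the real API fields may differ.
interface Reading {
  timestamp: number; // e.g. epoch milliseconds
  value: number;     // pollutant reading
}

// Sorts readings by timestamp, splits them into consecutive buckets of
// `bucketSize`, and returns the highest-valued reading from each bucket.
function maxPerBucket(readings: Reading[], bucketSize: number): Reading[] {
  if (!Number.isInteger(bucketSize) || bucketSize < 1) {
    throw new RangeError("bucketSize must be a positive integer");
  }
  // Copy before sorting so the caller's array is not mutated.
  const sorted = [...readings].sort((a, b) => a.timestamp - b.timestamp);
  const result: Reading[] = [];
  for (let start = 0; start < sorted.length; start += bucketSize) {
    const end = Math.min(start + bucketSize, sorted.length);
    let best = sorted[start];
    for (let i = start + 1; i < end; i++) {
      if (sorted[i].value > best.value) best = sorted[i];
    }
    result.push(best);
  }
  return result;
}
```

With 3000 readings, a bucket size of 100 would yield 30 displayed points and a bucket size of 50 would yield 60. A bucket size of 1 returns every reading.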
Background
The data are collected by two citizen scientists walking a route around a West Oakland neighborhood with a GPS device and a pollutant reader. This means there is a strong correlation between the time when a reading was taken and the geographic "order" of the points. By organizing based on "time", we get a cheap and fast way to organize by the correlated "geo".
Optimization Need
Finding the maximum data reading within each bucket is computationally expensive. Recalculating the data points on every zoom may cause disruptive loading behavior and undue stress on the browser. We can do better than this.
Hypothetically, the order in which the data points are added to the map after each zoom should be deterministic: we can calculate it once, store it in a data structure, and then perform O(1) lookups on that data structure. The data structure would preferably be an array (list), as the data layers are arrays. We need an algorithm to create this sorted array.
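One possible way to build such an array is sketched below. It is not a settled design. It assumes bucket sizes that are powers of two (for example 64, 32, ..., 1 rather than 100 and 50), so that each bucket splits cleanly into two buckets at the next zoom level. Under that assumption the maximum of a coarse bucket is also the maximum of one of its halves, so every level's points are a superset of the previous level's, and each level can be stored as a prefix of one array. The names `ZoomOrdering`, `buildZoomOrdering`, and `pointsForLevel` are hypothetical.

```typescript
// Hypothetical shape of one reading; the real API fields may differ.
interface Reading {
  timestamp: number;
  value: number;
}

interface ZoomOrdering {
  ordered: Reading[];      // coarsest level's points first, then each finer level's additions
  prefixLengths: number[]; // prefixLengths[k] = number of leading points shown at level k
}

// Precomputes the display order once. Level 0 uses `coarsestBucketSize`;
// each following level halves the bucket size until it reaches 1.
function buildZoomOrdering(readings: Reading[], coarsestBucketSize: number): ZoomOrdering {
  const isPowerOfTwo =
    Number.isInteger(coarsestBucketSize) &&
    coarsestBucketSize >= 1 &&
    (coarsestBucketSize & (coarsestBucketSize - 1)) === 0;
  if (!isPowerOfTwo) {
    throw new RangeError("coarsestBucketSize must be a power of two");
  }
  const sorted = [...readings].sort((a, b) => a.timestamp - b.timestamp);
  const placed = new Set<number>(); // indices into `sorted` already in `ordered`
  const ordered: Reading[] = [];
  const prefixLengths: number[] = [];
  for (let size = coarsestBucketSize; size >= 1; size /= 2) {
    for (let start = 0; start < sorted.length; start += size) {
      const end = Math.min(start + size, sorted.length);
      let bestIdx = start;
      for (let i = start + 1; i < end; i++) {
        if (sorted[i].value > sorted[bestIdx].value) bestIdx = i;
      }
      // Only append points that no coarser level has already contributed.
      if (!placed.has(bestIdx)) {
        placed.add(bestIdx);
        ordered.push(sorted[bestIdx]);
      }
    }
    prefixLengths.push(ordered.length);
  }
  return { ordered, prefixLengths };
}

// Returns the points for a level; out-of-range levels are clamped.
function pointsForLevel(ordering: ZoomOrdering, level: number): Reading[] {
  const last = ordering.prefixLengths.length - 1;
  const k = Math.min(Math.max(Math.trunc(level), 0), last);
  return ordering.ordered.slice(0, ordering.prefixLengths[k]);
}
```

Looking up the prefix length for a level is O(1); copying the prefix with `slice` is still proportional to the number of points shown. Points inside a prefix are not in timestamp order, which should not matter for independent map markers but would matter if the points were drawn as a connected line.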
Sample data
API Swagger Docs
Caution: Computer may experience slowdown while loading
Example Collection of Data
Code Placement
It will be simpler to implement this in the frontend rather than the API:
- The API will not need to be updated
- We will not need to update the frontend to make new, more complex API calls
- The computational effort is offloaded to the browsers, allowing us to keep our server overhead low
Acceptance Criteria
- The number of points displayed on the map increases as the user zooms in
- The number of points decreases as the user zooms out
- The maximum reading from each bucket of points is displayed
- At the upper limit of zooming in, all data points are shown
- At the upper limit of zooming out, enough data points are still shown to reasonably outline the general shape of the route
- Changing between zoom levels does not require a disruptive amount of loading
Related Issues
#315 It's important to understand #315 (n is defined there). Based on the zoom level, the developer can pass in a different query parameter. For example, if the zoom level is 0%, then n = 1 (all data points would be displayed).
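As a rough illustration of choosing n from the zoom level, the sketch below doubles n for every whole zoom step away from a "full detail" zoom. The function name, the `fullDetailZoom` and `maxBucketSize` parameters, and the doubling rule are all assumptions made for the example; it also assumes zoom numbers grow as the user zooms in, as they do in Mapbox GL JS. How the percentage wording above maps onto such numbers is left open.

```typescript
// Maps a map zoom level to a bucket size n.
// At or beyond `fullDetailZoom`, n is 1, so every point is shown.
// Each whole zoom step further out doubles n, capped at `maxBucketSize`.
function bucketSizeForZoom(
  zoom: number,
  fullDetailZoom: number,
  maxBucketSize: number
): number {
  // Fractional zooms round toward the coarser (larger) bucket size.
  const stepsOut = Math.max(0, Math.ceil(fullDetailZoom - zoom));
  return Math.min(2 ** stepsOut, maxBucketSize);
}

// Example with hypothetical values: full detail at zoom 18, cap of 64.
const exampleSizes = [18, 17, 16.5, 10].map((z) => bucketSizeForZoom(z, 18, 64));
console.log(exampleSizes);
```

The cap keeps enough points on screen at the most zoomed-out levels to outline the route.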
I wonder if heatmapping is the way to go here, to show the general character of clusters of points without having to show each and every point. It's pretty neat that in this Mapbox example, "Create a heatmap layer", you can actually zoom all the way down until the constituent points of the heatmap representation can be seen. Maybe this leads to a larger discussion of how people actually want to experience the data. Is resolving a single point at any zoom level necessary? If not, at what zoom level would someone want to query an individual point?
Let's think about the actual use case. Our current priority audience (volunteer data collectors) is learning generally about the connection between emissions sources in the environment and the air pollution that results. So the representation of that particulate reading on the geopoint is kind of the bridge for that connection. I worry that a heat map might pull them out of the learning mode of understanding those connections and refocus them more broadly on general readings.
While the heat map may be helpful for a general user, I wouldn't want to overlook the basics that need to happen for the volunteer learner.
Thoughts on this? I might be overthinking it.
This zoom thing has to be a solved problem already somewhere. 🤔
I haven't read this thoroughly, but is there a solution here?
https://docs.mapbox.com/help/troubleshooting/working-with-large-geojson-data/
Again, we're not the first ones to tackle this challenge so it seems like it's a matter of seeing how else this has been done?
Additional resource
Won't fix.