transitland/transitland-datastore

Pagination: keyset

irees opened this issue · 0 comments

irees commented

https://www.citusdata.com/blog/2016/03/30/five-ways-to-paginate/

When getting all SSPs for a particular operator or feed, very high OFFSET values can fail. For instance, Berlin offset 1,000,000 takes 20s, and higher values can take even longer:

transitland-4-5-0=> SELECT "current_schedule_stop_pairs".id FROM "current_schedule_stop_pairs" WHERE "current_schedule_stop_pairs"."operator_id" = 1661  ORDER BY "current_schedule_stop_pairs"."id" ASC LIMIT 10000 OFFSET 1000000;
Time: 20145.364 ms

Implement an alternative pagination method where we use the highest ID from the previous page. This is up to 100x faster:

transitland-4-5-0=> SELECT "current_schedule_stop_pairs".id FROM "current_schedule_stop_pairs" WHERE "current_schedule_stop_pairs"."operator_id" = 1661 AND current_schedule_stop_pairs.id >= 1706528300 ORDER BY "current_schedule_stop_pairs"."id" ASC LIMIT 10000;
Time: 297.223 ms

The query param can be sort_keymin=x or similar.