Store historic booking data
Right now, the index.json is crawled once every night and re-deployed with GitHub Actions:
https://github.com/blu3r4y/jku-room-search/actions/workflows/deploy-github-pages.yml
It would be valuable to also persistently store and query historic data. My first idea would be to change the action so that it does not replace the entire history, but instead appends the new file to /data on the gh-pages branch:
https://github.com/blu3r4y/jku-room-search/tree/gh-pages/data
This way, we could utilize GitHub as our storage with minimal changes to the workflow.
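A minimal sketch of what such an append step could look like, if the workflow called a small Python script after the crawl. The function name `archive_snapshot` and the dated file name pattern `index-YYYY-MM-DD.json` are assumptions for illustration, not something the current workflow defines.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_snapshot(index_path: Path, data_dir: Path, day: date) -> Path:
    """Copy the freshly crawled index file into data_dir under a dated name.

    Hypothetical helper: the naming scheme is an assumption.
    """
    # Create the target folder on first use; keep existing snapshots untouched.
    data_dir.mkdir(parents=True, exist_ok=True)
    target = data_dir / f"index-{day.isoformat()}.json"
    # The original index.json stays in place, so the website keeps working.
    shutil.copyfile(index_path, target)
    return target
```

Calling `archive_snapshot(Path("index.json"), Path("data"), date.today())` once per nightly run would add one file per day instead of overwriting the previous one.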
The index.json is roughly 300 KB in size. Storing one copy per day for 365 days would require ~110 MB in uncompressed form. That is acceptable for a repository and for GitHub Pages (since the limit is 1 GB), but we should still think about removing old data at some point, or moving it to cold storage somewhere.
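The estimate above can be reproduced with a few lines of arithmetic. This sketch assumes decimal units (1 MB = 1000 KB, 1 GB = 1000 MB) and a constant file size of 300 KB; both function names are made up for illustration.

```python
def estimated_storage_mb(file_size_kb: float, days: int) -> float:
    """Uncompressed size in MB of one snapshot per day (1 MB = 1000 KB)."""
    return file_size_kb * days / 1000


def days_until_limit(file_size_kb: float, limit_mb: float) -> int:
    """Number of whole daily snapshots that fit below the given limit."""
    return int(limit_mb * 1000 // file_size_kb)


print(estimated_storage_mb(300, 365))  # 109.5
print(days_until_limit(300, 1000))     # 3333
```

Under these assumptions the 1 GB limit would only be reached after roughly nine years of daily snapshots, ignoring everything else stored on the branch.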
Actually, index.json already contains a lot of historic data and not only the data for future days. At the time of writing this, index.available has entries for all dates between 10.01.2023 and 26.09.2024. This is even stated in index.range.start and index.range.end. However, those dates only seem to be filled with data from the current semester. Take April 2023 for example, where no lectures are registered for the dates within the index file although there definitely were courses back then ;). Starting from September, however, all historic data until now seems to be stored.
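One way to check this kind of gap programmatically is to count entries per month. The sketch below does not read the real file: it assumes a simplified, hypothetical layout in which each ISO date maps to a list of bookings, which may differ from how index.available is actually structured. The sample data is made up.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def months_without_entries(bookings_by_day: Dict[str, Sequence]) -> List[str]:
    """Return the months (YYYY-MM) in which every listed day has no entries."""
    entries_per_month = defaultdict(int)
    for iso_day, entries in bookings_by_day.items():
        # The first seven characters of an ISO date are "YYYY-MM".
        entries_per_month[iso_day[:7]] += len(entries)
    return sorted(m for m, count in entries_per_month.items() if count == 0)


# Made-up sample in the assumed layout, not taken from the real index.json.
sample = {
    "2023-04-03": [],
    "2023-04-04": [],
    "2023-09-25": ["08:30-10:00"],
    "2023-09-26": [],
}
print(months_without_entries(sample))  # ['2023-04']
```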
My suggestion: restrict the date range of index.json to the current semester. Then, on the first day of a new semester, we could rename it to something like index-WS2023.json and begin filling index.json with data from the new semester. This would keep the website working without changes, and older historic data would then be available for future projects.
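A rough sketch of how the rollover could be detected. The semester boundaries used here (winter semester starting on 1 October, summer semester starting on 1 March) are an assumption and would need to be checked against the actual academic calendar; the function names are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional


def semester_label(day: date) -> str:
    """Label such as 'WS2023' or 'SS2024' (assumed start dates: 1 Oct, 1 Mar)."""
    if day.month >= 10:
        return f"WS{day.year}"
    if day.month <= 2:
        # January and February still belong to the winter semester
        # that started in the previous calendar year.
        return f"WS{day.year - 1}"
    return f"SS{day.year}"


def archive_name_on_rollover(today: date) -> Optional[str]:
    """Return the archive file name if today starts a new semester, else None."""
    previous = semester_label(today - timedelta(days=1))
    if previous != semester_label(today):
        return f"index-{previous}.json"
    return None


print(archive_name_on_rollover(date(2024, 3, 1)))  # index-WS2023.json
print(archive_name_on_rollover(date(2024, 3, 2)))  # None
```

The nightly job could call `archive_name_on_rollover(date.today())` and, when it returns a name, rename the existing index.json to that name before writing the new one.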