Keep-Current/web-miner

Add daily-caching for the arxiv query service

liadmagen opened this issue · 0 comments

arXiv.org updates its papers once a day. Sending multiple requests in a short period of time wastes arXiv server resources by re-querying data that has not changed.

A solution would be to cache the results daily, so that new requests don't need to hit arXiv again and therefore execute faster.

The cache should contain a timestamp. It can be saved as a file on disk (.json) that expires after 24 hours. When a new request arrives, it checks whether the file exists and deletes it if it is older than 24 hours, so that the data is fetched from arXiv again. If the file is newer than that, its content should be read and sent back to the client.
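To make the proposal concrete, here is a minimal sketch of the check-expire-read flow in Python. It is not taken from the existing code base: the function name `get_cached_or_fetch`, the `fetch` callable that stands in for the real arXiv query, and the `{"timestamp": ..., "results": ...}` file layout are all assumptions made for illustration.

```python
import json
import os
import time

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24 hours, as proposed in this issue


def get_cached_or_fetch(cache_path, fetch, now=None, ttl=CACHE_TTL_SECONDS):
    """Return cached results if the cache file is fresh, else refetch.

    `fetch` is a hypothetical zero-argument callable standing in for the
    real arXiv query; it must return JSON-serializable data.
    `now` can be injected (seconds since the epoch) to ease testing.
    """
    now = time.time() if now is None else now

    if os.path.exists(cache_path):
        try:
            with open(cache_path, encoding="utf-8") as f:
                cached = json.load(f)
            # Fresh cache: serve the stored results without querying arXiv.
            if now - cached["timestamp"] < ttl:
                return cached["results"]
        except (ValueError, KeyError, TypeError):
            # Corrupt or unexpected file content: treat it as expired.
            pass
        # Expired or unreadable cache: delete it before refetching.
        os.remove(cache_path)

    results = fetch()
    # Store the results together with the time they were fetched.
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump({"timestamp": now, "results": results}, f)
    return results
```

A caller would pass the path of the cache file and a function that performs the actual query, for example `get_cached_or_fetch("arxiv_cache.json", query_arxiv)`, where `query_arxiv` is a placeholder name. Storing the timestamp inside the file, rather than relying on the file's modification time, keeps the expiry check independent of the file system; either approach would satisfy the requirement above.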

This is a good issue for making architecture decisions about where each piece of code should live.