BlockchainCommons/spotbit

Backfilling for fixing gaps in data

watersnake1 opened this issue · 1 comments

While Spotbit is down, it receives no data from any exchange. Significant downtime therefore leaves gaps in the stored datasets. A system that automatically detects large gaps in a dataset and fills them in would be helpful.

The Spotbit source (server.py) already contains some functions that will be helpful for this:
find_gaps:
Finds gaps in an exchange's database back to historyEnd and returns them as a list of tuples.
request_history:
Fetches the complete historical data for an exchange over a given time interval, in milliseconds.
start_date is the oldest date.
end_date is the newest date.
backfill:
Calls request_history for each missing period reported by find_gaps.
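
As a rough illustration of how gap detection and backfilling fit together, here is a minimal sketch. It is not the code in server.py: the names find_gaps_sketch and backfill_sketch, their parameters, and the assumption that a gap is a (start_ms, end_ms) tuple are all hypothetical, and fetch is only a stand-in for request_history, whose real signature is not shown in this issue.

```python
from typing import Callable, Iterable, List, Tuple

Gap = Tuple[int, int]  # assumed shape: (start_ms, end_ms)


def find_gaps_sketch(timestamps_ms: Iterable[int],
                     interval_ms: int,
                     history_end_ms: int) -> List[Gap]:
    """Return spans between stored timestamps that exceed interval_ms.

    Hypothetical stand-in for find_gaps. history_end_ms is the oldest
    point we look back to; rows older than that are ignored.
    """
    ts = sorted(t for t in timestamps_ms if t >= history_end_ms)
    gaps: List[Gap] = []
    if not ts:
        # No rows at all: nothing to compare, so no gaps are reported here.
        return gaps
    # Gap between the look-back limit and the first stored row.
    if ts[0] - history_end_ms > interval_ms:
        gaps.append((history_end_ms, ts[0]))
    # Gaps between consecutive stored rows.
    for prev, cur in zip(ts, ts[1:]):
        if cur - prev > interval_ms:
            gaps.append((prev, cur))
    return gaps


def backfill_sketch(gaps: Iterable[Gap],
                    fetch: Callable[[int, int], list]) -> list:
    """Call fetch(start_ms, end_ms) once per gap and collect the rows.

    fetch is a placeholder for request_history; inserting the rows into
    the database is left to the caller.
    """
    rows: list = []
    for start_ms, end_ms in gaps:
        rows.extend(fetch(start_ms, end_ms))
    return rows
```

For example, with one-minute candles stored at 0, 60000, 120000, 600000 and 660000 ms, an interval of 60000 ms and a look-back limit of 0, find_gaps_sketch returns [(120000, 600000)].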

These methods can work together to find and download missing pieces of historical data, then insert them into the database. The challenge is getting them to run periodically in the background without interrupting the normal functioning of the system.
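
One possible way to run a backfill pass periodically without blocking the main server is a daemon thread that sleeps between passes. The sketch below uses only the Python standard library. The class name PeriodicBackfill and its parameters are assumptions for illustration, and job stands for whatever callable performs one find_gaps/backfill pass.

```python
import threading
from typing import Callable


class PeriodicBackfill:
    """Run job() every interval_s seconds on a background daemon thread.

    Hypothetical helper, not part of Spotbit.
    """

    def __init__(self, job: Callable[[], None], interval_s: float) -> None:
        self._job = job
        self._interval_s = interval_s
        self._stop = threading.Event()
        # daemon=True so the thread never keeps the server process alive.
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        # Wake the worker if it is waiting, then wait for it to exit.
        self._stop.set()
        self._thread.join()

    def _run(self) -> None:
        # Event.wait returns False on timeout and True once stop() is called,
        # so the loop runs one pass per interval until stopped.
        while not self._stop.wait(self._interval_s):
            try:
                self._job()
            except Exception as exc:
                # A failed pass should not end the loop; retry next interval.
                print(f"backfill pass failed: {exc}")
```

A caller would construct it once at startup, for example PeriodicBackfill(run_one_pass, 3600).start(), where run_one_pass is a hypothetical function wrapping the backfill logic. If the job writes to the same database as the live fetchers, it would likely need its own connection or a lock, which this sketch does not address.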

The current refactor of Spotbit doesn't use a database but fetches directly from configured exchanges. See https://spotbit.info/docs for the updated API.