Backfilling stuck at 2023-05-11
paramono opened this issue · 6 comments
I am running reservoir-sync-node to backfill sales data. After a while, it stopped making new inserts, and the list of managers and workers which is usually output to console is now empty.
I inspected the sales table and found this:
SELECT COUNT(*) FROM sales;
returns 17689647, although the docs state that there should be 76M+ sales
SELECT MAX(created_at) FROM sales;
returns 2023-05-11 08:21:36.488
SELECT MAX(updated_at) FROM sales;
returns 2023-05-11 08:21:53.639
SELECT MAX(timestamp) FROM sales;
returns 1680307199, which corresponds to 2023-03-31 23:59:59
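The epoch conversion above can be double-checked with a short Python sketch. It assumes the timestamp column stores Unix seconds and that created_at is in UTC; neither is stated explicitly in this thread.

```python
from datetime import datetime, timezone

# Value returned by SELECT MAX(timestamp) FROM sales; in the report above.
max_timestamp = 1680307199

# Assumption: the column holds Unix seconds, interpreted here as UTC.
latest_sale = datetime.fromtimestamp(max_timestamp, tz=timezone.utc)
latest_sale_text = latest_sale.strftime("%Y-%m-%d %H:%M:%S")

# Value returned by SELECT MAX(created_at) FROM sales; assumed to be UTC.
latest_insert = datetime(2023, 5, 11, 8, 21, 36, 488000, tzinfo=timezone.utc)

# Gap between the newest sale on record and the moment inserts stopped.
gap = latest_insert - latest_sale

print(latest_sale_text)
print(gap.days)
```

Under those assumptions, the newest sale in the table is about 40 days older than the last insert, so the sync stopped well before it reached the present date.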
I also checked what is stored in Redis:
hget sales backup
returns "{\"date\":\"2022-09-02\",\"managers\":[]}"
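A minimal sketch of decoding that reply in Python, to make the stored state easier to read. The interpretation of the fields is an assumption and is not confirmed anywhere in this thread: "date" is taken to be the day the backfill had reached, and "managers" the managers that were still active when the backup was written.

```python
import json

# The hget reply with redis-cli's outer quoting and backslash escapes removed.
raw_backup = '{"date":"2022-09-02","managers":[]}'

backup = json.loads(raw_backup)

# Assumed meaning: the day the backfill checkpoint refers to.
backup_date = backup["date"]

# Assumed meaning: managers recorded in the backup; an empty list would be
# consistent with the empty managers/workers table in the console.
has_managers = len(backup["managers"]) > 0

print(backup_date, has_managers)
```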
My current configuration:
REDIS_URL set
USE_BACKUP=1
SYNC_SALES=1
SYNC_ASKS=0
CONTRACTS, WORKER_COUNT, MANAGER_COUNT and other vars are not set
Is there any reason why it stopped at this specific date? What can I do to backfill all the data?
@paramono Pushed some fixes. Caught some small errors that came up due to some API changes. Let me know if you are still facing these issues after pulling from latest and restarting.
@r3lays what should I do now to properly restart and test this? Should I manually insert some redis key? Or do you suggest restarting entirely from scratch?
@paramono Since you didn't get too far due to the issue, I'd suggest clearing your backup and restarting.
@r3lays thank you, I think the fixes improved the situation.
I cleared my backup and ran the sync node again. This time, it started gathering data from ~2018 and was gradually progressing towards the present date.
Looks like it gathered all the data it could, and the table of managers and workers displayed in the console is now empty.
However, the sync node collected 39157486 sales.
Does that sound right? [The docs say](https://docs.reservoir.tools/docs/syncing-sales#parallel-data-processing) there should be 76M+ sales.
Is anything missing?
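To put the two row counts in perspective, here is a rough Python sketch of the coverage arithmetic. It treats the docs' "76M+" as exactly 76,000,000, which is an assumption and only a lower bound, so the real shortfall may be larger.

```python
# "76M+" from the docs, treated as a lower bound (assumption).
DOCS_TOTAL = 76_000_000

# Row counts reported in this thread.
first_run = 17_689_647   # before the fixes
second_run = 39_157_486  # after clearing the backup and re-running


def coverage_percent(count: int, total: int = DOCS_TOTAL) -> float:
    """Return the share of the documented total that was synced, in percent."""
    return round(count / total * 100, 1)


first_pct = coverage_percent(first_run)
second_pct = coverage_percent(second_run)
missing_rows = DOCS_TOTAL - second_run

print(first_pct, second_pct, missing_rows)
```

By this estimate the second run covers roughly half of the documented total, leaving at least about 36.8M sales unaccounted for.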
@r3lays also, the most recent sales remained the same:
SELECT MAX(created_at) FROM sales;
returns 2023-05-11 08:21:36.488
SELECT MAX(updated_at) FROM sales;
returns 2023-05-11 08:21:53.639
SELECT MAX(timestamp) FROM sales;
returns 1680307199, which corresponds to 2023-03-31 23:59:59
It should have collected until the present date, right?