HTTPArchive/data-pipeline

Blink tables contain 20220512 data and not 20220501

tunetheweb opened this issue · 1 comments

Run these queries to see which dates each table holds:

SELECT yyyymmdd, COUNT(0) FROM `httparchive.blink_features.usage` WHERE yyyymmdd >= '20220401' GROUP BY yyyymmdd
SELECT yyyymmdd, COUNT(0) FROM `httparchive.blink_features.features` WHERE yyyymmdd >= '2022-04-01' GROUP BY yyyymmdd
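To surface only the unexpected dates, the same counts can be filtered down to rows whose date is not the first of the month. This is a rough sketch, not a tested query: it assumes every crawl is meant to be labelled with the first of the month, and it casts `yyyymmdd` to a string so the check does not depend on the column type, which may differ between the two tables.

```sql
-- Sketch: list dates that do not fall on the first of the month.
-- Assumption: valid crawl dates always end in '01' (e.g. 20220501 or 2022-05-01).
SELECT 'usage' AS table_name, CAST(yyyymmdd AS STRING) AS crawl_date, COUNT(0) AS row_count
FROM `httparchive.blink_features.usage`
WHERE yyyymmdd >= '20220401'
  AND NOT ENDS_WITH(CAST(yyyymmdd AS STRING), '01')
GROUP BY crawl_date

UNION ALL

SELECT 'features' AS table_name, CAST(yyyymmdd AS STRING) AS crawl_date, COUNT(0) AS row_count
FROM `httparchive.blink_features.features`
WHERE yyyymmdd >= '2022-04-01'
  AND NOT ENDS_WITH(CAST(yyyymmdd AS STRING), '01')
GROUP BY crawl_date

ORDER BY table_name, crawl_date
```

An empty result would mean both tables only hold first-of-month dates in that range.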

The 20220512 rows in both tables need to be cleaned up at some point.

It looks like the blink time series reports have not run yet, as I capped them at 20220501 when rerunning them earlier. HTTPArchive/bigquery#173 should also prevent the 20220512 date from being included, but it will mean 20220501 is skipped if 20220601 is run, so these tables need to be fixed before the June run.

Resolved.

Reinserted from 20220501 tables and deleted the 20220512 data.
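For reference, the deletion half of that cleanup could look roughly like the statements below. This is only a sketch of the idea, not the exact statements that were run: the date literals follow the formats used in the queries at the top of this issue, and the reinsertion is left out because the 20220501 source tables are not named here.

```sql
-- Sketch: remove the mislabelled 20220512 rows from both tables.
-- Assumption: each table stores the date in the same format as the literal
-- used to filter it in the queries at the top of this issue.
-- Destructive: check the counts with the equivalent SELECT before running.
DELETE FROM `httparchive.blink_features.usage`
WHERE yyyymmdd = '20220512';

DELETE FROM `httparchive.blink_features.features`
WHERE yyyymmdd = '2022-05-12';
```

Rerunning the two count queries afterwards should show 20220501 present and 20220512 gone in both tables.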