MODA-NYC/db-recovery-data-partnership

Cuebiq: filter cuebiq_weekly_homeswitcher.csv to NYC metro

Issue closed · 1 comment

We received a message from a DCP user pointing out that the Cuebiq HomeSwitcher data has not been updated since 1/10/2021. It appears that the GitHub Actions workflow has been running successfully every day, but the runs only create new folders containing version .txt files.

Please take a look at the S3 folder where we receive the HomeSwitcher data from Cuebiq (s3://cuebiq-dataset-nv/offline-intelligence/index=relocation/country=US/) to see if there is data there beyond 1/10. If not, let me know and I can reach out to Cuebiq. If there is data, please start looking into what issues there may be with the data pipeline.
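A rough sketch of how that check could be scripted, assuming boto3 is installed and AWS credentials with read access to the bucket are configured. The helper names `list_objects` and `objects_after` are made up for illustration, and the object upload time (`LastModified`) is used only as a proxy for the data date, since the layout of the keys below this prefix is not described here.

```python
from datetime import datetime, timezone

BUCKET = "cuebiq-dataset-nv"
PREFIX = "offline-intelligence/index=relocation/country=US/"
# Anything uploaded on or after 2021-01-11 (UTC) counts as "beyond 1/10".
CUTOFF = datetime(2021, 1, 11, tzinfo=timezone.utc)


def objects_after(objects, cutoff=CUTOFF):
    """Return keys of objects modified on or after cutoff, newest first.

    `objects` is an iterable of dicts with "Key" and "LastModified",
    the shape boto3 returns in the "Contents" list of list_objects_v2.
    """
    newer = [o for o in objects if o["LastModified"] >= cutoff]
    newer.sort(key=lambda o: o["LastModified"], reverse=True)
    return [o["Key"] for o in newer]


def list_objects(bucket=BUCKET, prefix=PREFIX):
    """Yield every object under the prefix (hypothetical helper; needs boto3)."""
    import boto3  # third-party; imported here so objects_after works without it

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        yield from page.get("Contents", [])


# Usage, with credentials configured:
#   for key in objects_after(list_objects()):
#       print(key)
```

If this prints nothing, there is probably no new delivery from Cuebiq; if it prints keys, the gap is more likely in our pipeline.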

Cuebiq HomeSwitcher data on SharePoint
Cuebiq recipes

@SteveScott To reduce the size of cuebiq_weekly_homeswitcher.csv, please edit the data pipeline to filter the file to rows where either the origin or destination county is within the NYC metro.

I've attached a file below with the FIPS codes of each county in the New York–Newark, NY–NJ–CT–PA Combined Statistical Area. I'm suggesting that we filter for rows that include one of these FIPS codes in either new_fips_county or old_fips_county.
NYC CSA.xlsx
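A minimal sketch of the proposed filter, using only the Python standard library. The FIPS set below is an illustrative subset (the five NYC boroughs) and would need to be replaced with the full list from NYC CSA.xlsx. The function names are hypothetical, not the pipeline's existing code. Codes are zero-padded to five characters before comparison, on the assumption that leading zeros may have been dropped somewhere upstream.

```python
import csv

# Illustrative subset only: Bronx, Kings, New York, Queens, Richmond.
# The full CSA list should be loaded from the attached NYC CSA.xlsx.
NYC_CSA_FIPS = {"36005", "36047", "36061", "36081", "36085"}


def normalize_fips(value):
    """Return a zero-padded 5-character county FIPS string."""
    return (value or "").strip().zfill(5)


def filter_csv(src, dst, fips_codes=NYC_CSA_FIPS):
    """Copy rows from src to dst where either county is in fips_codes.

    src and dst are open text file objects. Returns the number of rows kept.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    kept = 0
    for row in reader:
        new_county = normalize_fips(row.get("new_fips_county"))
        old_county = normalize_fips(row.get("old_fips_county"))
        if new_county in fips_codes or old_county in fips_codes:
            writer.writerow(row)
            kept += 1
    return kept


# Usage (the output file name is a placeholder):
#   with open("cuebiq_weekly_homeswitcher.csv", newline="") as src, \
#        open("cuebiq_weekly_homeswitcher_nyc.csv", "w", newline="") as dst:
#       filter_csv(src, dst)
```

Streaming row by row keeps memory use flat regardless of file size, which seems appropriate given that the motivation is the size of the file.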