Downloading the HMDA data
SantoshSrinivas79 opened this issue · 14 comments
Is there a way to create a torrent so the data can be downloaded more reliably for offline analysis? I am trying to download the entire dataset available at The Home Mortgage Disclosure Act - Explore.
The data is about 72.3 GB. There is absolutely no way of downloading it. The download times out, and it fails with a download manager too.
Please advise.
@fountainhead Thanks for your question. I can't speak for the @cfpb about their definite plans, but I think that putting up the entire data set as static files on S3 or something similar is planned. I will check in and see what I can do to make that happen.
Thanks. That would help.
Any luck with this request?
We've found an issue with our deployment that is preventing large downloads. It should be fixed soon.
The fix for this has been deployed, so feel free to give large downloads another try.
Doesn't work! The download stopped after 2 GB for the 2012 data (estimated at about 12 GB).
Can you guys please move the data to S3?
@cndreisbach Could you please reopen the issue? The download is still timing out.
Can you help host the data on a service that allows resumable downloads?
@fountainhead I just wanted to let you know we haven't forgotten about you. I'm preparing an export of the HMDA data by year to flat files, which we'll put on S3. I'll let you know when that is done.
For future issues directly with api.consumerfinance.gov, the cfpb/api repo is probably the best place to file issues.
Thanks @cndreisbach. I look forward to working with the data. Please try to include zipped files in the download section and allow resuming.
Thank you.
@fountainhead You can find the last five years of data now, served through S3, so resuming will work fine. Here are the links:
http://files.consumerfinance.gov/hmda/hmda_lar-2012.csv.gz
http://files.consumerfinance.gov/hmda/hmda_lar-2011.csv.gz
http://files.consumerfinance.gov/hmda/hmda_lar-2010.csv.gz
http://files.consumerfinance.gov/hmda/hmda_lar-2009.csv.gz
http://files.consumerfinance.gov/hmda/hmda_lar-2008.csv.gz
And an MD5 checksum: http://files.consumerfinance.gov/hmda/hmda_lar.md5
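For anyone scripting this, here is a minimal Python sketch of a resumable download plus an MD5 check, using only the standard library. It assumes the server honors HTTP Range requests (as S3 normally does) and that hmda_lar.md5 uses the usual md5sum layout of "digest  filename" per line; I have not confirmed the actual format of that file. The helper names (lar_url, range_header, download_resumable, md5_of_file, parse_md5_manifest) are made up for illustration.

```python
import hashlib
import os
import urllib.error
import urllib.request

# Base URL taken from the links posted above.
BASE_URL = "http://files.consumerfinance.gov/hmda/"


def lar_url(year):
    """Build the URL for one year's file, following the pattern of the links above."""
    return f"{BASE_URL}hmda_lar-{year}.csv.gz"


def range_header(offset):
    """Return a Range header asking for bytes from `offset` onward (none if starting fresh)."""
    return {"Range": f"bytes={offset}-"} if offset > 0 else {}


def download_resumable(url, dest, chunk_size=1 << 20):
    """Download `url` to `dest`, continuing from a partial file if one exists."""
    offset = os.path.getsize(dest) if os.path.exists(dest) else 0
    request = urllib.request.Request(url, headers=range_header(offset))
    try:
        with urllib.request.urlopen(request) as response:
            # 206 means the server honored the Range header, so append.
            # Any other success status means a full body, so overwrite.
            mode = "ab" if response.status == 206 else "wb"
            with open(dest, mode) as out:
                while True:
                    chunk = response.read(chunk_size)
                    if not chunk:
                        break
                    out.write(chunk)
    except urllib.error.HTTPError as err:
        # 416 usually means the local file already covers the whole remote file.
        if err.code != 416:
            raise


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file without loading it all into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5_manifest(text):
    """Parse md5sum-style lines into {filename: digest}. The format is an assumption."""
    expected = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            expected[parts[1].lstrip("*")] = parts[0].lower()
    return expected
```

If a transfer is interrupted, calling download_resumable(lar_url(2012), "hmda_lar-2012.csv.gz") again should pick up from the bytes already on disk. Afterwards, compare md5_of_file("hmda_lar-2012.csv.gz") with the matching entry from the checksum file.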
@cndreisbach Would it be possible to upload the archives for 2013 through 2015 in the same way as above?
@SantoshSrinivas79 You can download a zip file for each individual year by going to http://www.consumerfinance.gov/data-research/hmda/explore, and in the "Filter" box, select the single year you wish to download. Then, toward the bottom of the screen, click the Download button. Repeat for all years in question.
There's logic under the hood that routes single-year, full-dataset requests directly to a zip file without querying the API.
@marcesher Thank you sir!
@marcesher Is there a way to do the same for this dataset too? Consumer Complaint Database > Consumer Financial Protection Bureau
An s3 zip would be great!