refresh-extdata action failing during NSSH parsing
brownag opened this issue · 1 comments
The SKB weekly update in GitHub Actions is failing because one of the NSSH sections does not download completely.
https://github.com/ncss-tech/SoilKnowledgeBase/runs/3401794971?check_suite_focus=true
I will retry the scheduled action in a bit and hope that it resolves itself (as it usually does), because the link (https://directives.sc.egov.usda.gov/41514.wba) works fine.
trying URL 'https://directives.sc.egov.usda.gov/41514.wba'
Content type 'application/pdf; charset=utf-8' length 143062 bytes (139 KB)
==========================================
downloaded 117 KB
In addition: Warning messages:
1: In download.file(y$href, destfile = pat) :
downloaded length 120224 != reported length 143062
2: In download.file(y$href, destfile = pat) :
URL 'https://directives.sc.egov.usda.gov/OpenNonWebContent.aspx?content=41514.wba': status was 'Failure when receiving data from the peer'
Quitting from lines 44-64 (README.Rmd)
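A possible stopgap for the truncated download above, as a minimal sketch rather than what the action currently does: wrap `download.file()` so that a warning (such as the length mismatch in the log) is treated as a failed attempt and retried. The names `download_with_retry`, `tries`, `wait`, and `fetch` are hypothetical and only used for illustration.

```r
# Hypothetical helper (not part of SoilKnowledgeBase): retry a download when
# it errors or when download.file() warns, e.g. about a truncated transfer.
download_with_retry <- function(url, destfile, tries = 3, wait = 5,
                                fetch = utils::download.file) {
  for (i in seq_len(tries)) {
    ok <- tryCatch({
      # download.file() returns 0 on success
      status <- fetch(url, destfile = destfile, mode = "wb", quiet = TRUE)
      identical(as.integer(status), 0L)
    },
    # A warning such as "downloaded length != reported length" counts as failure
    warning = function(w) FALSE,
    error = function(e) FALSE)

    if (ok) return(invisible(TRUE))

    # Discard any partial file before the next attempt
    if (file.exists(destfile)) unlink(destfile)
    if (i < tries) Sys.sleep(wait)
  }
  stop("download failed after ", tries, " attempts: ", url)
}
```

Usage would look something like `download_with_retry(y$href, pat)` in place of the `download.file(y$href, destfile = pat)` call shown in the warnings. The `fetch` argument exists only so the retry logic can be exercised without network access.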
Ideas:
In the near future I would like to cache these datasets and decide whether a download is necessary by comparing hashes or something similar. However, since no hashes are published on eDirectives, I am not sure I can avoid downloading at least some of the files.
Perhaps I will create a separate routine that caches and hashes infrequently updated things like taxonomy, the directives downloads, etc., so that they are only used/updated if they differ, using something like targets to manage the dependencies of data -> product (#6).
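The cache-and-hash idea could look roughly like the following sketch. It assumes the file still has to be downloaded each time (since eDirectives publishes no hashes) and only decides whether the cached copy, and anything built from it, needs refreshing. `update_if_changed`, `cache_file`, and `fetch` are hypothetical names; `tools::md5sum()` ships with base R.

```r
# Hypothetical helper: download to a temporary file, then replace the cached
# copy only if its MD5 hash differs from the fresh download.
update_if_changed <- function(url, cache_file, fetch = utils::download.file) {
  tmp <- tempfile()
  on.exit(unlink(tmp), add = TRUE)

  fetch(url, destfile = tmp, mode = "wb", quiet = TRUE)
  new_hash <- unname(tools::md5sum(tmp))
  if (is.na(new_hash)) stop("download produced no readable file: ", url)

  # No cached copy yet means there is nothing to compare against
  old_hash <- if (file.exists(cache_file)) {
    unname(tools::md5sum(cache_file))
  } else {
    NA_character_
  }

  changed <- is.na(old_hash) || !identical(new_hash, old_hash)
  if (changed) {
    dir.create(dirname(cache_file), recursive = TRUE, showWarnings = FALSE)
    file.copy(tmp, cache_file, overwrite = TRUE)
  }
  changed  # TRUE when downstream products should be rebuilt
}
```

The returned flag could then gate the NSSH parsing step, so an unchanged section is not re-parsed. A pipeline tool such as targets tracks file changes in a similar spirit, though this sketch does not use it.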