saezlab/OmnipathR

Link rot causes OmnipathR functions to stop working

slowkow opened this issue · 1 comment

A third party changed their URL, and this causes OmnipathR to throw an error:

library(OmnipathR)
networks <- nichenet_networks()
[2022-09-21 15:17:56] [SUCCESS] [OmnipathR] Building NicheNet network knowledge
[2022-09-21 15:17:56] [SUCCESS] [OmnipathR] Starting to build NicheNet signaling network
[2022-09-21 15:18:02] [SUCCESS] [OmnipathR] Downloaded 77073 interactions.
[2022-09-21 15:18:13] [SUCCESS] [OmnipathR] Harmonizome (maayanlab.cloud): downloaded 6013 records
[2022-09-21 15:18:13] [SUCCESS] [OmnipathR] Harmonizome (maayanlab.cloud): downloaded 12161 records
[2022-09-21 15:18:13] [SUCCESS] [OmnipathR] Harmonizome (maayanlab.cloud): downloaded 819 records
[2022-09-21 15:18:14] [WARN]    [OmnipathR] Failed to download `https://stke.sciencemag.org/content/sigtrans/suppl/2011/09/01/4.189.rs8.DC1/4_rs8_Tables_S1_S2_and_S6.zip` (attempt 1/3); error: HTTP error 503.
[2022-09-21 15:18:16] [WARN]    [OmnipathR] Failed to download `https://stke.sciencemag.org/content/sigtrans/suppl/2011/09/01/4.189.rs8.DC1/4_rs8_Tables_S1_S2_and_S6.zip` (attempt 2/3); error: HTTP error 503.
[2022-09-21 15:18:17] [ERROR]   [OmnipathR] Failed to download `https://stke.sciencemag.org/content/sigtrans/suppl/2011/09/01/4.189.rs8.DC1/4_rs8_Tables_S1_S2_and_S6.zip` (attempt 3/3); error: HTTP error 503.
Error in download_base(url = url, fun = curl_download, destfile = version$path,  :
  Failed to download `https://stke.sciencemag.org/content/sigtrans/suppl/2011/09/01/4.189.rs8.DC1/4_rs8_Tables_S1_S2_and_S6.zip` (attempt 3/3); error: HTTP error 503.

This file seems to have the broken link:

"vinayagam": [
"https://www.science.org/action/downloadSupplement?",
"doi=10.1126%%2Fscisignal.2001699&",
"file=4_rs8_tables_s1_s2_and_s6.zip"
],
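For reference, here is a minimal sketch of how the three parts of this entry would combine into one URL. It assumes the parts are simply concatenated and then passed through `sprintf()`, which would explain the doubled `%%`; I have not confirmed that this is exactly what the package does internally.

```r
# Parts copied from the "vinayagam" entry above.
parts <- c(
  "https://www.science.org/action/downloadSupplement?",
  "doi=10.1126%%2Fscisignal.2001699&",
  "file=4_rs8_tables_s1_s2_and_s6.zip"
)

# Assumption: the parts are concatenated and then formatted with sprintf(),
# which turns the escaped "%%" into a literal "%".
url <- sprintf(paste0(parts, collapse = ""))
print(url)
```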

The URL in urls.json seems to redirect to this old URL, which no longer works:

https://stke.sciencemag.org/content/sigtrans/suppl/2011/09/01/4.189.rs8.DC1/4_rs8_Tables_S1_S2_and_S6.zip

Instead, it should take us to this new URL, which works as of September 21, 2022:

https://www.science.org/doi/suppl/10.1126/scisignal.2001699/suppl_file/4_rs8_tables_s1_s2_and_s6.zip

Each time a third party changes their URLs, we can expect some OmnipathR function to break.
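A periodic check could catch this kind of breakage before users do. The following is only a rough sketch, not anything that exists in OmnipathR: `check_links` is a hypothetical helper, and it assumes the `curl` package is installed. It records the final URL after redirects and the HTTP status for each link, and the `fetch` argument can be swapped out so the logic can be tried without network access.

```r
# Hypothetical helper: report the final URL and HTTP status for each link.
# `fetch` is injectable; by default it uses curl::curl_fetch_memory(), which
# returns a list that includes `url` (after redirects) and `status_code`.
check_links <- function(urls, fetch = function(u) curl::curl_fetch_memory(u)) {
  rows <- lapply(urls, function(u) {
    # A connection-level failure (DNS, timeout, ...) is recorded as NA.
    res <- tryCatch(fetch(u), error = function(e) NULL)
    data.frame(
      url = u,
      final_url = if (is.null(res)) NA_character_ else res$url,
      status = if (is.null(res)) NA_integer_ else as.integer(res$status_code),
      stringsAsFactors = FALSE
    )
  })
  out <- do.call(rbind, rows)
  # A link counts as healthy only if the request returned a 2xx status.
  out$ok <- !is.na(out$status) & out$status >= 200L & out$status < 300L
  out
}

# Example call (needs network access, so it is left commented out):
# check_links(c(
#   "https://www.science.org/doi/suppl/10.1126/scisignal.2001699/suppl_file/4_rs8_tables_s1_s2_and_s6.zip"
# ))
```

Run on a schedule against every entry in urls.json, a check like this would flag any row where `ok` is `FALSE`.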

Could you please consider downloading each dependency yourselves and uploading it to a permanent repository (e.g. Zenodo) for future researchers? Permanent archival would ensure that your package does not throw errors the next time a third party changes its URLs.
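One way this could look in practice is a download routine that tries the original source first and falls back to an archived copy. This is a minimal sketch under stated assumptions: `download_with_fallback` is a hypothetical name, the archive URL in the usage comment is a placeholder rather than a real record, and the default downloader is base R's `utils::download.file()`.

```r
# Hypothetical helper: try each candidate URL in order and return the first
# one that downloads successfully. `download` is injectable for testing and
# is expected to return 0 on success, as utils::download.file() does.
download_with_fallback <- function(urls, destfile,
                                   download = function(u, dest) {
                                     utils::download.file(u, dest, mode = "wb", quiet = TRUE)
                                   }) {
  for (u in urls) {
    ok <- tryCatch({
      status <- download(u, destfile)
      identical(as.integer(status), 0L)
    },
    # Treat both errors and warnings (e.g. HTTP 503) as a failed attempt.
    error = function(e) FALSE,
    warning = function(w) FALSE)
    if (ok) return(u)
  }
  stop("All download locations failed: ", paste(urls, collapse = ", "))
}

# Example call (needs network access; the second URL is a placeholder):
# download_with_fallback(
#   c("https://www.science.org/doi/suppl/10.1126/scisignal.2001699/suppl_file/4_rs8_tables_s1_s2_and_s6.zip",
#     "https://example.org/archive/4_rs8_tables_s1_s2_and_s6.zip"),
#   destfile = tempfile(fileext = ".zip")
# )
```

The return value is the URL that actually worked, so a log message can show when the archived copy was used.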

According to Wikipedia:

A 2003 study found that on the Web, about one link out of every 200 broke each week,[1] suggesting a half-life of 138 weeks.
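The half-life figure follows from the weekly break rate if we assume each link breaks independently with the same probability every week (a simplifying assumption on my part, not something stated in the quote). A quick check in R:

```r
weekly_break_rate <- 1 / 200

# Each week a link survives with probability (1 - rate).
# Solve (1 - rate)^t = 0.5 for t to get the half-life in weeks.
half_life_weeks <- log(0.5) / log(1 - weekly_break_rate)

round(half_life_weeks)  # 138
```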

Thank you for the tremendous contribution to the research community! It's truly magical to have one central package for accessing so many sources of information. It would be a shame to have all of this effort wasted due to link rot.

Thanks a lot Kamil, very helpful! This is a very relevant and ubiquitous issue. Thanks also for checking our main readme. We will try to use your tool for automated checks, think about how to archive the static resources, and of course fix the broken links that you've just found.

Best,

Denes