icgc-argo/argo-docs

ICGC Legacy Data - TSV with legacy object ID mapping to current repositories


Supporting documentation to help users of the ICGC 25K portal find relevant data references after the portal's shutdown later this month.

We have completed a data audit that maps ICGC legacy portal file object IDs to the locations of those files in their current repositories (EGA, PDC, etc.). We want to convert this mapping into a form that can be easily shared and consumed by researchers looking for these files in those repositories.

Detailed Description

Convert our mapping of ICGC 25K file objects from its current form into a shareable file containing the data relevant to researchers looking for these files. The output should be a TSV file with the following fields:

  • ICGC 25K Object ID
  • List of repositories where it can be found
  • Location in each repository
  • ICGC ARGO ID if it exists
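As a rough illustration of the layout described above, here is a minimal sketch of writing such a TSV. The column names, the `;` separator for multi-valued fields, and the sample record are all assumptions made for this example; the issue lists the fields but does not specify headers or a format for multiple repositories.

```python
import csv
import io

# Hypothetical column names; the issue names the fields but not the headers.
COLUMNS = ["icgc_25k_object_id", "repositories", "locations", "icgc_argo_id"]


def write_mapping_tsv(records, out):
    """Write one row per legacy object.

    Multi-valued fields (repositories, locations) are joined with ';'
    in matching order. This separator is an assumption of this sketch.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for rec in records:
        repos = sorted(rec["locations"])  # stable repository order
        writer.writerow([
            rec["object_id"],
            ";".join(repos),
            ";".join(rec["locations"][repo] for repo in repos),
            rec.get("argo_id") or "",  # empty when no ARGO ID exists
        ])


# Made-up record, for illustration only.
records = [
    {
        "object_id": "example-object-id",
        "locations": {"EGA": "example-ega-file-id", "PDC": "example/pdc/path"},
        "argo_id": None,
    },
]

buf = io.StringIO()
write_mapping_tsv(records, buf)
print(buf.getvalue())
```

Keeping one row per object ID makes lookups by legacy ID simple; a one-row-per-location layout would be an equally reasonable alternative.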

Attached is the list of objects with their locations: EGA, PDC, GDC, SFTP.

Note: for SFTP locations, a placeholder URL and the SFTP directory location are used.

object_locations.tsv.gz
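
For anyone consuming the attachment, a lookup by legacy object ID might look like the sketch below. It assumes the file is a gzip-compressed TSV with a header row; the `object_id` column name is a guess and should be replaced with the actual header in the file.

```python
import csv
import gzip


def find_object(path, object_id, id_column="object_id"):
    """Return the first row whose ID column matches object_id, or None.

    id_column is a hypothetical header name; check the real file's header.
    """
    # "rt" decompresses and decodes to text so csv can read it directly.
    with gzip.open(path, "rt", newline="") as fh:
        reader = csv.DictReader(fh, delimiter="\t")
        for row in reader:
            if row.get(id_column) == object_id:
                return row
    return None


# Example usage (the ID here is a made-up value):
# row = find_object("object_locations.tsv.gz", "example-object-id")
# print(row)
```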