Provide Access to Latest TCGA Data
Closed this issue · 3 comments
The available data was processed before the mid-2016 harmonisation, which uses a newer reference genome and different analytic algorithms. The Genomic Data Commons data has also changed substantially since last year. For example, four months ago the mutation data was reprocessed to remove Oxo-G artefacts caused by the exome-seq kit.
This is a good comment and a worthwhile change. Unfortunately, I expect it will take a while to migrate the underlying downloader from RTCGAToolbox to the GenomicDataCommons library, since I assume a different set of quirks and inconsistencies will come up when integrating the data. @LiNk-NY, have you tried out Sean's GenomicDataCommons library (I mean just basic use, not as the curatedTCGAData downloader)?
I haven't really tried to download a whole set of MultiAssayExperiment datasets, but I can see what the package has to offer.
This issue is outside the scope of the package as it is currently. We'd have to write an interface to GenomicDataCommons and avoid conflating the two data sources.