WIP: This is still in development, but should be relatively stable by now.
This pipeline runs featureCounts on cancer datasets from ICGC and generates count matrices, similar to what nf-core/RNAseq does. Users specify an ICGC manifest file containing object IDs, which are then converted to encrypted S3 URLs. The pipeline uses the provided GTF file to generate count matrices for all files listed in the manifest.
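As a rough illustration of the first step (collecting object IDs from a manifest), here is a minimal Python sketch. It is not the pipeline's own code. It assumes the manifest is tab-separated with a header row and a column named `object_id`; the function name `read_object_ids` and the example rows are hypothetical, so check them against your actual manifest file.

```python
import csv
import io
from typing import Iterable, List


def read_object_ids(lines: Iterable[str], column: str = "object_id") -> List[str]:
    """Return the non-empty values of `column` from a tab-separated manifest.

    Assumption: the manifest has a header row that names the column.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise ValueError(f"manifest has no '{column}' column")
    # Skip rows where the column is missing or blank.
    return [row[column].strip() for row in reader if (row[column] or "").strip()]


# Hypothetical two-row manifest, for illustration only.
EXAMPLE_MANIFEST = (
    "file_id\tobject_id\tfile_name\n"
    "FI0001\tobj-aaaa\tsample_a.bam\n"
    "FI0002\tobj-bbbb\tsample_b.bam\n"
)

object_ids = read_object_ids(io.StringIO(EXAMPLE_MANIFEST))
print(object_ids)
```

To read a real file, pass an open file handle instead of the `io.StringIO` wrapper, for example `read_object_ids(open("manifest.tsv", newline=""))`, where `manifest.tsv` is a placeholder path.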
The pipeline is built using Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It comes with Docker / Singularity containers, making installation trivial and results highly reproducible.
The ICGC-FeatureCounts pipeline comes with documentation, found in the docs/ directory:
- Installation
- Pipeline configuration
- Running the pipeline
- Output and how to interpret the results
- Troubleshooting
This pipeline was written by Alexander Peltzer (apeltzer) at QBiC.