IARCbioinfo/SBG-CGC_course2018

Project 1: needlestack variant calling

tdelhomme opened this issue · 4 comments

Run needlestack on TCGA data.
For a given cohort or sequencing center, run it first on a single gene, then on a whole BED file with parallelization.
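One possible way to prepare the parallel run is to cut the BED file into contiguous chunks and launch one needlestack job per chunk. The sketch below is only an illustration: `split_bed` is a hypothetical helper, not part of needlestack, and it balances chunks by number of regions, not by number of bases.

```python
from typing import List


def split_bed(bed_lines: List[str], n_chunks: int) -> List[List[str]]:
    """Split BED records into at most n_chunks contiguous, near-equal chunks.

    Hypothetical helper for illustration; not part of needlestack.
    """
    # Keep only region records: drop blank lines, comments and UCSC
    # "track"/"browser" header lines.
    records = [
        line.rstrip("\n")
        for line in bed_lines
        if line.strip() and not line.startswith(("#", "track", "browser"))
    ]
    # Never create more chunks than there are records (and at least one).
    n_chunks = max(1, min(n_chunks, len(records)))
    size, extra = divmod(len(records), n_chunks)

    chunks = []
    start = 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional record.
        end = start + size + (1 if i < extra else 0)
        chunks.append(records[start:end])
        start = end
    return chunks


if __name__ == "__main__":
    example = [
        "chr17\t7571719\t7573008\n",
        "chr17\t7573926\t7574033\n",
        "chr17\t7576852\t7576926\n",
    ]
    for i, chunk in enumerate(split_bed(example, 2)):
        print(f"chunk {i}: {len(chunk)} region(s)")
```

Because the chunks are contiguous and keep the input order, the per-chunk VCF files can later be merged in chunk order.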

Project source code and documentation are hosted here.

Todo list:

  • Create a Dockerfile
  • Run needlestack without Nextflow, using a bash script (needlestack.sh)
  • Run needlestack on tumor-normal pairs: create a txt file listing the tumor-normal pairs (use the BAM file metadata to retrieve the TCGA barcodes)
  • Parallelization: create a BED file and a script to merge the resulting VCF files
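For the tumor-normal item, here is a minimal sketch of how the pairs could be derived from TCGA barcodes. It relies on the barcode convention that the first three fields identify the participant and that the two digits of the fourth field give the sample type (01-09 tumor, 10-19 normal). The function names, the rule of pairing each tumor with the first normal of the same participant, and the two-column tab-separated layout are assumptions for illustration; the exact format needlestack expects should be checked against its documentation.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def pair_tumor_normal(bam_to_barcode: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (tumor_bam, normal_bam) pairs grouped by TCGA participant.

    Hypothetical helper for illustration; not part of needlestack.
    """
    by_patient = defaultdict(lambda: {"tumor": [], "normal": []})
    for bam, barcode in sorted(bam_to_barcode.items()):
        fields = barcode.split("-")
        # Expect at least TCGA-<TSS>-<participant>-<sample type + vial>.
        if len(fields) < 4 or not fields[3][:2].isdigit():
            continue
        patient = "-".join(fields[:3])
        sample_type = int(fields[3][:2])
        if 1 <= sample_type <= 9:
            by_patient[patient]["tumor"].append(bam)
        elif 10 <= sample_type <= 19:
            by_patient[patient]["normal"].append(bam)

    pairs = []
    for patient in sorted(by_patient):
        group = by_patient[patient]
        if not group["normal"]:
            continue  # tumor without a matched normal: no pair
        normal = group["normal"][0]  # assumption: use the first normal found
        for tumor in group["tumor"]:
            pairs.append((tumor, normal))
    return pairs


def format_pairs(pairs: List[Tuple[str, str]]) -> str:
    """Render pairs as tab-separated lines (assumed layout: tumor, normal)."""
    return "".join(f"{tumor}\t{normal}\n" for tumor, normal in pairs)


if __name__ == "__main__":
    # Made-up file names and barcodes, for demonstration only.
    demo = {
        "sampleA.bam": "TCGA-02-0001-01A-01D-0182-01",
        "sampleB.bam": "TCGA-02-0001-10A-01D-0182-01",
    }
    print(format_pairs(pair_tumor_normal(demo)), end="")
```

Participants with a tumor but no matched normal are skipped here; they could instead be collected separately if they need to be called without a matched normal.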

The needlestack Dockerfile on Docker Hub may be fine for the bash version; this needs to be checked.

We created a new Dockerfile in needlestack/dev/bin, based on the existing needlestack Dockerfile, that adds wget commands to download the R script dependencies and the hg19/hg38 chromosomeNames2UCSC.txt files.