pllittle/UNMASC

About the annotation

Closed this issue · 3 comments

Hi Paul,
I am now running some test samples. For the VEP annotation, you only use --af and gnomad_AF, but your paper and code mention ExAC_AF. Does --af mean the 1000 Genomes AF? And do I need to add both the ExAC and 1000 Genomes AFs when annotating the VCFs?
Another issue is that the Strelka2 and annotation steps take quite a long time. Can I filter out low-quality or low-depth candidate mutations from Strelka2 so that VEP runs faster? In my test case, I found 50k+ "PASS" somatic mutations from Strelka2.

Thanks,
FN

Hi @hfl112,

You're correct, we used 1000 Genomes and ExAC for benchmarking because the two databases sometimes give inconsistent AFs for some loci. In the VEP code provided, "--af" corresponds to 1000 Genomes, and "gnomad" is a larger database that includes ExAC, so hopefully it's more comprehensive.

Strelka2 does indeed take some time to run, and using multiple threads/cores noticeably decreases the runtime (e.g. 1 core vs. 5 cores). For VEP annotation, the key is to take the unique union of loci called across a tumor's VCFs and then annotate that single VCF. Otherwise, the same locus is redundantly annotated in different VCFs. I haven't provided the code for that intermediate step, but I'll work on including it in a comprehensive script in the near future.
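
As a rough illustration of the "unique union of loci" step, here is a minimal base R sketch. It is not part of UNMASC: the function names `read_vcf_loci` and `union_vcf_loci` are hypothetical, and the sketch assumes tab-delimited VCFs with the standard CHROM, POS, ID, REF, ALT column order. It keeps only the locus columns, treats a multi-allelic ALT as a single string, and writes a minimal VCF that could then be passed to VEP once.

```r
# Hypothetical helper (not an UNMASC function): read the locus columns of one VCF.
read_vcf_loci = function(vcf_fn){
  lines = readLines(vcf_fn)
  # Drop meta lines ("##...") and the column header ("#CHROM...")
  lines = lines[!grepl("^#", lines) & nzchar(lines)]
  if( length(lines) == 0 ){
    return(data.frame(CHROM = character(0), POS = integer(0),
      REF = character(0), ALT = character(0), stringsAsFactors = FALSE))
  }
  fields = strsplit(lines, "\t", fixed = TRUE)
  data.frame(
    CHROM = vapply(fields, `[`, "", 1),
    POS = as.integer(vapply(fields, `[`, "", 2)),
    REF = vapply(fields, `[`, "", 4),
    ALT = vapply(fields, `[`, "", 5),
    stringsAsFactors = FALSE)
}

# Hypothetical helper: unique union of loci across a tumor's VCFs,
# written to one minimal VCF for a single annotation run.
union_vcf_loci = function(vcf_fns, out_fn){
  loci = unique(do.call(rbind, lapply(vcf_fns, read_vcf_loci)))
  # Sort by contig (in order of first appearance), then by position
  ord = order(match(loci$CHROM, unique(loci$CHROM)), loci$POS)
  loci = loci[ord, , drop = FALSE]
  dot = rep(".", nrow(loci))
  out = data.frame(loci$CHROM, loci$POS, dot, loci$REF, loci$ALT,
    dot, dot, dot, stringsAsFactors = FALSE)
  writeLines(c("##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"), out_fn)
  write.table(out, out_fn, append = TRUE, sep = "\t", quote = FALSE,
    row.names = FALSE, col.names = FALSE)
  invisible(loci)
}
```

The annotated output can then be joined back to each original VCF on CHROM, POS, REF and ALT, so every locus is annotated only once per tumor.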

Hope this helps!

Many thanks~

Hi @pllittle,
I got an error when running prep_UNMASC_VCF:

> FILTER = list(nDP = 2,tDP = 2,Qscore = 3)
> vcf = prep_UNMASC_VCF(fout,DAT,FILTER,target_fn,anno_fn,4)

Error in import_VCFs(DAT = DAT, FILTER = FILTER, ncores = ncores) :
  Issue with FILTER$nDP

I think the problem is here:

FILTER = list(nDP = 2,tDP = 2,Qscore = 3,
  contigs = sprintf("chr%s",c(1:22,"X","Y")))
if( is.null(FILTER$nDP) || is.numeric(FILTER$nDP) )
  stop("Issue with FILTER$nDP")
if( is.null(FILTER$tDP) || is.numeric(FILTER$tDP) )
  stop("Issue with FILTER$tDP")
if( is.null(FILTER$Qscore) || is.numeric(FILTER$Qscore) )
  stop("Issue with FILTER$Qscore")
if( is.null(FILTER$contigs) || is.character(FILTER$contigs) )
  stop("Issue with FILTER$contigs")
Error: Issue with FILTER$nDP
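
In the checks as pasted, each condition is TRUE whenever the value has the expected type, so a valid FILTER always reaches stop(). The type tests were presumably meant to be negated. The sketch below shows that reading; `check_FILTER` is a hypothetical wrapper written for illustration, not the package's actual code or fix.

```r
# Hypothetical re-statement of the checks with the type tests negated:
# stop only when an element is missing or has the wrong type.
check_FILTER = function(FILTER){
  for(nm in c("nDP","tDP","Qscore")){
    if( is.null(FILTER[[nm]]) || !is.numeric(FILTER[[nm]]) )
      stop(sprintf("Issue with FILTER$%s", nm))
  }
  if( is.null(FILTER$contigs) || !is.character(FILTER$contigs) )
    stop("Issue with FILTER$contigs")
  invisible(TRUE)
}

FILTER = list(nDP = 2, tDP = 2, Qscore = 3,
  contigs = sprintf("chr%s", c(1:22,"X","Y")))
check_FILTER(FILTER) # passes silently for a well-formed FILTER
```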