cluster_samples fails when running mouse WES samples
ipstone opened this issue · 3 comments
It fails at:
Started at Sat Jan 2 13:12:09 2021
Terminated at Sat Jan 2 13:12:15 2021
Results reported at Sat Jan 2 13:12:15 2021
Exited with exit code 1.
Resource usage summary:
CPU time : 10.76 sec.
Max Memory : 1 GB
Average Memory : 0.60 GB
Total Requested Memory : 16.00 GB
Delta Memory : 15.00 GB
Max Swap : -
Max Processes : 4
Max Threads : 62
Run time : 6 sec.
Turnaround time : 6 sec.
The output (if any) follows:
INFO 13:12:12,445 HelpFormatter - --------------------------------------------------------------------------------
INFO 13:12:12,447 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.1-1-gcfc45fd, Compiled 2014/03/31 11:48:54
INFO 13:12:12,447 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 13:12:12,447 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 13:12:12,450 HelpFormatter - Program Args: -S LENIENT -T UnifiedGenotyper -nt 4 -R /home/ipstone/share/reference/Mus_musculus_GRCm38/Mus_musculus.GRCm38.71.dna.chromosome.genome.fa --dbsnp /home/ipstone/share/reference/mgp.v5.merged.snps_all.dbSNP142.vcf.gz -I bam/study-sample.bam -L /home/ipstone/share/reference/dbsnp_tseq_intersect.bed -o snp_vcf/study-sample.snps.vcf --output_mode EMIT_ALL_SITES
INFO 13:12:12,454 HelpFormatter - Executing as ipstone@lt13 on Linux 3.10.0-957.12.2.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_45-b18.
INFO 13:12:12,454 HelpFormatter - Date/Time: 2021/01/02 13:12:12
INFO 13:12:12,454 HelpFormatter - --------------------------------------------------------------------------------
INFO 13:12:12,455 HelpFormatter - --------------------------------------------------------------------------------
INFO 13:12:13,015 GenomeAnalysisEngine - Strictness is LENIENT
INFO 13:12:13,091 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 250
INFO 13:12:13,098 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 13:12:13,145 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.05
INFO 13:12:14,961 GATKRunReport - Uploaded run statistics report to AWS S3
ERROR ------------------------------------------------------------------------------------------
ERROR A USER ERROR has occurred (version 3.1-1-gcfc45fd):
ERROR This means that one or more arguments or inputs in your command are incorrect.
ERROR The error message below tells you what is the problem.
It seems the cluster_samples target is using the wrong dbSNP subset in its code.
These are probably human SNPs:
ifeq ($(EXOME),true)
DBSNP_SUBSET ?= $(HOME)/share/reference/dbsnp_137_exome.bed
else
DBSNP_SUBSET = $(HOME)/share/reference/dbsnp_tseq_intersect.bed
endif
I just modified clusterSamples.mk to:
ifeq ($(EXOME),true)
#DBSNP_SUBSET ?= $(HOME)/share/reference/dbsnp_137_exome.bed
DBSNP_SUBSET ?= $(HOME)/share/reference/mus_musculus_known_genes_exons_GRCm38_noheader.bed
# -- modified subset to the mouse exome region
Will test this out once current run is done.
This modification makes cluster_samples work properly.
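To avoid hand-editing clusterSamples.mk for each species, the interval file could be chosen from a reference-genome variable. The following is only a sketch: the variable name REF and the value GRCm38 are assumptions for illustration, and the pipeline may already define something different. The file paths are the ones quoted above.

```make
# Sketch only: REF is a hypothetical variable naming the reference genome.
# Set it to GRCm38 for mouse runs; any other value falls through to the
# original (human) defaults.
REF ?= hg19

ifeq ($(REF),GRCm38)
# Mouse: restrict genotyping to the mouse exome regions
DBSNP_SUBSET ?= $(HOME)/share/reference/mus_musculus_known_genes_exons_GRCm38_noheader.bed
else ifeq ($(EXOME),true)
# Human exome: original default
DBSNP_SUBSET ?= $(HOME)/share/reference/dbsnp_137_exome.bed
else
# Human targeted panel: original default
DBSNP_SUBSET = $(HOME)/share/reference/dbsnp_tseq_intersect.bed
endif
```

Variables given on the make command line override assignments in the makefile, so passing DBSNP_SUBSET=/path/to/intervals.bed when invoking make should also work without touching the file.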