error at "extracting meta data from VRanges"
Hi,
I am testing SMuRF on a set of files I generated by running the individual callers, and I am getting an "Error in normalizeDoubleBracketSubscript". It seems that an expected data type is missing. Are there specific requirements for the input VCFs?
Thanks
Gianfilippo
Below are my command line and its output:
myresults = smurf(directory = "Variants_hg38_BWA_ensemble/Sample_G1700T_012",mode="combined",nthreads=20,output.dir="Variants_hg38_BWA_ensemble/Sample_G1700T_012",build="hg38",check.packages=T)
[1] "SMuRFv1.6 (3rd Oct 2019)"
[1] "Saving output files to: Variants_hg38_BWA_ensemble/Sample_G1700T_012"
Connection successful!
R is connected to the H2O cluster:
H2O cluster version: 3.26.0.2
H2O cluster version age: 2 months and 25 days
H2O cluster total nodes: 1
H2O cluster total memory: 26.63 GB
H2O cluster total cores: 20
H2O cluster allowed cores: 1
H2O cluster healthy: TRUE
H2O API Extensions: Amazon S3, XGBoost, Algos, AutoML, Core V3, Core V4
R Version: R version 3.5.0 (2018-04-23)
Accessing files:
Variants_hg38_BWA_ensemble/Sample_G1700T_012/mutect2.vcf.gz
Variants_hg38_BWA_ensemble/Sample_G1700T_012/freebayes.vcf.gz
Variants_hg38_BWA_ensemble/Sample_G1700T_012/varscan.vcf.gz
Variants_hg38_BWA_ensemble/Sample_G1700T_012/vardict.vcf.gz
[1] "Parsing step"
[1] "reading vcfs"
[1] "reading mutect2"
[1] "reading freebayes"
[1] "reading varscan"
[1] "reading vardict"
Time difference of 16.48991 secs
[1] "extracting calls passed by at least 1 caller"
Time difference of 0.82076 secs
[1] "extracting meta data from VRanges"
Error in normalizeDoubleBracketSubscript(i, x, exact = exact, allow.NA = TRUE, :
invalid [[ subscript type: NULL
Hi,
thanks.
I can see the sample names (tumor and normal) in each of the 4 files (Mutect2, VarScan2, VarDict, FreeBayes). They are all from the same sample, but each file uses a different name. I guess I have to fix that.
Also, VarScan used the whole file path as the sample names.
And in my FreeBayes VCF I can see an extra column that is not in your sample FreeBayes VCF. How do I get rid of it?
Thanks
Hi,
I just edited the FreeBayes VCF and made sure all sample names in the various VCFs are consistent (see the header lines below). I am still getting the exact same error.
Do you have any other thoughts?
Thanks
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample_G1700T_012 Sample_G1700N_006
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample_G1700N_006 Sample_G1700T_012
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample_G1700T_012 Sample_G1700N_006
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample_G1700N_006 Sample_G1700T_012
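Header lines like the four above can be pulled out of each file from within R. The following is only a minimal sketch: `vcf_sample_names` is a hypothetical helper written for this illustration, not a SMuRF function, and it assumes the files are gzip-compressed, tab-delimited VCFs in which the sample names start at column 10 of the `#CHROM` line.

```r
# Hypothetical helper (not part of SMuRF): return the sample columns of a VCF.
vcf_sample_names <- function(path) {
  con <- gzfile(path, open = "rt")  # gzfile() also reads uncompressed text
  on.exit(close(con))
  repeat {
    line <- readLines(con, n = 1)
    # Stop if the file ends, or the records start, before a #CHROM line.
    if (length(line) == 0 || !startsWith(line, "#")) {
      stop("no #CHROM header line found in ", path)
    }
    if (startsWith(line, "#CHROM")) break
  }
  fields <- strsplit(line, "\t", fixed = TRUE)[[1]]
  fields[-(1:9)]  # the 9 fixed columns come first; the rest are sample names
}

# List the sample names of every VCF in the directory used in this thread.
vcf_dir <- "Variants_hg38_BWA_ensemble/Sample_G1700T_012"
files <- list.files(vcf_dir, pattern = "\\.vcf\\.gz$", full.names = TRUE)
lapply(setNames(files, basename(files)), vcf_sample_names)
```

Comparing the returned vectors side by side makes it easy to spot a caller that wrote a file path or an extra column instead of the expected sample name.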
Resolving error:
Error in normalizeDoubleBracketSubscript(i, x, exact = exact, allow.NA = TRUE, :
invalid [[ subscript type: NULL
Cause:
The VCF sample names for the tumour and normal samples were not detected automatically.
Solution:
Manually state the tag that identifies your tumor sample, using t.label.
Example:
t.label='-T'
t.label='tumor'
t.label='T_001'
t.label='T' #also works for you
Error message:
't.label for tumor sample is not unique, duplicated or missing'
myresults = smurf(directory = "Variants_hg38_BWA_ensemble/Sample_G1700T_012",
                  mode="combined",
                  t.label='T_012',
                  nthreads=20,
                  output.dir="Variants_hg38_BWA_ensemble/Sample_G1700T_012",
                  build="hg38",
                  check.packages=T)
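Before rerunning, a candidate t.label can be checked against the sample names from the VCF headers. This is a rough sketch under a stated assumption: it treats a label as usable when it occurs, case-sensitively, in exactly one sample name, which is what the error message above suggests. SMuRF's actual matching rule is not shown in this thread, and `label_is_unique` is a hypothetical helper, not a SMuRF function.

```r
# Hypothetical check (an assumption, not SMuRF's actual code): a label is
# usable if it occurs, case-sensitively, in exactly one sample name.
label_is_unique <- function(label, samples) {
  sum(grepl(label, samples, fixed = TRUE)) == 1
}

# Sample names taken from the #CHROM header lines posted earlier in the thread.
samples <- c("Sample_G1700T_012", "Sample_G1700N_006")

# Try the labels suggested above plus one that matches both samples.
candidates <- c("-T", "tumor", "T_001", "T", "T_012", "Sample")
ok <- sapply(candidates, label_is_unique, samples = samples)
print(ok)
```

Under this assumption only 'T' and 'T_012' pass for these sample names: '-T', 'tumor' and 'T_001' match neither sample, and 'Sample' matches both.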
Please download the latest patch SMuRF-v1.6.2. Thanks!
thanks!!
I will try this