pllittle/UNMASC

INPUT file error

Hi, thanks for creating this incredible tool for users working with unmatched normal data.

Due to my own situation, I couldn't use the tumor_only.sh script you provided for users' convenience.
Instead, I ran Strelka2 + VEP manually to make the input files.

Unfortunately, I am facing several errors when I use the files I created. (I have attached my input files: vcf, bed_centromere_fn, and dict_chrom_fn for UNMASC::run_UNMASC, and target_fn and anno_fn for UNMASC::prep_UNMASC_VCF.)

After many debugging attempts, the current error is:

% ------------------------------- %
% Welcome to the UNMASC workflow! %
% ------------------------------- %
Thu Jun 23 23:38:00 2022: Import image ...
Thu Jun 23 23:38:29 2022: Finding oxoG artifacts ...
Thu Jun 23 23:38:29 2022: Merge strand info ...
Error in [.data.frame(strand, , c("mutID", "Chr", "Position", "Ref", :
undefined columns selected

Would you please look through the input files I attached and see what's causing the problem?
If that is difficult, could you please provide your example input files so that I can compare them with mine?

Thanks in advance

Jiyun

UNMASC.zip

DAT

FILENAME STUDYNUMBER
1 /home/jyhong/Project/BRCA/WES/Strelka2/VCF/cjs.51A.high_SRR15195393_vep_somatic.PASS.vcf SRR15195393
2 /home/jyhong/Project/BRCA/WES/Strelka2/VCF/cjs.51A.high_SRR15195395_vep_somatic.PASS.vcf SRR15195395
3 /home/jyhong/Project/BRCA/WES/Strelka2/VCF/cjs.51A.high_SRR15195397_vep_somatic.PASS.vcf SRR15195397
4 /home/jyhong/Project/BRCA/WES/Strelka2/VCF/cjs.51A.high_SRR15195399_vep_somatic.PASS.vcf SRR15195399
5 /home/jyhong/Project/BRCA/WES/Strelka2/VCF/cjs.51A.high_SRR15195401_vep_somatic.PASS.vcf SRR15195401
6 /home/jyhong/Project/BRCA/WES/Strelka2/VCF/cjs.51A.high_SRR15195404_vep_somatic.PASS.vcf SRR15195404
7 /home/jyhong/Project/BRCA/WES/Strelka2/VCF/cjs.51A.high_SRR15195406_vep_somatic.PASS.vcf SRR15195406
8 /home/jyhong/Project/BRCA/WES/Strelka2/VCF/cjs.51A.high_SRR15195408_vep_somatic.PASS.vcf SRR15195408
9 /home/jyhong/Project/BRCA/WES/Strelka2/VCF/cjs.51A.high_SRR15195410_vep_somatic.PASS.vcf SRR15195410
10 /home/jyhong/Project/BRCA/WES/Strelka2/VCF/cjs.51A.high_SRR15195413_vep_somatic.PASS.vcf SRR15195413

UNMASC::prep_UNMASC_VCF

cjs.51A.high.vcf <- UNMASC::prep_UNMASC_VCF(
  outdir = outdir,
  DAT = DAT,
  anno_fn = anno_fn,
  target_fn = target_fn,
  FILTER = NULL,
  nlines = 100,
  ncores = 1)

UNMASC::run_UNMASC

UNMASC::run_UNMASC(
  tumorID = "cjs.51A.high",
  outdir = outdir,
  vcf = cjs.51A.high.vcf,
  bed_centromere_fn = new_cent,
  tBAM_fn = tBAM,
  dict_chrom_fn = chr_length,
  rd_thres = 10,
  ad_thres = 5,
  qscore_thres = 30,
  minBQ = 13,
  minMQ = 40,
  exac_thres = 0.005,
  gender = NA,
  hg = "38",
  binom = T)

Hi @pigyun906,

Thank you for your interest in UNMASC, and apologies for the delay. I've pushed a commit with a script in which I double-check the first step. Everything looks OK there. One comment: your cjs.51A.high_allvar.vcf file could be more compact if you drop the NORMAL and TUMOR columns as well as the normal-specific information within the INFO column (e.g. DP=154;MQ=60.00;MQ0=0;NT=ref;QSS=206;QSS_NT=3070;ReadPosRankSum=-1.29;SGT=CC->CT;SNVSB=0.00;SOMATIC;TQSS=1;TQSS_NT=1), but the code works perfectly fine as is.

The error you reported could be coming from the run_UNMASC() function. Unfortunately, I can't debug this step without the tumor BAM file. Perhaps the step that extracts strand-specific information per locus was corrupted in image.rds. Take a look at the strand object in image.rds to see whether it is NULL or not a data frame. One solution is to delete image.rds and start fresh by re-running the two UNMASC functions.
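
A minimal sketch of that check, in case it helps. The helper check_strand() is hypothetical and not part of UNMASC. The commented usage assumes image.rds sits in outdir and loads as a list with a strand element, which may not match the actual layout. Only the four column names visible in the truncated error message are tested.

```r
# Hypothetical helper (not part of UNMASC): report why a `strand` object
# would fail the column selection shown in the error message.
check_strand <- function(strand,
                         required = c("mutID", "Chr", "Position", "Ref")) {
  # `required` holds only the column names visible in the truncated error;
  # the real selection may include more.
  if (is.null(strand)) {
    return("strand is NULL")
  }
  if (!is.data.frame(strand)) {
    return(paste0("strand is not a data.frame (class: ", class(strand)[1], ")"))
  }
  missing_cols <- setdiff(required, colnames(strand))
  if (length(missing_cols) > 0) {
    return(paste("missing columns:", paste(missing_cols, collapse = ", ")))
  }
  "ok"
}

# Assumed usage (file location and object layout are guesses):
# img <- readRDS(file.path(outdir, "image.rds"))
# str(img, max.level = 1)   # see what the image actually contains
# check_strand(img$strand)
```

If the result is anything other than "ok", deleting image.rds and re-running both functions is probably the quickest fix.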

Hope this helps!

I re-processed my input files according to your suggestion, and the run completed successfully.

Thank you, pllittle