pllittle/UNMASC

UNMASC and GATK RGs


Hi Paul,

I am having an issue with the UNMASC pipeline. My samples were preprocessed (AGeNT Trimmer, alignment to hg19, BQSR), duplicates were marked with Picard UmiAwareMarkDuplicatesWithMateCigar, and UMI tags were extracted and appended to the read ID (UMI_VARCAL).

The UNMASC workflow terminates with the following error:

NULL
Warning message:
In fread(dict_chrom_fn, sep = "\t", header = FALSE, data.table = FALSE) :
  Stopped early on line 95. Expected 3 fields but found 6. Consider fill=TRUE and comment.char=. First discarded non-empty line: <<@RG ID:@A00620:188:HVMYMDSX2:4:1101:6470:1016 LB:157722_lib PL:ILLUMINA SM:157722 PU:@A00620:188:HVMYMDSX2:4:1101:6470:1016>>

Here is the image.rds: https://drive.google.com/file/d/1_j8o6iaqWje1NE9Chw-keyIjOuYLePwF/view?usp=share_link

It looks to me as though UNMASC does not expect a read group (@RG) line with six tab-separated fields (@RG, ID, LB, PL, SM, PU) in this file. However, these read group tags appear to be the GATK standard and are required by my preprocessing (especially Picard).
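To illustrate the field-count mismatch the warning describes, here is a minimal Python sketch (not part of UNMASC) that reports lines in a tab-separated file whose number of fields differs from the first non-empty line. Taking the first line as the reference is a simplification and may not match how fread infers the column count. The function name `field_count_report` and the header lines in `example` are hypothetical, made up for illustration rather than taken from my actual file.

```python
def field_count_report(lines, sep="\t"):
    """Compare each line's field count with that of the first non-empty line.

    Returns (expected, mismatches), where mismatches is a list of
    (line_number, n_fields, first_field) tuples.
    """
    expected = None
    mismatches = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # skip empty lines
        fields = line.split(sep)
        if expected is None:
            expected = len(fields)  # reference count from the first line
        elif len(fields) != expected:
            mismatches.append((number, len(fields), fields[0]))
    return expected, mismatches


# Hypothetical SAM-style header lines (illustrative only)
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:249250621",
    "@SQ\tSN:chr2\tLN:243199373",
    "@RG\tID:rg1\tLB:lib1\tPL:ILLUMINA\tSM:sample1\tPU:unit1",
]
expected, mismatches = field_count_report(example)
print(expected, mismatches)  # → 3 [(4, 6, '@RG')]
```

The same function accepts an open file handle, for example `with open(path) as fh: field_count_report(fh)`, where `path` points to whatever file is passed as dict_chrom_fn.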

I would greatly appreciate any help with this. Happy to provide any additional information. Thanks in advance.

Best wishes

Christian

Hi @christianbosselmann,

The issue seems to be with the dict_chrom_fn input file. Can you attach that file?

Best,
Paul