oicr-gsi/debarcer

Add option to locate umi within reads

Closed · 1 comment

Debarcer currently assumes that the umi is at the start of the read. This holds for the currently supported library prep, but there should be an option to specify the umi location within reads so that debarcer can be generalized to other library preps.
Add an option in config/library_prep_types.ini to indicate umi position

The new parameter UMI_POS in library_prep_types.ini specifies the umi position within reads (1-based). The umi position is assumed to be the same for all reads in a given fastq, but it can differ among input fastqs with umis. UMI_POS can be either a single position or a comma-separated list of positions. If a single position is given but several fastqs have umis, that position is propagated to all of them. An error is raised if the number of positions in UMI_POS doesn't match the number of umi lengths in UMI_LENS. When multiple values are specified, the entries in UMI_POS and UMI_LENS are assumed to be in the same order.
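
For anyone reading this later, here is a rough sketch of how the rules above could be applied. It is not debarcer's actual code: the function names `resolve_umi_positions` and `extract_umi` are made up for illustration, and it assumes UMI_POS and UMI_LENS arrive as the raw strings read from library_prep_types.ini.

```python
def resolve_umi_positions(umi_pos, umi_lens):
    """Pair each umi position with its umi length (hypothetical helper).

    umi_pos and umi_lens are the raw comma-separated strings taken from
    UMI_POS and UMI_LENS. Returns a list of (position, length) tuples,
    one per fastq with umis, in the order given in the config.
    """
    lens = [int(x) for x in umi_lens.split(",") if x.strip()]
    positions = [int(x) for x in umi_pos.split(",") if x.strip()]

    # A single position is propagated to every fastq that has a umi.
    if len(positions) == 1 and len(lens) > 1:
        positions = positions * len(lens)

    # Otherwise the two lists must have the same number of entries.
    if len(positions) != len(lens):
        raise ValueError(
            "UMI_POS has {0} entries but UMI_LENS has {1}".format(
                len(positions), len(lens)
            )
        )

    # Positions are 1-based, so anything below 1 is invalid.
    if any(p < 1 for p in positions):
        raise ValueError("UMI_POS values are 1-based and must be >= 1")

    return list(zip(positions, lens))


def extract_umi(read_seq, pos, length):
    """Slice the umi out of a read given a 1-based start position."""
    start = pos - 1  # convert 1-based position to 0-based index
    return read_seq[start:start + length]
```

With this sketch, `resolve_umi_positions("1", "3,3")` pairs position 1 with both umi lengths, while `resolve_umi_positions("1,2,3", "3,3")` raises a `ValueError` because the list lengths differ.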