mortazavilab/TALON

Duplicate transcripts - forward and reverse.

fergsc opened this issue · 3 comments

Hi,
We are sequencing a lot of RNA with ONT. The sequencing method we use produces both a forward and a reverse read for many RNA molecules, which results in many duplicate transcripts: one forward and one reverse (see figure). I tried to solve this with minimap2's -u parameter, without success. Is there any way to deal with this within TALON?

Thanks.

-u CHAR	How to find canonical splicing sites GT-AG 
    - f: transcript strand
    - b: both strands
    - n: no attempt to match GT-AG [n]

[Figure: screenshot from 2023-01-30 17-16-32]

The reverse transcripts should be labelled as "Antisense" by TALON, and I don't recommend analyzing them at all. I would filter them out.
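As a rough illustration of the filtering suggested above, here is a minimal Python sketch that drops rows labelled "Antisense" from a tab-separated table. It assumes the table has a novelty column named `transcript_novelty` and an ID column named `annot_transcript_id`, as in a TALON abundance file; check your own header before relying on those names. The helper `drop_antisense` and the tiny inline table are made up for the example and are not part of TALON.

```python
import csv
import io


def drop_antisense(rows, novelty_column="transcript_novelty"):
    """Yield only the rows whose novelty label is not 'Antisense'.

    `novelty_column` is an assumed column name; adjust it to match
    the header of your own file.
    """
    for row in rows:
        if row.get(novelty_column, "").strip().lower() != "antisense":
            yield row


# Tiny made-up table standing in for a real abundance file.
example_tsv = (
    "annot_transcript_id\ttranscript_novelty\tsample1\n"
    "tx1\tKnown\t12\n"
    "tx2\tAntisense\t7\n"
    "tx3\tNIC\t3\n"
)

reader = csv.DictReader(io.StringIO(example_tsv), delimiter="\t")
kept = list(drop_antisense(reader))
kept_ids = [row["annot_transcript_id"] for row in kept]
print(kept_ids)
```

For a real file, replace the `io.StringIO` object with an opened file handle and write the kept rows back out with `csv.DictWriter`.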

Thanks for the recommendation. However, when I checked, antisense transcripts made up ~60% of reads.

I found a better solution for this problem: Pychopper (https://github.com/epi2me-labs/pychopper).

Pychopper v2 is a tool to identify, orient and trim full-length Nanopore cDNA reads. The tool is also able to rescue fused reads.

Ah, yes. TALON expects reads to be oriented in the 5'->3' direction so this should solve that.
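To make "oriented" concrete, here is a minimal Python sketch of the reorienting step only: a read known to come from the reverse strand is reverse-complemented, and its quality string is reversed to match. This is not Pychopper's implementation. Working out which strand a read came from is the hard part that a tool like Pychopper handles, and the sketch simply takes the strand as given. The function names and the `strand` argument are hypothetical.

```python
# Translation table for complementing DNA bases; N stays N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient_read(seq, qual, strand):
    """Return (seq, qual) in forward orientation.

    `strand` is assumed to be '+' or '-' and to have been determined
    elsewhere. Reads on '-' are reverse-complemented, and their
    per-base quality string is reversed so it stays aligned to the bases.
    """
    if strand == "-":
        return reverse_complement(seq), qual[::-1]
    return seq, qual


# Made-up four-base read on the reverse strand.
oriented_seq, oriented_qual = orient_read("AAGC", "!#%&", "-")
print(oriented_seq, oriented_qual)
```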