gpertea/stringtie

One gene outputs multiple TPM values

HengkuanLi opened this issue · 3 comments

When I quantify with StringTie, a single gene produces multiple output lines. Looking at the GTF file, I found that these lines correspond to different transcripts of the gene. How do I solve this?

ENSSSCG00000035639 TERB1 6 - 27450632 27503893 45.755924 12.578593 26.445555
ENSSSCG00000035639 TERB1 6 - 27511466 27540118 30.928654 13.596822 28.586306
ENSSSCG00000003981 ZFP69B 6 - 170660884 170675879 5.751518 1.388807 2.926265
ENSSSCG00000003981 ZFP69B 6 - 170687418 170701255 35.762066 8.635393 18.195074

I have the same problem.

LOC_Os10g31460 - Chr10 - 16494293 16495238 0.850951 0.184613 0.368143
LOC_Os10g31460 - Chr10 - 16486322 16492007 14.433779 6.385049 12.732656

me too

This happens when your annotation file contains genes with non-overlapping transcripts. We consider this to be an annotation error. In this case we suggest completing your downstream analysis by treating the two gene locations as distinct: for example, you could label the ENSSSCG00000035639 (TERB1) entry at 27450632-27503893 as ENSSSCG00000035639_1 and the entry at 27511466-27540118 as ENSSSCG00000035639_2. Alternatively (not preferred), you could just remove the location with the smaller TPM.
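Both suggestions can be applied with a small post-processing script. The following is only a sketch, not part of StringTie: it assumes the rows look like the ones pasted in this thread, with the gene ID in the first column and the TPM in the last column, and it numbers the duplicates in order of appearance. The function names `relabel_duplicate_genes` and `keep_highest_tpm` are made up for illustration.

```python
from collections import Counter, defaultdict


def relabel_duplicate_genes(rows):
    """Append _1, _2, ... to gene IDs that appear on more than one row.

    `rows` is a list of lists of strings; column 0 is assumed to be the gene ID.
    Gene IDs that appear only once are left unchanged.
    """
    totals = Counter(row[0] for row in rows)
    seen = defaultdict(int)
    relabeled = []
    for row in rows:
        gene_id = row[0]
        if totals[gene_id] > 1:
            seen[gene_id] += 1
            gene_id = f"{gene_id}_{seen[gene_id]}"
        relabeled.append([gene_id] + row[1:])
    return relabeled


def keep_highest_tpm(rows, tpm_col=-1):
    """Alternative (not preferred): keep one row per gene ID, the one with the largest TPM.

    The TPM is assumed to be in the last column (tpm_col=-1).
    """
    best = {}
    for row in rows:
        gene_id = row[0]
        if gene_id not in best or float(row[tpm_col]) > float(best[gene_id][tpm_col]):
            best[gene_id] = row
    return list(best.values())


# Rows copied from this thread, split on whitespace.
rows = [
    "ENSSSCG00000035639 TERB1 6 - 27450632 27503893 45.755924 12.578593 26.445555".split(),
    "ENSSSCG00000035639 TERB1 6 - 27511466 27540118 30.928654 13.596822 28.586306".split(),
    "ENSSSCG00000003981 ZFP69B 6 - 170660884 170675879 5.751518 1.388807 2.926265".split(),
    "ENSSSCG00000003981 ZFP69B 6 - 170687418 170701255 35.762066 8.635393 18.195074".split(),
]

for row in relabel_duplicate_genes(rows):
    print("\t".join(row))
```

To process a real file, read it line by line, split each data line on tabs, and skip any header line before passing the rows in. Relabeling keeps every location, so no expression signal is discarded; `keep_highest_tpm` drops the weaker location and should be used only if one row per gene ID is strictly required.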