ratschlab/spladder

Naming of skipped exons in mutually exclusive events is based on exon length, not genomic coordinates


  • spladder version: 3.0.3
  • Python version: 3.10.6
  • Operating System: Ubuntu

Thank you Andre for this very useful tool. I have a question about the naming of the skipped exons in mutually exclusive events, and I'd appreciate your help. The documentation says exon 2 is "first skipped exon (first defined by genomic coordinates)" and exon 3 is "second skipped exon (second defined by genomic coordinates)". However, I see multiple mutually exclusive events where exon 3 comes before exon 2 in genomic coordinates (this happens in genes on either strand, see attached). I tried to find a unifying rule for the naming, and in every event I looked at exon 3 was at least as long as exon 2. This would be consistent with how the other splicing events are named (isoform 2 is always the longer isoform).

I think the splicing events would be easier to interpret if the skipped exons in mutually exclusive events were ordered by genomic coordinates, so that isoforms 1 and 2 can be identified easily; it is not immediately clear which exon is longer in a mutually exclusive event. I can switch the naming quickly myself, but this would affect the PSI calculation too, correct?
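To make the question concrete, here is a minimal sketch of the swap I have in mind. It is not SplAdder code: the function `order_mutex_exons` and the dictionary keys (`e2_start`, `e2_end`, `e3_start`, `e3_end`, `psi`) are hypothetical names chosen for illustration. It also assumes that PSI is the fraction of reads supporting one of the two isoforms, so that swapping the exon labels turns PSI into 1 - PSI. Please correct me if that assumption is wrong.

```python
def order_mutex_exons(event):
    """Return a copy of `event` with exons 2 and 3 ordered by genomic start.

    `event` is a plain dict with hypothetical keys (not SplAdder's format):
    e2_start, e2_end, e3_start, e3_end and, optionally, psi.
    """
    e2 = (event["e2_start"], event["e2_end"])
    e3 = (event["e3_start"], event["e3_end"])

    out = dict(event)
    # Tuples compare by start first, then by end.
    out["swapped"] = e3 < e2

    if out["swapped"]:
        out["e2_start"], out["e2_end"] = e3
        out["e3_start"], out["e3_end"] = e2
        # Assumption: PSI is a two-isoform ratio, so relabeling
        # the exons flips it to its complement.
        if event.get("psi") is not None:
            out["psi"] = 1.0 - event["psi"]

    return out


# Made-up coordinates: exon 3 lies upstream of exon 2 and is longer.
example = {"e2_start": 500, "e2_end": 550,
           "e3_start": 100, "e3_end": 300, "psi": 0.25}
fixed = order_mutex_exons(example)
print(fixed["e2_start"], fixed["e3_start"], fixed["psi"])
```

If PSI is instead computed from something other than the two isoform read counts, flipping it this way would not be valid, which is why I am asking.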

Screenshot 2023-03-21 181126

spladder build --bams sorted6070.bam,sorted6071.bam,sorted6072.bam,sorted6073.bam,sorted6074.bam,sorted6075.bam,sorted6076.bam \
  --outdir ~/rna/myositis//muscle/splicing/ \
  --annotation ~/rna/nw/gencode.v42.primary_assembly.annotation.gtf \
  --parallel 8 --set-mm-tag nM --primary-only --readlen 50