How to understand otherExon
lbwfff opened this issue · 7 comments
Hi Jianhong,
I have some questions about the otherExon category reported by the genomicElementDistribution function. How should I understand this concept, and what kinds of peaks are assigned to otherExon?
Thanks,
LeeLee
Hi LeeLee,
Thank you for trying ChIPpeakAnno to annotate your data, and sorry for the unclear documentation. otherExon is defined as the exons extracted from the TxDb object that do not overlap any 5'UTR, 3'UTR, or CDS. In most cases, they come from single-exon transcripts such as short noncoding RNAs.
Hi Jianhong,
Thanks for your reply. I understand that part now, but I still have some doubts.
For example in my data:
> table(gr1[["peaks"]]$ExonIntron)
 exon
10219
> table(gr1[["peaks"]]$Exons)
      CDS otherExon      utr3      utr5
     4058       583      4921       657
> table(gr1[["peaks"]]$geneLevel)
      geneBody geneDownstream       promoter
          8877            430            912
There are 10219 peaks in my data in total, and all of them are on exons. I expected the number of peaks in geneBody to equal CDS+otherExon+utr3+utr5, but that is not the case. Instead, CDS+otherExon+utr3+utr5 equals geneBody+geneDownstream+promoter, which means that some exonic peaks are assigned to geneDownstream or promoter rather than geneBody. How should I understand this?
Thanks,
LeeLee
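The arithmetic behind this observation can be checked with a few lines of base R. This is only a sketch that re-enters the counts quoted in the table() output above; the variable names are made up for illustration and are not part of ChIPpeakAnno.

```r
# Counts copied by hand from the table() output quoted in this post.
exon_intron <- c(exon = 10219)
exons       <- c(CDS = 4058, otherExon = 583, utr3 = 4921, utr5 = 657)
gene_level  <- c(geneBody = 8877, geneDownstream = 430, promoter = 912)

# Total number of peaks reported at each annotation level.
totals <- c(ExonIntron = sum(exon_intron),
            Exons      = sum(exons),
            geneLevel  = sum(gene_level))
print(totals)

# TRUE when every level accounts for the same number of peaks.
all_equal <- length(unique(totals)) == 1

# Exonic peaks that were not labelled geneBody at the gene level.
not_gene_body <- sum(exons) - gene_level[["geneBody"]]
print(not_gene_body)
```

All three levels sum to the same total, so each level labels every peak exactly once, and the peaks missing from geneBody are exactly the ones counted under geneDownstream and promoter.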
I used the GRCh38 annotation from GENCODE. My guess is that some exons of one gene fall within the geneDownstream or promoter region of another gene. This did not have much impact on my subsequent analysis, so it is not much of an issue.
Two settings affect this annotation. The first is keepExonsInGenesOnly; please try setting it to FALSE and see what happens. The second is the order of the labels, which determines the annotation precedence. Let me know the results. Thank you.
I tried setting keepExonsInGenesOnly to both TRUE and FALSE, but it did not affect the results, and the order of the labels stayed the same. If you need it, I can provide my BED file, which is MeRIP-seq data analyzed with exomePeak2.
cache.txt
Hi,
Sorry, I misunderstood your first post. The total count at the Exons level should equal the exon count at the ExonIntron level. The gene level includes the promoter region, the gene body (exons and introns), and the downstream region. The gene body does not include the regions that overlap the promoter and geneDownstream if you set the geneLevel order to promoter, geneDownstream, and then geneBody. In that case geneBody runs from the TSS plus the downstream value of the promoterRegion parameter to the TES minus the upstream value of the geneDownstream parameter.
Hope this will help.
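The effect of the label order can be illustrated with a toy model in base R. This is a simplified sketch and not ChIPpeakAnno's actual implementation: it uses one-dimensional coordinates on the plus strand only, two hypothetical genes, and window sizes chosen purely for illustration. The helper classify() and all coordinates are assumptions made up for this example.

```r
# Toy 1-D model, plus strand only. NOT the package's implementation.
promoter_region <- c(upstream = 2000, downstream = 100)  # illustrative values
gene_downstream <- c(upstream = 0, downstream = 1000)    # illustrative values

# Two hypothetical genes; gene B starts shortly after gene A ends.
genes <- data.frame(name = c("A", "B"),
                    tss  = c(10000, 20500),
                    tes  = c(20000, 30000))

# Start/end of each gene-level class, one row per gene.
regions <- list(
  promoter       = cbind(genes$tss - promoter_region[["upstream"]],
                         genes$tss + promoter_region[["downstream"]]),
  geneDownstream = cbind(genes$tes - gene_downstream[["upstream"]],
                         genes$tes + gene_downstream[["downstream"]]),
  geneBody       = cbind(genes$tss, genes$tes)
)

# Return the first class, in precedence order, whose window contains pos.
classify <- function(pos, precedence) {
  for (label in precedence) {
    r <- regions[[label]]
    if (any(pos >= r[, 1] & pos <= r[, 2])) return(label)
  }
  "distalIntergenic"
}

# Hypothetical peak positions: near the TSS of A, mid-gene in A,
# and inside B but within 1000 bp downstream of A's TES.
peaks <- c(exon1_A = 10050, mid_A = 15000, exon_B = 20800)

promoter_first <- sapply(peaks, classify,
                         precedence = c("promoter", "geneDownstream", "geneBody"))
body_first     <- sapply(peaks, classify,
                         precedence = c("geneBody", "promoter", "geneDownstream"))
print(promoter_first)
print(body_first)
```

With promoter first, the peak just after the TSS of gene A is labelled promoter, and the peak inside gene B is labelled geneDownstream because it also lies in the downstream window of gene A. With geneBody first, all three peaks are labelled geneBody. In this toy model, that is how an exonic peak can end up outside geneBody at the gene level.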