genomicAnnotationPriority ChIPseeker v1.36.0

Question

HAOXUANmogu opened this issue a year ago · 6 comments

Hi,

I ran into two problems with ChIPseeker recently.

The first is a region priority problem with "genomicAnnotationPriority".

My question is:
when I use genomicAnnotationPriority = c("3UTR", "5UTR", "Promoter", "Exon", "Intron", "Downstream", "Intergenic"), the annotation file shows both 3'UTR and 5'UTR regions;

when I use genomicAnnotationPriority = c("Exon", "Intron", "3UTR", "5UTR", "Promoter", "Downstream", "Intergenic"), the annotation file shows neither 3'UTR nor 5'UTR regions.

The second is a strand problem: "sameStrand = TRUE" does not seem to work.

Here is my code:

library(ChIPseeker)

library(GenomicFeatures)

tair_10 <- makeTxDbFromGFF("TAIR10.release55.gtf")

peak <-readPeakFile("test.tsv")

peakAnno <-annotatePeak(peak, tssRegion=c(-3000,3000),TxDb = tair_10,
                        assignGenomicAnnotation = TRUE,
                        genomicAnnotationPriority = c("3UTR","5UTR","Promoter","Exon", "Intron","Downstream", "Intergenic"),
                        annoDb = NULL,
                        addFlankGeneInfo = FALSE,
                        flankDistance = 5000,
                        sameStrand = TRUE,
                        #ignoreOverlap = FALSE,
                        #ignoreUpstream = FALSE,
                        #ignoreDownstream = FALSE,
                        overlap = "all",
                        verbose = TRUE)

peakAnno_cluster <-as.data.frame(peakAnno)

# View the summary information: positions of the peaks on the genome
peakAnno
plotAnnoPie(peakAnno)

test.tsv.zip
TAIR10.release55.gtf.zip

Answer 1 · 2023-10-23T11:56:15.000Z

Thank you for reaching out!
It seems that there is something wrong with your sample test.tsv file.

Your code fails at peak <-readPeakFile("test.tsv") because the tsv file is not in the expected format.

Answer 2 · 2023-10-23T15:48:50.000Z

OK, I should have moved the first column to the last position. Please try the new file; I have just tried the new format and it works.
testnew.tsv.zip

Answer 3 · 2023-10-24T03:23:28.000Z

Thank you for your feedback!
There is still something wrong with your file, so I corrected it according to my understanding. Please check whether this file still represents your data.
I corrected the format according to the BED file standard (https://genome.ucsc.edu/FAQ/FAQformat.html#format1).

test.bed.txt

It would help if you could tell me more about your file. It seems to be methylation output, but it differs a little from regular methylation output. If it comes from something like methylation sequencing, where each peak is a single base, the file should be like

Since ChIPseeker analyzes data based on the BED file structure, a correctly formatted input that matches your actual needs is important.

Answer 4 · 2023-10-25T19:11:57.000Z

Yes, it is methylation output. This is just a demo of the input file in the form I need to use, not the real output data. You can adjust it to any format you need, and I will adjust my data to match.

Answer 5 · 2023-10-25T22:39:14.000Z

I have tried your bed file. You moved the strand to the sixth column, but it is still not working: the strand still shows "*".

This is the annotated output I got:

anno_test.bed.txt

Answer 6 · 2023-10-31T09:51:18.000Z

Thank you for your feedback!
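Regarding your first question: genomicAnnotationPriority means that each peak receives exactly one annotation, chosen by the order you give. A peak that overlaps both a 5'UTR and an exon is therefore reported as only one of them. Because UTRs are parts of exons, listing "Exon" before "3UTR" and "5UTR" labels those peaks as Exon, so no UTR annotation appears. You can still check the other annotations a peak overlaps in this way: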

peakAnno <-annotatePeak(peak, tssRegion=c(-3000,3000),TxDb = tair_10,
                        assignGenomicAnnotation = TRUE,
                        genomicAnnotationPriority = c("Exon", "Intron", "3UTR", "5UTR", "Promoter", "Downstream", "Intergenic"),
                        annoDb = NULL,
                        addFlankGeneInfo = FALSE,
                        flankDistance = 5000,
                        sameStrand = FALSE,
                        #ignoreOverlap = FALSE,
                        #ignoreUpstream = FALSE,
                        #ignoreDownstream = FALSE,
                        overlap = "all",
                        verbose = TRUE)

detail <- peakAnno@detailGenomicAnnotation
table(detail$fiveUTR)
# Output:
#
# FALSE  TRUE
# 15392  1164
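
To illustrate the priority rule described above, here is a minimal base-R sketch. It is not ChIPseeker's internal code: the overlap table, its column names, and the helper assign_by_priority are hypothetical and only mimic the idea of one logical column per feature.

```r
# Hypothetical overlap table: one row per peak, one logical column per feature.
overlaps <- data.frame(
  Promoter = c(FALSE, TRUE,  FALSE),
  fiveUTR  = c(TRUE,  FALSE, FALSE),
  Exon     = c(TRUE,  FALSE, FALSE),
  Intron   = c(FALSE, FALSE, TRUE)
)

# For each peak, return the first feature in `priority` that the peak overlaps.
# Peaks that overlap none of the listed features fall back to "Intergenic".
assign_by_priority <- function(overlaps, priority) {
  vapply(seq_len(nrow(overlaps)), function(i) {
    hit <- unlist(overlaps[i, priority])
    if (any(hit)) priority[which(hit)[1]] else "Intergenic"
  }, character(1))
}

utr_first  <- assign_by_priority(overlaps, c("fiveUTR", "Exon", "Intron", "Promoter"))
exon_first <- assign_by_priority(overlaps, c("Exon", "Intron", "fiveUTR", "Promoter"))

print(utr_first)
print(exon_first)
```

The first peak overlaps both a 5'UTR and an exon, so its label changes with the order of the priority vector, while the underlying overlap information stays the same.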

As for the strand information, we will update the function in the near future.
In the meantime, you can add the strand information yourself:

# df is the data.frame obtained from bed file
# column x is the column containing strand information
strand(peak) <- df[,x]

Then you can run your analysis with strand information, and sameStrand will work.
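
A minimal sketch of that workaround, assuming the GenomicRanges package is installed. The data frame below is made-up example data standing in for a BED file with the strand in the sixth column; the column names are illustrative, not taken from the attached files.

```r
library(GenomicRanges)

# Hypothetical BED-like table (BED starts are 0-based, ends are exclusive).
df <- data.frame(
  chr    = c("1", "1", "2"),
  start  = c(100L, 500L, 900L),
  end    = c(101L, 501L, 901L),
  name   = c("p1", "p2", "p3"),
  score  = c(0, 0, 0),
  strand = c("+", "-", "+")
)

# Build peaks without strand information; every strand is "*" at this point.
peak <- GRanges(
  seqnames = df$chr,
  ranges   = IRanges(start = df$start + 1L, end = df$end)
)

# Copy the strand column (the sixth column of the table) onto the peaks.
strand(peak) <- df[, 6]

print(table(strand(peak)))
```

With the strand set this way, the peak object can be passed to annotatePeak with sameStrand = TRUE as suggested in the reply above.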