iprada/Circle-Map

questions about quantification


Hi Inigo,
"If there are the strong disagreements between the number of discordant reads and split reads the circular DNA should be handled with care. As an example, if a circular DNA contains tens or hundreds of discordant read pairs supporting it and only 1-5 split reads we suggest the circle is interpreted with care."
I am really new to DNA-seq and I don't quite understand the sentence above. How should I interpret that case? If I want to use DESeq2 for DE analysis, should I combine the discordant read pairs and the split reads as the read count? In my results below, should I delete these two rows? Also, should I remove duplicates with Picard for my analysis? I'm still not sure about that after reading all the other issues. I am really looking forward to your reply.

chromosome   start  end    discordants  split reads  circle score  mean coverage  coverage SD  cov. increase (start)  cov. increase (end)  coverage continuity
NC_000932.1  45761  49181  42           0            0             381.6959064    980.2346833  0.479161497            1                    0.625438596
NC_000932.1  75849  76639  77           0            0             640.4202532    541.9309458  0.444813264            0.988444203          0.34556962

(coverage SD: the standard deviation of the base coverage vector)

best
a student

Dear a student, ;)

I will be happy to help. However, I will need more information:

  • Type of data: protocol, circle enrichment procedure (if any)...
  • Scientific question you want to answer. If you do not want to answer this publicly, feel free to drop me an email.

best,

Iñigo Prada


Actually, I don't think you can use these two eccDNAs for any analysis, because neither of them has any split reads. Circle-Map realigns soft-clipped reads to find the precise junction site, so a circle without split reads has no resolved junction.
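To make that concrete, here is a minimal Python sketch that flags the two rows pasted in the question. It is only an illustration, not part of Circle-Map: the column names, the `flag_circle` helper, and the thresholds are all assumptions. The thresholds are just one reading of the quoted guidance ("tens or hundreds" of discordant pairs taken as 10 or more, "1-5 split reads" taken as 5 or fewer), so adjust them to your data.

```python
# Hypothetical shorthand for the columns in the pasted header (not official names).
COLUMNS = [
    "chrom", "start", "end", "discordants", "split_reads", "circle_score",
    "mean_cov", "sd_cov", "cov_increase_start", "cov_increase_end",
    "cov_continuity",
]

# The two rows from the question, whitespace-separated as pasted.
ROWS = """\
NC_000932.1 45761 49181 42 0 0 381.6959064 980.2346833 0.479161497 1 0.625438596
NC_000932.1 75849 76639 77 0 0 640.4202532 541.9309458 0.444813264 0.988444203 0.34556962
"""


def flag_circle(discordants, split_reads, many_discordants=10, few_split=5):
    """Label a circle call by its read support (thresholds are assumptions)."""
    if split_reads == 0:
        # No split reads: no junction-spanning evidence for this call.
        return "no_split_reads"
    if discordants >= many_discordants and split_reads <= few_split:
        # Many discordant pairs but very few split reads: strong disagreement.
        return "handle_with_care"
    return "ok"


def parse_rows(text):
    """Turn whitespace-separated rows into dicts keyed by COLUMNS."""
    circles = []
    for line in text.splitlines():
        if not line.strip():
            continue
        record = dict(zip(COLUMNS, line.split()))
        record["discordants"] = int(record["discordants"])
        record["split_reads"] = int(record["split_reads"])
        circles.append(record)
    return circles


circles = parse_rows(ROWS)
for circle in circles:
    circle["flag"] = flag_circle(circle["discordants"], circle["split_reads"])
    print(circle["chrom"], circle["start"], circle["end"], circle["flag"])
```

Both rows from the question come out as `no_split_reads`, which is why I would leave them out.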

As for your other question, I don't think you can use split reads and discordant reads for DE analysis, because these are only the reads close to the eccDNA junction site. The eccDNA also contains many other reads that align properly to the genome. In addition, RCA (rolling circle amplification) has its own amplification preferences, so it is not correct to use those counts for DE analysis.