About the directionality of the stoichiometry change

Question

Renown-TAL opened this issue 3 years ago · 7 comments

Hello,
In the step "Estimate modification frequency difference between two samples" of "RNA modification stoichiometry estimation using Tombo resquiggling", you note that "KMEANS does not accurately assign directionality of the stoichiometry change, whereas KNN does". However, when I swap the input files (i.e. run "per_read/get_freq.py -f $ref -b $f.bed -o $f.bed.tsv.gz -1 a.bam -2 b.bam" and then "per_read/get_freq.py -f $ref -b $f.bed -o $f.bed.tsv.gz -1 b.bam -2 a.bam"), the "mod_freq diff knn" values in the output do not change.
Should I use the mismatch error to get the directionality for "mod_freq diff knn", as for "mod_freq diff kmeans"?
Thank you in advance.

Answer 1 · 2021-10-14T07:11:34.000Z

Hi @Renown-TAL - can you pls share some reproducible code example with corresponding result?
With regards to your second question, yes, if you are interested in an RNA modification that causes a basecalling error in the form of mismatch (e.g. pseudouridine), you can use mismatch error to get the directionality.
Thanks!
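
To illustrate the mismatch-based approach, here is a minimal sketch and not nanoRMS code. The function name `signed_mod_freq_diff`, its arguments, and the sign convention (positive means sample 1 is more modified than sample 2) are all assumptions made for this example. It also assumes that a higher mismatch frequency at the site indicates more modification, which is only reasonable for modifications such as pseudouridine that show up as basecalling mismatches.

```python
import math


def signed_mod_freq_diff(abs_diff, mismatch_freq_1, mismatch_freq_2):
    """Attach a direction to an unsigned modification frequency difference.

    Hypothetical helper, not part of nanoRMS.
    abs_diff        -- unsigned difference, e.g. taken from a KMeans column
    mismatch_freq_1 -- mismatch frequency at the site in sample 1
    mismatch_freq_2 -- mismatch frequency at the site in sample 2

    Assumed convention: a positive result means sample 1 is more modified.
    Returns None when the mismatch frequencies are equal, because the
    direction cannot be resolved in that case.
    """
    delta = mismatch_freq_1 - mismatch_freq_2
    if delta == 0:
        return None
    # Keep the magnitude of abs_diff and take the sign from the mismatch delta.
    return math.copysign(abs(abs_diff), delta)


# Illustrative values only, not real data.
print(signed_mod_freq_diff(0.3, 0.40, 0.05))
print(signed_mod_freq_diff(0.3, 0.05, 0.40))
```

With this convention, swapping the two samples flips the sign of the result while the magnitude is unchanged.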

Answer 2 · 2021-10-14T08:27:55.000Z

The code is as follows:

/home/mjy/software/nanoRMS/per_read/get_freq.py -f /DataM/Data/hg38/gencode.v38.transcripts.fa -b ENST00000361624.bed -o h_t.bed.tsv -1 head_bam/total.sort.bam -2 tail_bam/total.sort.bam
/home/mjy/software/nanoRMS/per_read/get_freq.py -f /DataM/Data/hg38/gencode.v38.transcripts.fa -b ENST00000361624.bed -o t_h.bed.tsv -2 head_bam/total.sort.bam -1 tail_bam/total.sort.bam

I checked the output: some "mod_freq diff knn" values change slightly, but I expected every KNN result with "+" to change to "-" after swapping the inputs, because of the directionality.

Answer 3 · 2021-10-14T08:48:16.000Z

> Hi @Renown-TAL - can you pls share some reproducible code example with corresponding result? With regards to your second question, yes, if you are interested in an RNA modification that causes a basecalling error in the form of mismatch (e.g. pseudouridine), you can use mismatch error to get the directionality. Thanks!

I have sent the data and code by email; please check your mailbox. Thanks.

Answer 4 · 2021-10-18T07:17:12.000Z

Besides, I found that line 87 of "/nanoRMS/per_read/get_freq.py" has the condition "if o.mincov<10", but your note says the minimum is 5.

Answer 5 · 2021-10-18T08:21:47.000Z

Hi @Renown-TAL - I am not sure who you sent the data and code to by email. The maintainer of this part of the code is @lpryszcz; he is currently on paternity leave, but he will look at it as soon as he is back. Thank you for your patience.

Answer 6 · 2021-10-18T08:32:00.000Z

> Hi @Renown-TAL - I am not sure who you sent the data and code to by email. The maintainer of this part of the code is @lpryszcz; he is currently on paternity leave, but he will look at it as soon as he is back. Thank you for your patience.

I made a mistake. I have now sent it to "evamaria.novoa@gmail.com"; please check your mailbox. Thanks.

Answer 7 · 2021-11-11T20:00:57.000Z

Hi @Renown-TAL, thanks for sending the data. The limitation of nanoRMS KNN is that it requires both a KO and a WT sample. Is either of your samples a KO (i.e. a fully unmodified sample)?

If not, you can still use nanoRMS, but only in unsupervised mode (KMeans). This clusters reads into unmodified and modified groups, but it cannot tell which cluster is which. In this mode you can therefore only look at the relative difference in modification frequency between two samples; you won't know the directionality of the change. For KMeans, the (absolute) mod_freq difference should be similar regardless of the order of the files: -1 head_bam/total.sort.bam -2 tail_bam/total.sort.bam should give similar results to -2 head_bam/total.sort.bam -1 tail_bam/total.sort.bam.
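
A toy illustration of why the sign is not informative in unsupervised mode, written as a minimal sketch rather than the nanoRMS implementation. The function names, the per-read cluster labels, and the read counts below are made up for this example. The point is that a two-cluster result only says which reads group together; which cluster is called "modified" is an arbitrary choice, and that choice decides the sign of the difference.

```python
def mod_freq(labels, modified_cluster):
    """Fraction of reads assigned to the cluster treated as 'modified'."""
    return sum(1 for label in labels if label == modified_cluster) / len(labels)


def freq_diff(labels_1, labels_2, modified_cluster):
    """Signed difference in modification frequency, sample 1 minus sample 2."""
    return mod_freq(labels_1, modified_cluster) - mod_freq(labels_2, modified_cluster)


# Hypothetical per-read cluster labels (0 or 1) for two samples at one site.
sample_a = [1] * 6 + [0] * 2  # 6 of 8 reads fall in cluster 1
sample_b = [1] * 2 + [0] * 6  # 2 of 8 reads fall in cluster 1

for modified_cluster in (0, 1):
    d_ab = freq_diff(sample_a, sample_b, modified_cluster)  # order: -1 a, -2 b
    d_ba = freq_diff(sample_b, sample_a, modified_cluster)  # order: -1 b, -2 a
    print(modified_cluster, d_ab, d_ba, abs(d_ab) == abs(d_ba))
```

Whichever cluster is treated as modified, and whichever sample is passed first, the absolute difference is the same; only the sign changes. Without a KO sample to anchor the unmodified cluster, that sign carries no biological meaning.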

> Besides, I found that line 87 of "/nanoRMS/per_read/get_freq.py" has the condition "if o.mincov<10", but your note says the minimum is 5.

This has been corrected. I'll publish it with the next push. Thanks!