About the directionality of the stoichiometry change
Renown-TAL opened this issue · 7 comments
Hello,
In the step "Estimate modification frequency difference between two samples" of "RNA modification stoichiometry estimation using Tombo resquiggling", you note that "KMEANS does not accurately assign directionality of the stoichiometry change, whereas KNN does". However, when I swap the input files (i.e. run "per_read/get_freq.py -f $ref -b $f.bed -o $f.bed.tsv.gz -1 a.bam -2 b.bam" and then "per_read/get_freq.py -f $ref -b $f.bed -o $f.bed.tsv.gz -1 b.bam -2 a.bam"), the "mod_freq diff knn" values in the output do not change.
Should I use the mismatch error to determine the directionality for "mod_freq diff knn", as for "mod_freq diff kmeans"?
Thank you in advance.
Hi @Renown-TAL - could you please share a reproducible code example with the corresponding results?
With regard to your second question: yes, if you are interested in an RNA modification that causes a basecalling error in the form of a mismatch (e.g. pseudouridine), you can use the mismatch error to get the directionality.
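As a rough illustration of that idea (not nanoRMS code), the sketch below attaches a sign to an absolute modification frequency difference using per-sample mismatch frequencies at the same position. The function name, its parameters, and the convention that the sign means "sample 1 minus sample 2" are all assumptions made for this example; it also assumes that the sample with the higher mismatch frequency is the more modified one.

```python
import math

def signed_mod_freq_diff(abs_diff, mismatch_1, mismatch_2, min_delta=0.0):
    """Hypothetical helper: give an unsigned mod_freq difference a direction.

    abs_diff   -- absolute modification frequency difference (e.g. from KMeans)
    mismatch_1 -- mismatch frequency at the position in sample 1
    mismatch_2 -- mismatch frequency at the position in sample 2
    min_delta  -- minimum mismatch difference needed to call a direction

    Assumption: the sample with the higher mismatch frequency is the more
    modified one. Returns None when the direction cannot be determined.
    """
    delta = mismatch_1 - mismatch_2
    if abs(delta) <= min_delta:
        return None  # mismatch frequencies too similar to assign a direction
    # Positive result: sample 1 appears more modified than sample 2.
    return math.copysign(abs(abs_diff), delta)

# Made-up values for illustration only.
more_in_sample_1 = signed_mod_freq_diff(0.4, mismatch_1=0.25, mismatch_2=0.05)
more_in_sample_2 = signed_mod_freq_diff(0.4, mismatch_1=0.05, mismatch_2=0.25)
undetermined = signed_mod_freq_diff(0.4, mismatch_1=0.10, mismatch_2=0.10)
```

This only makes sense for modifications that actually produce mismatches; for others the mismatch frequencies carry no directional information.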
Thanks!
I have sent the data and code by email; please check your mailbox. Thanks.
Besides, I noticed that line 87 of "/nanoRMS/per_read/get_freq.py" has the condition "if o.mincov<10", but your documentation says the minimum is 5.
Hi @Renown-TAL - I am not sure to whom you sent the data and code by email. The maintainer of this part of the code is @lpryszcz; please note that he is currently on paternity leave, but he will look at it as soon as he is back. Thank you for your patience.
I made a mistake; I have now sent it to "evamaria.novoa@gmail.com". Please check your mailbox. Thanks.
Hi @Renown-TAL, thanks for sending the data. The limitation of nanoRMS KNN is that it requires both a KO and a WT sample. Is either of your samples a KO (fully unmodified sample)?
If not, you can still use nanoRMS, but only in unsupervised mode (KMeans). This will cluster unmodified and modified reads, but we won't know which cluster is the unmodified one and which is the modified one. Therefore, in this mode you can only look at the relative difference in modification frequency between two samples; you won't know the directionality of the change. For KMeans, the (absolute) mod_freq difference should be similar regardless of the order of the files:
-1 head_bam/total.sort.bam -2 tail_bam/total.sort.bam
should give similar results to
-2 head_bam/total.sort.bam -1 tail_bam/total.sort.bam
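To illustrate why only the absolute difference is meaningful in unsupervised mode, here is a minimal sketch that is independent of nanoRMS. It uses made-up per-read cluster labels (0 or 1) for two samples; the function names and the data are hypothetical. Swapping the sample order, or swapping which cluster is called "modified", flips the sign of the difference but leaves its magnitude unchanged.

```python
def cluster_fraction(labels, cluster=1):
    """Fraction of reads assigned to the given cluster."""
    return sum(1 for label in labels if label == cluster) / len(labels)

def freq_diff(labels_1, labels_2, cluster=1):
    """Difference in the fraction of reads in `cluster` (sample 1 - sample 2)."""
    return cluster_fraction(labels_1, cluster) - cluster_fraction(labels_2, cluster)

# Hypothetical cluster assignments for reads covering one position.
sample_a = [0, 0, 0, 1, 1, 0, 0, 0, 1, 0]  # 3 of 10 reads in cluster 1
sample_b = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0]  # 7 of 10 reads in cluster 1

d_ab = freq_diff(sample_a, sample_b)  # sample order: a, b
d_ba = freq_diff(sample_b, sample_a)  # sample order swapped

# Cluster numbering from an unsupervised method is arbitrary, so the same
# clustering could equally be reported with labels 0 and 1 exchanged.
relabelled_a = [1 - label for label in sample_a]
relabelled_b = [1 - label for label in sample_b]
d_relabelled = freq_diff(relabelled_a, relabelled_b)

print(abs(d_ab), abs(d_ba), abs(d_relabelled))  # same magnitude in all cases
```

Without a fully unmodified (KO) reference, or another signal such as the mismatch error, nothing in the clustering itself says which of the two signs is the correct one.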
Regarding your note that line 87 of "/nanoRMS/per_read/get_freq.py" has the condition "if o.mincov<10" while the documentation says at least 5: this has been corrected. I'll publish it with the next push. Thanks!