methylation CpG model vs all-context model
PanZiwei opened this issue · 3 comments
Hi,
I found there are two models relevant to 5mC: res_dna_r941_min_modbases_5mC_CpG_v001 and res_dna_r941_min_modbases_5mC_v001. So the former is trained on CpG sites only, and the latter is trained on all contexts? Is there any difference in performance between them? Which one is better if I am interested in detecting human CpG methylation?
Currently, I am using Megalodon with res_dna_r941_min_modbases_5mC_v001 for base modification identification. How can I filter out non-CG contexts in the results? Is it possible to provide Megalodon with options such as --alternate-bases CpG or --alternate-bases 5mC, like Tombo?
The res_dna_r941_min_modbases_5mC_v001 all-context model is the recommended model. The res_dna_r941_min_modbases_5mC_CpG_v001 model is still available for use, but it is not listed in the summary table in the README as it is no longer recommended. The newer res_dna_r941_min_modbases_5mC_v001 model performs well in CH contexts and outperforms the CpG model in CG contexts.
For filtering to CG results, megalodon provides the --mod-motif argument (for CG contexts, set it to --mod-motif m CG 0). See further documentation here and on the command line using the megalodon -h command.
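To make the CG filtering concrete, here is a minimal Python sketch of the idea behind a (base, motif, relative position) setting such as m CG 0: find every reference position where the modified base sits inside the motif, on either strand, and keep only calls at those positions. This is not Megalodon's implementation. The function names (revcomp, motif_sites, filter_calls) and the (position, strand, score) call layout are hypothetical, chosen only for illustration, and positions are assumed to be 0-based forward-strand coordinates.

```python
from typing import List, Set, Tuple

# Hypothetical call record: (0-based forward-strand position, strand, score).
Call = Tuple[int, str, float]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def motif_sites(ref: str, motif: str, rel_pos: int) -> Set[Tuple[int, str]]:
    """Find (position, strand) pairs where the base at rel_pos of motif occurs.

    A reverse-strand match is detected as the motif's reverse complement on
    the forward strand; the modified base then sits at the mirrored offset.
    """
    ref = ref.upper()
    motif = motif.upper()
    rc_motif = revcomp(motif)
    rc_offset = len(motif) - 1 - rel_pos
    sites: Set[Tuple[int, str]] = set()
    for start in range(len(ref) - len(motif) + 1):
        window = ref[start:start + len(motif)]
        if window == motif:
            sites.add((start + rel_pos, "+"))
        if window == rc_motif:
            sites.add((start + rc_offset, "-"))
    return sites


def filter_calls(calls: List[Call], ref: str,
                 motif: str = "CG", rel_pos: int = 0) -> List[Call]:
    """Keep only calls whose (position, strand) falls on a motif site."""
    keep = motif_sites(ref, motif, rel_pos)
    return [call for call in calls if (call[0], call[1]) in keep]


# Toy reference with one CG (positions 1-2) and CH-context cytosines elsewhere.
ref = "ACGTCAG"
calls = [(1, "+", 0.9), (4, "+", 0.2), (2, "-", 0.8), (6, "-", 0.1)]
cg_calls = filter_calls(calls, ref)
print(cg_calls)
```

In the toy example, the calls at position 4 on the forward strand and position 6 on the reverse strand are cytosines in CH contexts, so they are dropped. In practice, passing --mod-motif m CG 0 to megalodon does the restriction for you, and a post-hoc filter like this is only a fallback for results that were already produced in all contexts.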
Wait... I think there are typos? res_dna_r941_min_modbases_5mC_v001 should be the all-context model, and it is the recommended model, not res_dna_r941_min_modbases_5mC_CpG_v001?
Yes. You are correct. Apologies. I will amend my previous comment to show the correct models. Good spot!