After basecalling with dorado, nanopolish was unable to recognize methylation information
happier21 opened this issue · 7 comments
Dear all,
In the "Quickstart - calling methylation with nanopolish" section of the nanopolish usage instructions, the guppy basecaller is used to basecall the reads. I replaced guppy with dorado, the basecalling tool ONT now recommends. The rest of the steps remained the same, but far less methylation information was reported than with guppy. Is this because dorado and nanopolish are not compatible?
I examined my process log and found that the number of reads was still high after nanopolish index and minimap2, but dropped significantly at the nanopolish call-methylation step.
Thank you,
ShengquanWang
Is the data new R10 data?
Yes, the data is new R10 data
nanopolish doesn't support R10 data yet. You can try f5c, which is an optimised re-implementation of the index, call-methylation and eventalign modules in nanopolish that also supports R10 and RNA004.
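For readers following this suggestion, here is a minimal sketch of the two f5c steps that replace nanopolish index and nanopolish call-methylation. It assumes the raw signal is available as a BLOW5 file (as far as I know, f5c reads FAST5 or SLOW5/BLOW5 rather than POD5, so POD5 input would need converting first) and that the BAM is sorted and indexed. The function name and the file names in the usage line are hypothetical.

```shell
# Sketch only: wraps the f5c index and call-methylation steps.
# Arguments: reads FASTQ, BLOW5 signal file, sorted+indexed BAM, reference FASTA.
# The function name is hypothetical; the f5c options follow the f5c README.
call_methylation_f5c() (
    reads_fastq=$1
    blow5=$2
    bam=$3
    ref=$4

    # Link read IDs in the FASTQ to their raw signal records.
    f5c index --slow5 "$blow5" "$reads_fastq" &&
    # Call methylation; results are written to standard output as TSV.
    f5c call-methylation -b "$bam" -g "$ref" -r "$reads_fastq" --slow5 "$blow5"
)
```

Usage would look like `call_methylation_f5c test.fastq test.blow5 test.sorted.bam genome.fa > meth.tsv`, with the file names adjusted to your own data.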
Thank you for your help. This method seems to work, but when running f5c call-methylation I found another problem. According to the f5c call-methylation log, the quality of the reads from the dorado basecaller was significantly lower than that of the reads from the guppy basecaller. Why did this happen?
This is the log of f5c call-methylation
I then ran "samtools view -b -q 20 -F 4 test.sorted.bam > test.sorted.q20.mapped.bam" on the dorado basecaller data and counted the number of reads in the resulting BAM file. The result is as follows
I did the same with the guppy basecaller's data. The result is as follows
Why are these numbers so different?
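To compare the two basecallers stage by stage, the counts can be collected directly rather than read off the logs. The sketch below is an assumption about how one might do it, not part of the original workflow: the helper names are hypothetical, the FASTQ counter assumes an uncompressed file with exactly four lines per record, and the second helper applies the same MAPQ and flag filter as the command above using the `-c` (count) option of samtools view.

```shell
# Count records in an uncompressed FASTQ file, assuming 4 lines per record.
count_fastq_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Count alignments that are mapped (-F 4) with MAPQ >= 20 (-q 20).
# Requires samtools; note that secondary and supplementary alignments
# are also counted, exactly as with the filter used in this thread.
count_q20_mapped() {
    samtools view -c -q 20 -F 4 "$1"
}
```

Running `count_fastq_reads test.fastq` and `count_q20_mapped test.sorted.bam` on both the dorado and the guppy outputs shows whether reads are lost before or during alignment.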
Could you please open an issue on the f5c repo? I will answer there.
Which mapper are you using, minimap2? If minimap2 aligns well for guppy but not for dorado, the problem might be on the dorado side. Are you using the correct model?
Thank you very much for your help!
This is the full log of the f5c call-methylation:
This is the dorado basecaller command:
dorado basecaller /share/home/yzwl_hanxs/app/dorado-0.5.3-linux-x64/model/dna_r10.4.1_e8.2_400bps_sup@v4.1.0 ./pod5/ | samtools view -bhS -@ 10 > test.bam
Convert bam to fastq:
samtools fastq -0 test.fastq test.bam
Use minimap2 to align:
minimap2 -a -x map-ont /share/home/yzwl_hanxs/refdata-gex-GRCh38-2020-A/fasta/genome.fa test.fastq | samtools sort -o test.sorted.bam -T test.tmp
This is the log for minimap2:
Could you open an issue with this log at https://github.com/hasindu2008/f5c/issues, as this is more relevant there now?