nanoporetech/dorado

modified basecalling analysis for m6a modification of RNA

Closed this issue · 2 comments

Issue Report

Please describe the issue:

Hello developer

I'm planning to analyze m6A modification of RNA using Dorado modified basecalling and modkit. I generated the data with the SQK-RNA004 kit.

Is the m6A modified basecalling process based on the DRACH motif, or can it detect motifs de novo?


Hi @Seongmin-Jang-1165,

Models are available for all-context and DRACH, depending on the basecall model:

dorado download --list 2> >(grep m6A)

[2024-11-07 09:15:11.150] [info]  - rna004_130bps_sup@v3.0.1_m6A_DRACH@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_hac@v5.0.0_m6A@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_sup@v5.0.0_m6A@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_hac@v5.0.0_m6A_DRACH@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_sup@v5.0.0_m6A_DRACH@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_hac@v5.1.0_inosine_m6A@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_sup@v5.1.0_inosine_m6A@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_hac@v5.1.0_m6A_DRACH@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_sup@v5.1.0_m6A_DRACH@v1

If you are using the v5.1.0 basecall models then the all-context model is inosine_m6A. If you are only interested in m6A, you can use modkit to remove the inosine calls (with --ignore 17596), which should achieve similar results to previous m6A models.
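As a rough sketch of that workflow (not an official recipe), the two steps could be wrapped as shell functions like the ones below. The function names and file paths are hypothetical placeholders. The model-complex string "sup@v5.1.0,inosine_m6A" is an assumption about how the sup v5.1.0 model and its inosine_m6A modified-base model are selected, so check it against your Dorado version. The modkit step uses adjust-mods with the --ignore 17596 option mentioned above.

```shell
#!/usr/bin/env bash
# Sketch only: function names and file paths are hypothetical placeholders.

# Basecall RNA004 reads with the v5.1.0 all-context model, which calls
# both inosine and m6A.
# Assumption: the model complex "sup@v5.1.0,inosine_m6A" resolves to
# rna004_130bps_sup@v5.1.0 plus its inosine_m6A modified-base model.
basecall_all_context() {
    local pod5_dir="$1" out_bam="$2"
    dorado basecaller "sup@v5.1.0,inosine_m6A" "$pod5_dir" > "$out_bam"
}

# Remove the inosine calls (modified base code 17596) so that only the
# m6A calls remain in the output BAM.
strip_inosine() {
    local in_bam="$1" out_bam="$2"
    modkit adjust-mods --ignore 17596 "$in_bam" "$out_bam"
}
```

For example, `basecall_all_context pod5/ calls.bam && strip_inosine calls.bam calls.m6A_only.bam` would produce a BAM carrying only m6A calls. If you only need DRACH-context calls, the m6A_DRACH models listed above avoid the second step.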

@malton-ont

Thanks for the advice! I'll try it.