modified basecalling analysis for m6a modification of RNA

Question

Seongmin-Jang-1165 commented a month ago

Issue Report

Please describe the issue:

Hello developers,

I'm planning to analyze m6A modification of RNA using Dorado modified basecalling and Modkit. I generated the data with the SQK-RNA004 kit.

Is m6A modified basecalling restricted to the DRACH motif, or can it also call m6A de novo, in any sequence context?


Answer 1 · 2024-11-07T09:21:32.000Z

Hi @Seongmin-Jang-1165,

m6A models are available for both all-context and DRACH-only calling, depending on the basecall model version:

dorado download --list 2> >(grep m6A)

[2024-11-07 09:15:11.150] [info]  - rna004_130bps_sup@v3.0.1_m6A_DRACH@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_hac@v5.0.0_m6A@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_sup@v5.0.0_m6A@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_hac@v5.0.0_m6A_DRACH@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_sup@v5.0.0_m6A_DRACH@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_hac@v5.1.0_inosine_m6A@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_sup@v5.1.0_inosine_m6A@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_hac@v5.1.0_m6A_DRACH@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_sup@v5.1.0_m6A_DRACH@v1

If you are using the v5.1.0 basecall models, the all-context model is inosine_m6A. If you are only interested in m6A, you can use modkit to remove the inosine calls (with --ignore 17596, where 17596 is the ChEBI code for inosine), which should give results similar to the previous m6A-only models.
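
As a rough illustration of how to read the listing above, the sketch below labels each m6A model by the context it covers. It assumes only the naming pattern visible in the listing: names ending in `_m6A_DRACH@...` are DRACH-only, and every other m6A model (including `inosine_m6A`) is all-context. The function name `classify_m6a_models` is hypothetical and not part of dorado or modkit.

```shell
# Hypothetical helper (not part of dorado): label each m6A model name
# by the sequence context it covers, based on the naming pattern only.
# Reads raw `dorado download --list` log lines or bare model names on stdin.
classify_m6a_models() {
  # Pull out just the model name from each line.
  grep -o 'rna004[^[:space:]]*m6A[^[:space:]]*' | while read -r model; do
    case "$model" in
      *_m6A_DRACH@*) context="DRACH" ;;        # motif-restricted model
      *)             context="all-context" ;;  # m6A@v1 and inosine_m6A@v1
    esac
    printf '%s\t%s\n' "$model" "$context"
  done
}

# Example input: three lines copied from the listing above.
classify_m6a_models <<'EOF'
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_sup@v5.0.0_m6A@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_sup@v5.1.0_inosine_m6A@v1
[2024-11-07 09:15:11.150] [info]  - rna004_130bps_sup@v5.1.0_m6A_DRACH@v1
EOF
```

On a machine with dorado installed, the same function could be fed the real listing, for example with `dorado download --list 2>&1 | classify_m6a_models`, since the listing is written to stderr.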

Answer 2 · 2024-11-08T05:44:55.000Z

@malton-ont

Thanks for the advice! I'll try it.