Research release basecalling models and configurations


Rerio

Rerio consists of "research release" basecalling models and configuration files.

The research models provide cutting-edge functions, speeds and accuracies that have not been productionised or validated by Oxford Nanopore Technologies. Nevertheless, models and config files can be run by using the instructions available in this repository.

Models are provided for DNA and RNA, for various pore types, and for basecalling different modified bases in a variety of contexts.

Features

  • New and advanced research models that are at the forefront of nanopore sequencing analyses (e.g. highest accuracies, quickest speeds, more modified basecalls in more contexts)

Note: The results generated by research basecalling models have been neither scrutinized nor validated, and Oxford Nanopore cannot support each use case (see the Research Release disclaimer).


Getting started

Follow the instructions below to use research models with the Dorado executable. See the Dorado README for further details on running Dorado.

Dependencies

You will require:

Installation

Rerio can be downloaded by cloning from GitHub:

git clone https://github.com/nanoporetech/rerio

Once Rerio has been downloaded, models can be downloaded via the download_model.py script.

# Download all models
rerio/download_model.py
# Download specific model(s)
rerio/download_model.py rerio/dorado_models/res_dna_r10.4.1_e8.2_400bps_sup@v4.0.1_url

Dorado Models

Config | DNA/RNA | Chemistry | Device | Tested Dorado | Notes
res_dna_r10.4.1_e8.2_400bps_sup@2023-09-22_bacterial-methylation | DNA | R10.4.1 E8.2 | All | v0.3.4 | Kit 14 5kHz Research model with an increased range of bacterial methylation motifs
res_dna_r10.4.1_e8.2_400bps_sup@v4.0.1 | DNA | R10.4.1 E8.2 | All | v0.3.0 | Kit 14 4kHz Model Compatible with All-context Modified Bases
res_dna_r10.4.1_e8.2_400bps_sup@v4.0.1_5mC@v2 | DNA | R10.4.1 E8.2 | All | v0.3.0 | Kit 14 4kHz 5mC All-context Modified Base Model
res_dna_r10.4.1_e8.2_400bps_sup@v4.0.1_6mA@v2 | DNA | R10.4.1 E8.2 | All | v0.3.0 | Kit 14 4kHz 6mA All-context Modified Base Model
res_dna_r10.4.1_e8.2_400bps_sup@v4.3.0_4mC_5mC@v1 | DNA | R10.4.1 E8.2 | All | v0.4.3 | Kit 14 5kHz 4mC+5mC All-context Modified Base Model

#  Download all models
python3 download_model.py --dorado
#  Download particular model
python3 download_model.py --dorado dorado_models/res_dna_r10.4.1_e8.2_400bps_sup@v4.3.0_4mC_5mC@v1_url

Each model will be downloaded to dorado_models/{config}.
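
The relationship between a `_url` file and its download directory can be sketched in Python. The helper name `expected_model_dir` is hypothetical and not part of Rerio. It assumes, based only on the example paths above, that the config name is the URL file name with the trailing `_url` removed.

```python
from pathlib import PurePosixPath


def expected_model_dir(url_file: str) -> str:
    """Return the directory a model is expected to be downloaded to.

    Assumption (not confirmed by Rerio itself): the config name is the
    URL file name without its trailing "_url".
    """
    path = PurePosixPath(url_file)
    if not path.name.endswith("_url"):
        raise ValueError(f"not a model URL file: {url_file}")
    # Swap the file name for the same name minus the "_url" suffix.
    return str(path.with_name(path.name[: -len("_url")]))


print(expected_model_dir(
    "dorado_models/res_dna_r10.4.1_e8.2_400bps_sup@v4.3.0_4mC_5mC@v1_url"
))
```

Checking that this directory exists after running download_model.py is a quick way to confirm that a download completed.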

Basecalling models can be supplied directly to Dorado. Please refer to the Dorado README for more information on how to run basecalling and use modified base models.
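
As a rough illustration of supplying a downloaded model to Dorado, the sketch below assembles a `dorado basecaller` command line in Python. The function name `dorado_command` and the input directory `pod5_dir/` are placeholders, not part of Rerio or Dorado. The `--modified-bases-models` option is taken from the Dorado documentation at the time of writing; check `dorado basecaller --help` for the version you have installed.

```python
import shlex
from typing import List, Sequence


def dorado_command(model_dir: str, reads: str,
                   modified_base_models: Sequence[str] = ()) -> List[str]:
    """Build an argument list for `dorado basecaller` (hypothetical helper)."""
    cmd = ["dorado", "basecaller", model_dir, reads]
    if modified_base_models:
        # Dorado accepts a comma-separated list of modified base model paths.
        cmd += ["--modified-bases-models", ",".join(modified_base_models)]
    return cmd


cmd = dorado_command(
    "dorado_models/res_dna_r10.4.1_e8.2_400bps_sup@v4.0.1",
    "pod5_dir/",  # placeholder path to the input reads
    ["dorado_models/res_dna_r10.4.1_e8.2_400bps_sup@v4.0.1_5mC@v2"],
)
print(shlex.join(cmd))
```

Dorado writes its output to standard output, so the printed command would normally be run with a redirect such as `> calls.bam`.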


Clair3 Models

Clair3 models for the following configurations are available:

Latest:

Config | Chemistry | Dorado basecaller model
r1041_e82_400bps_sup_v500 | R10.4.1 E8.2 (5kHz) | v5.0.0 SUP
r1041_e82_400bps_hac_v500 | R10.4.1 E8.2 (5kHz) | v5.0.0 HAC
r1041_e82_400bps_sup_v410 | R10.4.1 E8.2 (4kHz) | v4.1.0 SUP
r1041_e82_400bps_hac_v410 | R10.4.1 E8.2 (4kHz) | v4.1.0 HAC

Deprecated:

Config | Chemistry | Dorado basecaller model | Guppy basecaller
r1041_e82_400bps_sup_v430 | R10.4.1 E8.2 (5kHz) | v4.3.0 SUP | -
r1041_e82_400bps_hac_v430 | R10.4.1 E8.2 (5kHz) | v4.3.0 HAC | -
r1041_e82_400bps_sup_v420 | R10.4.1 E8.2 (5kHz) | v4.2.0 SUP | -
r1041_e82_400bps_hac_v420 | R10.4.1 E8.2 (5kHz) | v4.2.0 HAC | -
r1041_e82_260bps_sup_v400 | R10.4.1 E8.2 | v4.0.0 SUP | -
r1041_e82_260bps_hac_v400 | R10.4.1 E8.2 | v4.0.0 HAC | -
r1041_e82_260bps_fast_g632 | R10.4.1 E8.2 | v3.5.2 FAST | v6.3.2 FAST
r1041_e82_400bps_sup_g615 | R10.4.1 E8.2 | v3.5.2 SUP | v6.1.5 SUP
r1041_e82_400bps_hac_g632 | R10.4.1 E8.2 | v3.5.2 HAC | v6.3.2 HAC
r1041_e82_400bps_hac_g615 | R10.4.1 E8.2 | - | v6.1.5 HAC
r1041_e82_400bps_fast_g615 | R10.4.1 E8.2 | - | v6.1.5 FAST
r1041_e82_260bps_sup_g632 | R10.4.1 E8.2 | v3.5.2 SUP | v6.3.2 SUP
r1041_e82_260bps_hac_g632 | R10.4.1 E8.2 | v3.5.2 HAC | v6.3.2 HAC
r1041_e82_400bps_fast_g632 | R10.4.1 E8.2 | v3.5.2 FAST | -
r104_e81_sup_g5015 | R10.4 E8.1 | - | v5.0.15 SUP
r104_e81_hac_g5015 | R10.4 E8.1 | - | v5.0.15 HAC

#  Download all models
python3 download_model.py --clair3
#  Download particular model
python3 download_model.py --clair3 clair3_models/{config}_model

Each model will be downloaded to the folder clair3_models/{config}.
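
Choosing a Clair3 config for a given basecaller model can be expressed as a small lookup. The sketch below covers only the four "Latest" configurations listed in this README. The names `LATEST_CLAIR3_CONFIGS` and `clair3_paths` are illustrative and not part of Rerio or Clair3.

```python
from typing import Tuple

# (Dorado basecaller model version, mode) -> Clair3 config,
# copied from the "Latest" list in this README.
LATEST_CLAIR3_CONFIGS = {
    ("v5.0.0", "SUP"): "r1041_e82_400bps_sup_v500",
    ("v5.0.0", "HAC"): "r1041_e82_400bps_hac_v500",
    ("v4.1.0", "SUP"): "r1041_e82_400bps_sup_v410",
    ("v4.1.0", "HAC"): "r1041_e82_400bps_hac_v410",
}


def clair3_paths(basecaller_version: str, mode: str) -> Tuple[str, str]:
    """Return (argument for download_model.py, expected download folder)."""
    config = LATEST_CLAIR3_CONFIGS[(basecaller_version, mode.upper())]
    return f"clair3_models/{config}_model", f"clair3_models/{config}"


download_arg, model_dir = clair3_paths("v5.0.0", "sup")
print("python3 download_model.py --clair3", download_arg)
print("model folder:", model_dir)
```

A `KeyError` from the lookup means the combination is not among the latest configurations and the deprecated list should be checked instead.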


Remora Models

Most Remora models are supplied along with the Remora repository, but models with less validation intended for research purposes will be released in Rerio.

Config | DNA/RNA | Pore | Device | Tested Guppy | Notes
5mC_all_context_sup_r1041_e82 | DNA | R10.4.1 | Any | v6.1.2 | 5mC in all contexts (with SUP basecaller)

#  Download all models
python3 download_model.py --remora
#  Download particular model
python3 download_model.py --remora remora_models/5mC_all_context_sup_r1041_e82

Each model will be downloaded to remora_models/{config}.pt (or remora_models/{config}.onnx for Remora version <2.0).
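
The version-dependent file extension can be captured in a few lines of Python. This is a minimal sketch: `remora_model_path` is a hypothetical helper, and it assumes the Remora version is given as a dotted string such as "2.1.3", optionally prefixed with "v".

```python
def remora_model_path(config: str, remora_version: str) -> str:
    """Return the expected path of a downloaded Remora model.

    Per this README, Remora versions below 2.0 use .onnx files and
    later versions use .pt files.
    """
    major = int(remora_version.lstrip("v").split(".")[0])
    extension = "pt" if major >= 2 else "onnx"
    return f"remora_models/{config}.{extension}"


print(remora_model_path("5mC_all_context_sup_r1041_e82", "2.1.3"))
print(remora_model_path("5mC_all_context_sup_r1041_e82", "1.1.1"))
```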

These models can be supplied directly to Bonito via the --modified-base-model argument.


Guppy models

This section contains research release Guppy-compatible models. See the Nanopore Community page for download and install instructions. Since research models often utilise new features, the latest version of Guppy may be required.

Config | DNA/RNA | Pore | Device | Tested Guppy | Notes
res_dna_r9.4.1_e8.1_{fast,hac,sup}_v033.cfg | DNA | R9.4.1 | All | v5.0.11 | Kit 12 E8.1 CRF Models
res_dna_r941_min_crf_v032.cfg | DNA | R9.4.1 | MinION/GridION | v4.4.0 | Bonito CRF
res_dna_r103_min_crf_v032.cfg | DNA | R10.3 | MinION/GridION | v4.4.0 | Bonito CRF
res_dna_r103_q20ea_crf_v033.cfg | DNA | R10.3 | PromethION | v5.0.11 | Q20 early access CRF
res_dna_r103_q20ea_crf_v034.cfg | DNA | R10.3 | PromethION | v5.0.11 | Q20 early access CRF
res_dna_r941_min_flipflop_v001.cfg | DNA | R9.4.1 | MinION/GridION | v3.5.1 | -
res_dna_r941_min_dUfast_v001.cfg | DNA | R9.4.1 | MinION/GridION | v3.5.1 | Calls dU as dT (fast)
res_dna_r941_min_dUhac_v001.cfg | DNA | R9.4.1 | MinION/GridION | v3.5.1 | Calls dU as dT (high acc.)
res_dna_r941_min_rle_v001.cfg | DNA | R9.4.1 | MinION/GridION | v3.5.1 | -
res_dna_r103_min_flipflop_v001.cfg | DNA | R10.3 | MinION/GridION | v3.5.1 | -
res_dna_r103_prom_rle_v001.cfg | DNA | R10.3 | PromethION | v3.5.1 | -
res_rna2_r941_min_flipflop_v001.cfg | RNA2 | R9.4.1 | MinION/GridION | v3.5.1 | -

Barcoding Support

The Rerio GitHub code repository includes a minimal barcoding stub to allow Guppy to run successfully. To enable full Guppy barcoding capabilities, all barcoding files must be copied from the Guppy data directory to the Rerio data directory.

cp ont-guppy/data/barcoding/* rerio/basecall_models/barcoding/
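
Where a shell is not available, the same copy can be sketched in Python. The function `copy_barcoding_files` is a hypothetical helper, and the directory layout is assumed from the `cp` command above. Unlike the shell glob, this version also copies hidden files and creates the target folder if it is missing. Like plain `cp`, it skips subdirectories.

```python
import shutil
from pathlib import Path
from typing import List


def copy_barcoding_files(guppy_data: Path, rerio_data: Path) -> List[str]:
    """Copy regular files from <guppy_data>/barcoding to <rerio_data>/barcoding."""
    source = Path(guppy_data) / "barcoding"
    target = Path(rerio_data) / "barcoding"
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in sorted(source.iterdir()):
        if item.is_file():  # subdirectories are skipped
            shutil.copy2(item, target / item.name)
            copied.append(item.name)
    return copied


# Example call, using the paths from the cp command above:
# copy_barcoding_files(Path("ont-guppy/data"), Path("rerio/basecall_models"))
```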

Taiyaki Models

Taiyaki checkpoint files corresponding to Rerio research models are provided. Not all of these are compatible with the public release of Taiyaki.

#  Download all models
python3 download_model.py --checkpoints
#  Download particular model
python3 download_model.py --checkpoints taiyaki_checkpoint/model

Licence and Copyright

© 2020-2023 Oxford Nanopore Technologies Ltd.

Rerio is distributed under the terms of the Oxford Nanopore Technologies, Ltd. Public License, v. 1.0. If a copy of the License was not distributed with this file, You can obtain one at http://nanoporetech.com

Research Release

Research releases are provided as technology demonstrators to provide early access to features or stimulate Community development of tools. Support for this software will be minimal and is only provided directly by the developers. Feature requests, improvements, and discussions are welcome and can be implemented by forking and pull requests. However, much as we would like to rectify every issue and piece of feedback users may have, the developers may have limited resources to support this software. Research releases may be unstable and subject to rapid iteration by Oxford Nanopore Technologies.