nanoporetech/rerio

Using rerio with megalodon

Naish-M opened this issue · 7 comments

Hi Marcus,

I am trying to use the res_dna_r941_min_modbases-all-context_v001.cfg file with megalodon (v2.1), but it fails with the following error in guppy_log.err:
Error parsing config file: 'the options configuration file contains an invalid line 'https://nanoporetech.box.com/shared/static/5siqked80fcxchtdq7m45odhsw1580o1.tgz''.

Any ideas?

Thanks, Matt

Full command to run megalodon:
megalodon ~/rds/hpc-work/NanoporeFast5/ --outputs mods --reference ~/Athaliana_ONT_RaGOO_v01.fa --devices cuda:0 cuda:1 cuda:2 cuda:3 --processes 24 --overwrite --guppy-params "-d ./rerio/basecall_models/" --guppy-config res_dna_r941_min_modbases-all-context_v001.cfg

Did the download of this model from Rerio complete successfully? When the repository is cloned, the files listed in its basecall_models directory are only stubs. Each stub contains a single web address for the model download (such as the one in this error). The full config must be downloaded per the instructions in Rerio.

Yes, I downloaded all the models with rerio/download_model.py with no errors and then tried to launch megalodon using the command above, but I still get the Guppy initialization error shown above.

What do the contents of ./rerio/basecall_models/res_dna_r941_min_modbases-all-context_v001.cfg look like? Basically, is it a single line with the URL, or does it look like a Guppy config file?
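If it helps, here is a minimal sketch of that check in Python. It is not part of Rerio or megalodon; the function name looks_like_stub is made up for illustration, and it assumes a stub is a file whose only non-blank line is an http(s) URL, as described above.

```python
from pathlib import Path


def looks_like_stub(cfg_path):
    """Return True if the file holds only a single URL line.

    Assumption: an undownloaded Rerio stub contains nothing but one
    http(s) address, whereas a real Guppy config has many lines.
    """
    lines = [
        line.strip()
        for line in Path(cfg_path).read_text().splitlines()
        if line.strip()  # ignore blank lines
    ]
    return len(lines) == 1 and lines[0].startswith(("http://", "https://"))


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python3 check_stub.py path/to/model.cfg
    for path in sys.argv[1:]:
        status = "stub (URL only)" if looks_like_stub(path) else "looks like a config"
        print(f"{path}: {status}")
```

If this reports a stub for the .cfg file, the model download needs to be run again before launching megalodon.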

It is just a single line with the URL; it doesn't look like a usual config file.

That sounds like the contents of the stub file ./rerio/basecall_models/res_dna_r941_min_modbases-all-context_v001 (no suffix) have somehow been copied to the configuration file. Could you run python3 download_model.py basecall_models/res_dna_r941_min_modbases-all-context_v001 again, please?

That is odd. I've just tested the model download and it appears to proceed as expected:

mstoiber-mac1:10:26:43:~/Documents/.../public_rerio$ git clone https://github.com/nanoporetech/rerio
Cloning into 'rerio'...
remote: Enumerating objects: 149, done.
remote: Counting objects: 100% (149/149), done.
remote: Compressing objects: 100% (87/87), done.
remote: Total 149 (delta 70), reused 134 (delta 58), pack-reused 0
Receiving objects: 100% (149/149), 61.71 KiB | 726.00 KiB/s, done.
Resolving deltas: 100% (70/70), done.
mstoiber-mac1:10:27:17:~/Documents/.../public_rerio$ ./rerio/download_model.py ./rerio/basecall_models/res_dna_r941_min_modbases-all-context_v001 
Models to download
  1: res_dna_r941_min_modbases-all-context_v001
Downloading res_dna_r941_min_modbases-all-context_v001
mstoiber-mac1:10:28:05:~/Documents/.../public_rerio$ cat rerio/basecall_models/res_dna_r941_min_modbases-all-context_v001.cfg 
# Basic configuration file for ONT Guppy basecaller software.

# Data trimming.
trim_strategy                       = dna
trim_threshold                      = 2.5
trim_min_events                     = 3

# Basecalling.
model_file                          = res_dna_r941_min_modbases-all-context_v001.jsn
chunk_size                          = 1000
gpu_runners_per_device              = 4
chunks_per_runner                   = 512
chunks_per_caller                   = 10000
overlap                             = 50
qscore_offset                       = 0
qscore_scale                        = 1.0
builtin_scripts                     = 1

# Calibration strand detection
calib_reference                     = lambda_3.6kb.fasta
calib_min_sequence_length           = 3000
calib_max_sequence_length           = 3800
calib_min_coverage                  = 0.6

# Output.
records_per_fastq                   = 4000
min_qscore                          = 7.0

# Telemetry
ping_url                            = https://ping.oxfordnanoportal.com/basecall
ping_segment_duration               = 60

Could you try the download again?
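For anyone who wants to sanity-check a download before launching megalodon, here is a rough sketch that reads a config in the "key = value" layout shown above and confirms that the file named by model_file exists. The helper names read_guppy_cfg and model_file_present are made up for illustration and are not part of Guppy, Rerio, or megalodon. The sketch assumes the model file sits in the same directory as the .cfg, and that any non-comment line without "=" is invalid, which is only a guess at how Guppy itself parses these files.

```python
from pathlib import Path


def read_guppy_cfg(cfg_path):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in Path(cfg_path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            # A leftover stub (a bare URL) ends up here.
            raise ValueError(f"invalid line in {cfg_path}: {line!r}")
        settings[key.strip()] = value.strip()
    return settings


def model_file_present(cfg_path):
    """Return True if the config's model_file exists next to the config.

    Assumption: the model file lives in the same directory as the .cfg.
    """
    cfg_path = Path(cfg_path)
    model_name = read_guppy_cfg(cfg_path).get("model_file")
    return model_name is not None and (cfg_path.parent / model_name).is_file()
```

Calling model_file_present on the .cfg path raises ValueError for a leftover stub and returns False if the .jsn file is missing from the directory.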

Running:
python3 download_model.py basecall_models/res_dna_r941_min_modbases-all-context_v001
seems to have solved it. I now have the full .cfg.
Thanks for your help.