nanoporetech/dorado

dorado correct outputting empty fasta file

Closed this issue · 2 comments

Issue Report

I'm running the following command after basecalling and demultiplexing with dorado:

$ dorado correct read demultip/T4-1.fastq > error-corrected/T4.fasta

I get an empty FASTA file. Running with --to-paf and --devopts gives the following error:

$ /home/amoha30/dorado/bin/dorado correct read demultip/T4-1.fastq --to-paf --devopts > error-corrected/T4.paf
[2024-10-17 09:16:03.985] [info] Running: "correct" "read" "demultip/T4-1.fastq" "--to-paf" "--devopts"
[2024-10-17 09:16:03.985] [error] Zero positional arguments expected, did you mean --devopts VAR
Usage: dorado correct [--help] [--threads VAR] [--infer-threads VAR] [--device VAR] [--verbose]... [--model-path VAR] [--from-paf VAR] [--to-paf] [--resume-from VAR] [--batch-size VAR] [--index-size VAR] reads

Dorado read correction tool

Positional arguments:
  reads             Path to a file with reads to correct in FASTQ format. 

Optional arguments:
  -h, --help        shows help message and exits 
  -t, --threads     Number of threads for processing. Default uses all available threads. [nargs=0..1] [default: 0]
  --infer-threads   Number of threads per device. [nargs=0..1] [default: 2]
  -x, --device      Specify CPU or GPU device: 'auto', 'cpu', 'cuda:all' or 'cuda:<device_id>[,<device_id>...]'. Specifying 'auto' will choose either 'cpu', 'metal' or 'cuda:all' depending on the presence of a GPU device. [nargs=0..1] [default: "auto"]
  -v, --verbose     [may be repeated]

Input/output arguments (detailed usage):
  -m, --model-path  Path to correction model folder. 
  -p, --from-paf    Path to a PAF file with alignments. Skips alignment computation. 
  --to-paf          Generate PAF alignments and skip consensus. 
  --resume-from     Resume a previously interrupted run. Requires a path to a file where sequence headers are stored in the first column (whitespace delimited), one per row. The header can also occupy a full row with no other columns. For example, a .fai index generated from the previously corrected output FASTA file is a valid input here. [nargs=0..1] [default: ""]

Advanced arguments (detailed usage):
  -b, --batch-size  Batch size for inference. Default: 0 for auto batch size detection. [nargs=0..1] [default: 0]
  -i, --index-size  Size of index for mapping and alignment. Default 8G. Decrease index size to lower memory footprint. [nargs=0..1] [default: "8G"]

Someone please help.

Run environment:

  • Dorado version: 0.8.1+c3a2952
  • Dorado command:
  • Operating system: linux
  • Hardware (CPUs, Memory, GPUs): CPU 64, Memory 512GB, GPU 160GB

Thanks

Don't use the --devopts option. Here is my example for the CPU part:

dorado correct -t $task.cpus -m /data/models/herro-v1 $fastq --to-paf > ${prefix2}_overlaps.paf
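For reference, the help text above describes --to-paf ("generate PAF alignments and skip consensus") and --from-paf ("skips alignment computation"), so the two stages can be chained. This is only a minimal sketch: the paths are the reporter's, the variable names are made up for illustration, and the script only prints the commands when dorado or the input file is not present.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Paths taken from the report; the variable names are illustrative only.
reads="demultip/T4-1.fastq"
paf="error-corrected/T4.paf"
fasta="error-corrected/T4.fasta"

# Stage 1: compute alignments only (--to-paf skips consensus).
stage1=(dorado correct "$reads" --to-paf)
# Stage 2: reuse the PAF and run consensus only (--from-paf skips alignment).
stage2=(dorado correct "$reads" --from-paf "$paf")

if command -v dorado >/dev/null 2>&1 && [ -s "$reads" ]; then
    mkdir -p "$(dirname "$paf")"
    "${stage1[@]}" > "$paf"
    "${stage2[@]}" > "$fasta"
else
    # Dry run: show what would be executed.
    echo "Would run: ${stage1[*]} > $paf"
    echo "Would run: ${stage2[*]} > $fasta"
fi
```

Each stage passes exactly one positional argument (the FASTQ path), which matches the usage line printed in the error output.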

Closing as user error - invalid parameters supplied, which dorado has correctly reported.
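For anyone landing here with the same error: the usage line lists a single positional argument (reads), while the reported command passes two tokens (read and the FASTQ path) plus --devopts. A rough sketch of the single-step command, assuming the literal word "read" was not meant to be a file name, would be something like this. The paths are the reporter's, and the guard only prints the command when dorado or the input is missing.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Paths taken from the report; adjust to your own layout.
reads="demultip/T4-1.fastq"
out="error-corrected/T4.fasta"

# One positional argument only: the FASTQ file. No "read" token, no --devopts.
cmd=(dorado correct "$reads")

if command -v dorado >/dev/null 2>&1 && [ -s "$reads" ]; then
    mkdir -p "$(dirname "$out")"
    "${cmd[@]}" > "$out"
    # An empty output file usually means the run failed; check stderr for the reason.
    [ -s "$out" ] || echo "Warning: $out is empty" >&2
else
    # Dry run: show what would be executed.
    echo "Would run: ${cmd[*]} > $out"
fi
```

Because stdout is redirected into the FASTA file, the output file is created even when dorado exits with an argument error, which is why the file in the report was empty.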