Error with canu assembly

Question

Opened this issue 3 years ago · 0 comments

Hi,

I am relatively new to Nextflow. I get an error when I try to run the Canu assembler through the hybrid-assembly pipeline with the following command. How can I resolve this error?

$ time nextflow run kevinmenden/hybrid-assembly --shortReads '/data01/nextflow_test/1_Raw_data/Illumina/N16005/*_L2_{1,2}.fq.gz' --longReads '/data01/nextflow_test/1_Raw_data/Nanopore/N16005_Nanopore.fastq' --assembler canu --genomeSize 2.8m -profile docker
N E X T F L O W  ~  version 21.04.3
Launching `kevinmenden/hybrid-assembly` [nostalgic_wright] - revision: c2aef5f047 [master]
=========================================
 hybrid-assembly v0.3.2dev
=========================================
WARN: Access to undefined parameter `max_memory` -- Initialise it to a default value eg. `params.max_memory = some_value`
WARN: Access to undefined parameter `max_cpus` -- Initialise it to a default value eg. `params.max_cpus = some_value`
WARN: Access to undefined parameter `max_time` -- Initialise it to a default value eg. `params.max_time = some_value`
Run Name       : nostalgic_wright
Short Reads    : /data01/nextflow_test/1_Raw_data/Illumina/N16005/*_L2_{1,2}.fq.gz
Long Reads     : /data01/nextflow_test/1_Raw_data/Nanopore/N16005_Nanopore.fastq
Fasta Ref      : false
Max Memory     : null
Max CPUs       : null
Max Time       : null
Output dir     : ./results
Working dir    : /data01/nextflow_test/1_Raw_data/work
Container      : kevinmenden/hybrid-assembly:latest
Pipeline Release: master
Current home   : /home/prakki
Current user   : prakki
Current path   : /data01/nextflow_test/1_Raw_data
Script dir     : /home/prakki/.nextflow/assets/kevinmenden/hybrid-assembly
Config Profile : docker
=========================================
executor >  local (3)
[88/c0b23d] process > get_software_versions                      [  0%] 0 of 1
[4c/49b783] process > fastqc (N16005_DDMS210004243-1a_HFMWLDSX2) [  0%] 0 of 1
[25/884a7c] process > canu (N16005_Nanopore)                     [  0%] 0 of 1
[-        ] process > minimap                                    -
[-        ] process > pilon                                      -
[-        ] process > quast_canu                                 -
[-        ] process > multiqc                                    -
Error executing process > 'canu (N16005_Nanopore)'

Caused by:
  Process `canu (N16005_Nanopore)` terminated with an error exit status (1)

Command executed:

  canu \
  -p N16005_Nanopore genomeSize=2.8m -nanopore-raw N16005_Nanopore.fastq gnuplotTested=true \
  correctedErrorRate=0.144 \
  rawErrorRate=0.500 \
  minReadLength=1000 \
  minOverlapLength=500

Command exit status:
  1

Command output:
  
  usage: canu [-correct | -trim | -assemble | -trim-assemble] \
              [-s <assembly-specifications-file>] \
               -p <assembly-prefix> \
               -d <assembly-directory> \
               genomeSize=<number>[g|m|k] \
               errorRate=0.X \
              [other-options] \
              [-pacbio-raw | -pacbio-corrected | -nanopore-raw | -nanopore-corrected] *fastq
  
    By default, all three stages (correct, trim, assemble) are computed.
    To compute only a single stage, use:
      -correct       - generate corrected reads
      -trim          - generate trimmed reads
      -assemble      - generate an assembly
      -trim-assemble - generate trimmed reads and then assemble them
  
    The assembly is computed in the (created) -d <assembly-directory>, with most
    files named using the -p <assembly-prefix>.
  
    The genome size is your best guess of the genome size of what is being assembled.
    It is used mostly to compute coverage in reads.  Fractional values are allowed: '4.7m'
    is the same as '4700k' and '4700000'
  
    The errorRate is not used correctly (we're working on it).  Don't set it
    If you want to change the defaults, use the various utg*ErrorRate options.
  
    A full list of options can be printed with '-options'.  All options
    can be supplied in an optional sepc file.
  
    Reads can be either FASTA or FASTQ format, uncompressed, or compressed
    with gz, bz2 or xz.  Reads are specified by the technology they were
    generated with:
      -pacbio-raw         <files>
      -pacbio-corrected   <files>
      -nanopore-raw       <files>
      -nanopore-corrected <files>
  
  Complete documentation at http://canu.readthedocs.org/en/latest/
  
  ERROR:  Directory not supplied with -d.
  ERROR:  Paramter 'correctedErrorRate' is not known.
  ERROR:  Paramter 'rawErrorRate' is not known.
executor >  local (3)
[-        ] process > get_software_versions                      -
[-        ] process > fastqc (N16005_DDMS210004243-1a_HFMWLDSX2) -
[25/884a7c] process > canu (N16005_Nanopore)                     [100%] 1 of 1, failed: 1 ✘
[-        ] process > minimap                                    -
[-        ] process > pilon                                      -
[-        ] process > quast_canu                                 -
[-        ] process > multiqc                                    -

Command wrapper:
  
  usage: canu [-correct | -trim | -assemble | -trim-assemble] \
              [-s <assembly-specifications-file>] \
               -p <assembly-prefix> \
               -d <assembly-directory> \
               genomeSize=<number>[g|m|k] \
               errorRate=0.X \
              [other-options] \
              [-pacbio-raw | -pacbio-corrected | -nanopore-raw | -nanopore-corrected] *fastq
  
    By default, all three stages (correct, trim, assemble) are computed.
    To compute only a single stage, use:
      -correct       - generate corrected reads
      -trim          - generate trimmed reads
      -assemble      - generate an assembly
      -trim-assemble - generate trimmed reads and then assemble them
  
    The assembly is computed in the (created) -d <assembly-directory>, with most
    files named using the -p <assembly-prefix>.
  
    The genome size is your best guess of the genome size of what is being assembled.
    It is used mostly to compute coverage in reads.  Fractional values are allowed: '4.7m'
    is the same as '4700k' and '4700000'
  
    The errorRate is not used correctly (we're working on it).  Don't set it
    If you want to change the defaults, use the various utg*ErrorRate options.
  
    A full list of options can be printed with '-options'.  All options
    can be supplied in an optional sepc file.
  
    Reads can be either FASTA or FASTQ format, uncompressed, or compressed
    with gz, bz2 or xz.  Reads are specified by the technology they were
    generated with:
      -pacbio-raw         <files>
      -pacbio-corrected   <files>
      -nanopore-raw       <files>
      -nanopore-corrected <files>
  
  Complete documentation at http://canu.readthedocs.org/en/latest/
  
  ERROR:  Directory not supplied with -d.
  ERROR:  Paramter 'correctedErrorRate' is not known.
  ERROR:  Paramter 'rawErrorRate' is not known.

Work dir:
  /data01/nextflow_test/1_Raw_data/work/25/884a7c241f4f68374c5fc59fcf6d14

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`
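
Following the tip above, here is a minimal shell sketch for inspecting the failed task. The default path is the work dir reported in the error; `show_task_script` is a hypothetical helper name, not part of Nextflow. It assumes the usual per-task files (`.command.sh`, `.command.err`, `.command.log`) are present and silently skips any that are missing.

```shell
#!/bin/sh
# Print the task script and its captured logs from a Nextflow work dir.
# Default path: the work dir from the error report; pass another path to override.
show_task_script() {
  workdir="${1:-/data01/nextflow_test/1_Raw_data/work/25/884a7c241f4f68374c5fc59fcf6d14}"
  for f in .command.sh .command.err .command.log; do
    # Only print files that actually exist in the work dir.
    if [ -f "$workdir/$f" ]; then
      echo "== $f =="
      cat "$workdir/$f"
    fi
  done
}

show_task_script "$@"
```

Run without arguments, it reads the work dir shown above; the `.command.sh` output should match the `canu` command listed under "Command executed".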