epi2me-labs/wf-artic

[Bug]: Error executing process > 'handleSingleFile'

Rohit-Satyam opened this issue · 8 comments

What happened?

The workflow is unable to take a fastq or fastq.gz file as input.

Operating System

ubuntu 20.04

Workflow Execution

Command line

Workflow Execution - EPI2ME Labs Versions

No response

Workflow Execution - Execution Profile

Docker

Workflow Version

v0.3.18

Relevant log output

nextflow run epi2me-labs/wf-artic --fastq ~/Documents/COVID_Project/longread/dataset2/results/01_read_filtering/0055219_barcode49.fastq --scheme_version ARTIC/V4.1 --out_dir dataset2_output
N E X T F L O W  ~  version 21.10.6
Launching `epi2me-labs/wf-artic` [awesome_poitras] - revision: a60a1e1e73 [master]

WARN: Found unexpected parameters:
* --scheme_dir: primer_schemes
- Ignore this warning: params.schema_ignore_params = "scheme_dir" 

Core Nextflow options
  revision       : master
  runName        : awesome_poitras
  containerEngine: docker
  launchDir      : /home/subudhak/Documents/COVID_Project/longread/dataset2/results
  workDir        : /home/subudhak/Documents/COVID_Project/longread/dataset2/results/work
  projectDir     : /home/subudhak/.nextflow/assets/epi2me-labs/wf-artic
  userName       : subudhak
  profile        : standard
  configFiles    : /home/subudhak/.nextflow/assets/epi2me-labs/wf-artic/nextflow.config

Basic Input/Output Options
  out_dir        : dataset2_output
  fastq          : /home/subudhak/Documents/COVID_Project/longread/dataset2/results/01_read_filtering/0055219_barcode49.fastq

Primer Scheme Selection
  scheme_version : ARTIC/V4.1

Advanced options
  normalise      : 200

!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
If you use epi2me-labs/wf-artic for your analysis please cite:

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x




      ------------------------------------
      Available Primer Schemes:
      ------------------------------------
      
  Name		Version
  spike-seq	ONT/V1	
  spike-seq	ONT/V4.1	
  SARS-CoV-2	NEB-VarSkip/v2b	
  SARS-CoV-2	NEB-VarSkip/v2	
  SARS-CoV-2	NEB-VarSkip/v1a-long	
  SARS-CoV-2	NEB-VarSkip/v1a	
  SARS-CoV-2	Midnight-IDT/V1	
  SARS-CoV-2	ARTIC/V2	
  SARS-CoV-2	ARTIC/V1	
  SARS-CoV-2	ARTIC/V4	
  SARS-CoV-2	ARTIC/V4.1	
  SARS-CoV-2	ARTIC/V3	
  SARS-CoV-2	Midnight-ONT/V2	
  SARS-CoV-2	Midnight-ONT/V1	
  SARS-CoV-2	Midnight-ONT/V3	

      ------------------------------------
      
Checking fastq input.
Single file input detected.
executor >  local (5)
[c3/60f565] process > handleSingleFile (1)    [100%] 1 of 1, failed: 1 ✘
[3d/0aeed3] process > pipeline:getVersions    [  0%] 0 of 1
[65/94cee2] process > pipeline:getParams      [  0%] 0 of 1
[26/9faa6b] process > pipeline:copySchemeDir  [  0%] 0 of 1
[-        ] process > pipeline:preArticQC     -
[-        ] process > pipeline:runArtic       -
[-        ] process > pipeline:combineDepth   -
[-        ] process > pipeline:allConsensus   -
[-        ] process > pipeline:allVariants    [  0%] 0 of 1
[45/37c873] process > pipeline:prep_nextclade [  0%] 0 of 1
[-        ] process > pipeline:nextclade      -
[-        ] process > pipeline:pangolin       -
[-        ] process > pipeline:telemetry      -
[-        ] process > pipeline:report         -
[-        ] process > output                  -
WARN: Input tuple does not match input set cardinality declared by process `pipeline:telemetry` -- offending value: [[]]
WARN: Input tuple does not match input set cardinality declared by process `pipeline:allVariants` -- offending value: [[]]
Error executing process > 'handleSingleFile (1)'

Caused by:
  Process `handleSingleFile (1)` terminated with an error exit status (127)

Command executed:

  mkdir 0055219_barcode49
  mv 0055219_barcode49.fastq 0055219_barcode49

Command exit status:
  127

Command output:
  (empty)

Command error:
  .command.run: line 279: docker: command not found

Work dir:
  /home/subudhak/Documents/COVID_Project/longread/dataset2/results/work/c3/60f565cf743de304ec4990011ed328

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line


WARN: Killing pending tasks (2)
cjw85 commented

The log above states:

  .command.run: line 279: docker: command not found

It appears that you are running with the default docker option, but do not have docker installed on your system. Please refer to our installation instructions: https://labs.epi2me.io/wfindex/#installation
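Exit status 127 in the log means the shell could not find the `docker` executable at all. A minimal pre-flight check (not part of the workflow itself) can confirm whether docker is on `PATH` before launching with the default profile:

```shell
#!/bin/sh
# Exit status 127 from a Nextflow task usually means "command not found".
# Check for the docker binary before running with the default (docker) profile.
if command -v docker >/dev/null 2>&1; then
    echo "docker found at: $(command -v docker)"
else
    echo "docker not found on PATH; install docker or use another profile"
fi
```

If docker is present but the daemon is not running, `docker info` will fail even though the binary is found, so both are worth checking.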

I ran the pipeline with `-profile conda` and this time a few steps completed successfully, but then a new error appeared:

(wfartic) subudhak@KW61216:~/Documents/test$ nextflow run epi2me-labs/wf-artic -profile conda --fastq ~/Documents/COVID_Project/longread/dataset2/results/01_read_filtering/0055219_barcode49.fastq --out_dir testing --scheme_version ARTIC/V4.1
N E X T F L O W  ~  version 22.04.5
Launching `https://github.com/epi2me-labs/wf-artic` [dreamy_wescoff] DSL2 - revision: a60a1e1e73 [master]

WARN: Found unexpected parameters:
* --scheme_dir: primer_schemes
- Ignore this warning: params.schema_ignore_params = "scheme_dir" 

Core Nextflow options
  revision      : master
  runName       : dreamy_wescoff
  launchDir     : /home/subudhak/Documents/test
  workDir       : /home/subudhak/Documents/test/work
  projectDir    : /home/subudhak/.nextflow/assets/epi2me-labs/wf-artic
  userName      : subudhak
  profile       : conda
  configFiles   : /home/subudhak/.nextflow/assets/epi2me-labs/wf-artic/nextflow.config

Basic Input/Output Options
  out_dir       : testing
  fastq         : /home/subudhak/Documents/COVID_Project/longread/dataset2/results/01_read_filtering/0055219_barcode49.fastq

Primer Scheme Selection
  scheme_version: ARTIC/V4.1

Advanced options
  normalise     : 200

!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
If you use epi2me-labs/wf-artic for your analysis please cite:

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x




      ------------------------------------
      Available Primer Schemes:
      ------------------------------------
      
  Name		Version
  spike-seq	ONT/V1	
  spike-seq	ONT/V4.1	
  SARS-CoV-2	NEB-VarSkip/v2b	
  SARS-CoV-2	NEB-VarSkip/v2	
  SARS-CoV-2	NEB-VarSkip/v1a-long	
  SARS-CoV-2	NEB-VarSkip/v1a	
  SARS-CoV-2	Midnight-IDT/V1	
  SARS-CoV-2	ARTIC/V2	
  SARS-CoV-2	ARTIC/V1	
  SARS-CoV-2	ARTIC/V4	
  SARS-CoV-2	ARTIC/V4.1	
  SARS-CoV-2	ARTIC/V3	
  SARS-CoV-2	Midnight-ONT/V2	
  SARS-CoV-2	Midnight-ONT/V1	
  SARS-CoV-2	Midnight-ONT/V3	

      ------------------------------------
      
Checking fastq input.
Single file input detected.
[-        ] process > handleSingleFile        -
executor >  local (6)
[8f/00286e] process > handleSingleFile (1)    [100%] 1 of 1 ✔
[fd/300385] process > pipeline:getVersions    [  0%] 0 of 1
[78/1c8cb4] process > pipeline:getParams      [100%] 1 of 1 ✔
[09/027dc7] process > pipeline:copySchemeDir  [  0%] 0 of 1
[17/715685] process > pipeline:preArticQC (1) [  0%] 0 of 1
[-        ] process > pipeline:runArtic       -
[-        ] process > pipeline:combineDepth   -
[-        ] process > pipeline:allConsensus   -
[-        ] process > pipeline:allVariants    -
[80/260e62] process > pipeline:prep_nextclade [100%] 1 of 1 ✔
[-        ] process > pipeline:nextclade      -
[-        ] process > pipeline:pangolin       -
executor >  local (6)
[8f/00286e] process > handleSingleFile (1)    [100%] 1 of 1 ✔
[fd/300385] process > pipeline:getVersions    [100%] 1 of 1, failed: 1 ✘
[78/1c8cb4] process > pipeline:getParams      [100%] 1 of 1 ✔
[-        ] process > pipeline:copySchemeDir  -
[-        ] process > pipeline:preArticQC (1) -
[-        ] process > pipeline:runArtic       -
[-        ] process > pipeline:combineDepth   -
[-        ] process > pipeline:allConsensus   -
[-        ] process > pipeline:allVariants    -
[80/260e62] process > pipeline:prep_nextclade [100%] 1 of 1 ✔
[-        ] process > pipeline:nextclade      -
[-        ] process > pipeline:pangolin       -
[-        ] process > pipeline:telemetry      -
[-        ] process > pipeline:report         -
[-        ] process > output                  -
Error executing process > 'pipeline:getVersions'

Caused by:
  Process `pipeline:getVersions` terminated with an error exit status (1)

Command executed:

  medaka --version | sed 's/ /,/' >> versions.txt
  minimap2 --version | sed 's/^/minimap2,/' >> versions.txt
  bcftools --version | head -n 1 | sed 's/ /,/' >> versions.txt
  samtools --version | head -n 1 | sed 's/ /,/' >> versions.txt
  artic --version | sed 's/ /,/' >> versions.txt

Command exit status:
  1

Command output:
  (empty)

Command error:
  Traceback (most recent call last):
    File "/home/subudhak/Documents/test/work/conda/epi2melabs-nf-artic-69fb201e5af3477012a411f0f53f1cd9/bin/medaka", line 7, in <module>
      from medaka.medaka import main
    File "/home/subudhak/Documents/test/work/conda/epi2melabs-nf-artic-69fb201e5af3477012a411f0f53f1cd9/lib/python3.8/site-packages/medaka/medaka.py", line 11, in <module>
      import medaka.models
    File "/home/subudhak/Documents/test/work/conda/epi2melabs-nf-artic-69fb201e5af3477012a411f0f53f1cd9/lib/python3.8/site-packages/medaka/models.py", line 7, in <module>
      import requests
    File "/home/subudhak/.local/lib/python3.8/site-packages/requests/__init__.py", line 44, in <module>
      import chardet
  ModuleNotFoundError: No module named 'chardet'

Work dir:
  /home/subudhak/Documents/test/work/fd/300385ca6639e88faa7e197c752f92

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

I tried installing chardet with `mamba install -c conda-forge chardet`, but the workflow is unable to detect it.
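The traceback shows `requests` being imported from `/home/subudhak/.local/lib/python3.8/site-packages`, not from the conda environment: a stale user-site install is shadowing the env's own packages, which is why installing chardet into the env does not help. One possible workaround (an assumption based on the traceback, not an official fix) is to disable user site-packages for the run:

```shell
#!/bin/sh
# The traceback imports requests from ~/.local, i.e. user site-packages is
# shadowing the conda env. Disabling it makes only the env's packages importable.

# Is user site-packages currently enabled for this interpreter?
python3 -c "import site; print(site.ENABLE_USER_SITE)"

# Setting PYTHONNOUSERSITE to a non-empty value disables it:
export PYTHONNOUSERSITE=1
python3 -c "import site; print(site.ENABLE_USER_SITE)"   # prints False
```

Exporting `PYTHONNOUSERSITE=1` before invoking `nextflow run` should then stop the env's medaka from picking up the broken `~/.local` copy of requests.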

Update: I tried installing docker and running the command again. The docker version works, but I wish to run this pipeline on a cluster using conda, where we don't have docker.

nextflow run epi2me-labs/wf-artic --fastq ~/Documents/COVID_Project/longread/dataset2/results/01_read_filtering/0055219_barcode49.fastq --out_dir testing --scheme_version ARTIC/V4.1
cjw85 commented

Do you have any other container runtime available on your cluster, such as singularity?

Unfortunately we've had various problems using conda with our workflows, and intend to deprecate its use in favour of containers.

Yes, we have Singularity.

I was able to run the pipeline using singularity on the cluster. One last query: I have multiple fastq.gz files. When I give an entire directory containing multiple fastq files, I get a single BAM file. How can I get separate BAM and VCF files and a consensus FASTA file per sample? Using `--fastq data/*.gz` didn't work.

cjw85 commented

The workflow assumes a directory structure as would be output by the MinKNOW sequencing device software. So something like:

- fastq_pass
    - barcode01
        - *.fastq.gz
    - barcode02
        - *.fastq.gz
    - barcode03
        - *.fastq.gz

If you want to analyse fastq files independently, you will need to place them in subfolders of the directory provided as input to the workflow. In the above example, passing --fastq fastq_pass would result in the three barcodeXX directories being analysed independently.
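The reorganisation described above can be sketched as a small script. The `data/` and `sampleA`/`sampleB` names are hypothetical stand-ins for the user's own files; the demo setup at the top exists only so the script is runnable as-is:

```shell
#!/usr/bin/env bash
set -eu

# Demo setup (hypothetical file names): two loose per-sample files in data/.
mkdir -p data
printf '@r1\nACGT\n+\n!!!!\n' | gzip > data/sampleA.fastq.gz
printf '@r1\nACGT\n+\n!!!!\n' | gzip > data/sampleB.fastq.gz

# Give each file its own subfolder, mirroring MinKNOW's fastq_pass layout,
# so each sample is analysed independently by the workflow.
mkdir -p fastq_pass
for f in data/*.fastq.gz; do
    name=$(basename "$f" .fastq.gz)
    mkdir -p "fastq_pass/$name"
    cp "$f" "fastq_pass/$name/"
done

ls fastq_pass
```

Afterwards, pointing the workflow at the top-level directory (`--fastq fastq_pass`) should yield per-sample outputs, following the structure cjw85 describes.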

Thanks. My issue is resolved, so I am closing this.