[Bug]: Error executing process > 'pipeline:report (1)' when using sample_sheet

Question

davidbante opened this issue a year ago · 3 comments

What happened?

When providing a sample_sheet to wf-artic, pipeline:report terminates with an error. All files are nonetheless written to the output directory as expected, except for the HTML report. After omitting the sample_sheet, the workflow finishes just fine.

Initially I thought it might be related to issue #79, but the error persists.

Command: nextflow run epi2me-labs/wf-artic --fastq 01_minknowguppy6.4.6_fastq/fastq_pass/ --scheme_version Midnight-ONT/V3 --out_dir 02_artic-out --artic_threads 12 --pangolin_threads 12 --update_data -c ~/seqresults/nextflow_config.cfg --sample_sheet ../sc2_samplesheets/20230508_samplesheet.csv

For more:
20230531_nextflow_samplesheet_fail.txt

Operating System

ubuntu 20.04

Workflow Execution

Command line

Workflow Execution - EPI2ME Labs Versions

No response

Workflow Execution - CLI Execution Profile

Docker

Workflow Version

v0.3.28-g428300f

Relevant log output

ERROR ~ Error executing process > 'pipeline:report (1)'

Caused by:
  Process `pipeline:report (1)` terminated with an error exit status (1)

Command executed:

  echo "--pangolin pangolin.csv"
      echo "--nextclade nextclade.json"
      echo '[
      {
          "barcode": "barcode66",
          "type": "test_sample",
          "alias": "55267002"
      },
      {
          "barcode": "barcode63",
          "type": "test_sample",
          "alias": "41417806"
      },
      {
          "barcode": "barcode58",
          "type": "test_sample",
          "alias": "42219738"
      },
      {
          "barcode": "barcode65",
          "type": "test_sample",
          "alias": "70105391"
      },
      {
          "barcode": "barcode85",
          "type": "test_sample",
          "alias": "G43-1_170223"
      },
      {
          "barcode": "barcode71",
          "type": "test_sample",
          "alias": "D14-2_210521"
      },
      {
          "barcode": "barcode91",
          "type": "test_sample",
          "alias": "G57-3_280323"
      },
      {
          "barcode": "barcode59",
          "type": "test_sample",
          "alias": "40922561"
      },
      {
          "barcode": "barcode95",
          "type": "test_sample",
          "alias": "F73-3_091222"
      },
      {
          "barcode": "barcode89",
          "type": "test_sample",
          "alias": "G57-1_100323"
      },
      {
          "barcode": "barcode60",
          "type": "test_sample",
          "alias": "43035961"
      },
      {
          "barcode": "barcode81",
          "type": "test_sample",
          "alias": "G15-2_020123"
      },
      {
          "barcode": "barcode61",
          "type": "test_sample",
          "alias": "55930808"
      },
      {
          "barcode": "barcode68",
          "type": "test_sample",
          "alias": "PER400064671"
      },
      {
          "barcode": "barcode57",
          "type": "test_sample",
          "alias": "40944667"
      },
      {
          "barcode": "barcode83",
          "type": "test_sample",
          "alias": "G37-3_140223"
      },
      {
          "barcode": "barcode64",
          "type": "test_sample",
          "alias": "41775697"
      },
      {
          "barcode": "barcode70",
          "type": "test_sample",
          "alias": "H23-09370T"
      },
      {
          "barcode": "barcode62",
          "type": "test_sample",
          "alias": "PER400116966"
      },
      {
          "barcode": "barcode90",
          "type": "test_sample",
          "alias": "G57-2_100323"
      },
      {
          "barcode": "barcode93",
          "type": "test_sample",
          "alias": "G60-1_060323"
      },
      {
          "barcode": "barcode72",
          "type": "test_sample",
          "alias": "E73-1_200522"
      },
      {
          "barcode": "barcode84",
          "type": "test_sample",
          "alias": "G41-2_210223"
      },
      {
          "barcode": "barcode82",
          "type": "test_sample",
          "alias": "G15-3_020123"
      },
      {
          "barcode": "barcode86",
          "type": "test_sample",
          "alias": "G52-1_100323"
      },
      {
          "barcode": "barcode67",
          "type": "test_sample",
          "alias": "PER400273007"
      },
      {
          "barcode": "barcode94",
          "type": "test_sample",
          "alias": "G64-1_030423"
      },
      {
          "barcode": "barcode87",
          "type": "test_sample",
          "alias": "G52-2_180323"
      },
      {
          "barcode": "barcode92",
          "type": "test_sample",
          "alias": "G60_OM"
      },
      {
          "barcode": "barcode69",
          "type": "test_sample",
          "alias": "PER400106437"
      },
      {
          "barcode": "barcode73",
          "type": "test_sample",
          "alias": "F96-2_131222"
      },
      {
          "barcode": "barcode88",
          "type": "test_sample",
          "alias": "G53-1_060323"
      }
  ]' > metadata.json
      workflow-glue report         consensus_status.txt wf-artic-report.html         --pangolin pangolin.csv           --nextclade nextclade.json         --nextclade_errors consensus.errors.csv         --revision master         --commit 428300f74d76e4af18048799db37beb754ec6475         --min_len 150         --max_len 1200         --report_depth 100         --depths depth_stats/*         --fastcat_stats per-read-stats.tsv         --bcftools_stats vcf_stats/*          --versions versions         --params params.json         --consensus_fasta consensus_fasta         --metadata metadata.json

Command exit status:
  1

Command output:
  --pangolin pangolin.csv
  --nextclade nextclade.json

Command error:
  --pangolin pangolin.csv
  --nextclade nextclade.json
  /home/epi2melabs/conda/lib/python3.8/site-packages/pkg_resources/__init__.py:121: DeprecationWarning: pkg_resources is deprecated as an API
    warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning)
  /home/epi2melabs/conda/lib/python3.8/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)
  /home/epi2melabs/conda/lib/python3.8/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('mpl_toolkits')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)
  /home/epi2melabs/conda/lib/python3.8/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)
  [11:38:00 - workflow_glue] Starting entrypoint.
  /home/nanouser/.nextflow/assets/epi2me-labs/wf-artic/bin/workflow_glue/report.py:96: DtypeWarning: Columns (2) have mixed types. Specify dtype option on import or set low_memory=False.
    seq_summary = pd.read_csv(args.fastcat_stats, delimiter="\t")
  Traceback (most recent call last):
    File "/home/nanouser/.nextflow/assets/epi2me-labs/wf-artic/bin/workflow-glue", line 7, in <module>

Answer 1 · 2023-06-09T14:48:53.000Z

Thanks for this @davidbante,

I think this is due to having a mixture of sample names that are integers and strings, e.g. 40944667 and G37-3_140223. Pandas is reporting that this results in a column with mixed data types.
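For illustration, here is a minimal sketch of how pandas infers column types. It assumes a small hypothetical per-read table; the column names and values are made up for the example and are not the workflow's actual file or code. A column holding only numeric-looking sample names is inferred as integers, whereas passing `dtype` for that column keeps every name as text:

```python
import io

import pandas as pd

# Hypothetical per-read table (illustrative column names, not the real file).
numeric_only = (
    "read_id\tsample_name\tread_length\n"
    "r1\t40944667\t512\n"
    "r2\t41775697\t498\n"
)

# Default behaviour: pandas guesses the type from the values it sees.
inferred = pd.read_csv(io.StringIO(numeric_only), delimiter="\t")

# Explicit dtype: sample names are always read as strings.
as_text = pd.read_csv(
    io.StringIO(numeric_only),
    delimiter="\t",
    dtype={"sample_name": str},
)

print(inferred["sample_name"].dtype)
print(as_text["sample_name"].tolist())
```

On a large file read in chunks, this kind of guessing can differ from chunk to chunk, which is the situation the `DtypeWarning` in the log describes.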

We'll fix this, but in the meantime please add a prefix to the integer sample names, e.g. sample_40944667, and see if that solves your problem.
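If there are many samples, the renaming could be scripted. The following is only a sketch under assumptions: it presumes a CSV sample sheet with `barcode`, `alias` and `type` columns (the fields that appear in the metadata above), and the helper name `prefix_numeric_aliases` and the `sample_` prefix are hypothetical choices for this example.

```python
import csv
import io


def prefix_numeric_aliases(sheet_text, prefix="sample_"):
    """Return the sample sheet text with all-digit aliases prefixed."""
    reader = csv.DictReader(io.StringIO(sheet_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Only aliases made up entirely of digits are changed.
        if row["alias"].isdigit():
            row["alias"] = prefix + row["alias"]
        writer.writerow(row)
    return out.getvalue()


# Small in-memory example using two aliases from the log above.
sheet = (
    "barcode,alias,type\n"
    "barcode57,40944667,test_sample\n"
    "barcode83,G37-3_140223,test_sample\n"
)
fixed = prefix_numeric_aliases(sheet)
print(fixed)
```

For a real sample sheet, the text would be read from and written back to a file; aliases that already contain letters are left unchanged.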

Thanks

Matt

Answer 2 · 2023-06-12T08:30:36.000Z

Hi Matt,

Mixed integer and string sample names used to work in the past. With all sample names changed to strings, the workflow now runs.
Thanks for your help!

David

Answer 3 · 2023-06-12T08:53:33.000Z

Thanks @davidbante,

We'll fix! Thanks again... I'll close for now