[Bug]: Error executing process > 'pipeline:report (1)' when using sample_sheet
davidbante opened this issue · 3 comments
What happened?
When providing a sample_sheet to wf-artic, pipeline:report terminates with an error. All files are nonetheless written to the output directory as expected, except for the HTML report. After omitting the sample_sheet, the workflow finishes just fine.
Initially I thought it might be related to issue #79, but the error persists.
Command: nextflow run epi2me-labs/wf-artic --fastq 01_minknowguppy6.4.6_fastq/fastq_pass/ --scheme_version Midnight-ONT/V3 --out_dir 02_artic-out --artic_threads 12 --pangolin_threads 12 --update_data -c ~/seqresults/nextflow_config.cfg --sample_sheet ../sc2_samplesheets/20230508_samplesheet.csv
For more:
20230531_nextflow_samplesheet_fail.txt
Operating System
ubuntu 20.04
Workflow Execution
Command line
Workflow Execution - EPI2ME Labs Versions
No response
Workflow Execution - CLI Execution Profile
Docker
Workflow Version
v0.3.28-g428300f
Relevant log output
ERROR ~ Error executing process > 'pipeline:report (1)'
Caused by:
Process `pipeline:report (1)` terminated with an error exit status (1)
Command executed:
echo "--pangolin pangolin.csv"
echo "--nextclade nextclade.json"
echo '[
{
"barcode": "barcode66",
"type": "test_sample",
"alias": "55267002"
},
{
"barcode": "barcode63",
"type": "test_sample",
"alias": "41417806"
},
{
"barcode": "barcode58",
"type": "test_sample",
"alias": "42219738"
},
{
"barcode": "barcode65",
"type": "test_sample",
"alias": "70105391"
},
{
"barcode": "barcode85",
"type": "test_sample",
"alias": "G43-1_170223"
},
{
"barcode": "barcode71",
"type": "test_sample",
"alias": "D14-2_210521"
},
{
"barcode": "barcode91",
"type": "test_sample",
"alias": "G57-3_280323"
},
{
"barcode": "barcode59",
"type": "test_sample",
"alias": "40922561"
},
{
"barcode": "barcode95",
"type": "test_sample",
"alias": "F73-3_091222"
},
{
"barcode": "barcode89",
"type": "test_sample",
"alias": "G57-1_100323"
},
{
"barcode": "barcode60",
"type": "test_sample",
"alias": "43035961"
},
{
"barcode": "barcode81",
"type": "test_sample",
"alias": "G15-2_020123"
},
{
"barcode": "barcode61",
"type": "test_sample",
"alias": "55930808"
},
{
"barcode": "barcode68",
"type": "test_sample",
"alias": "PER400064671"
},
{
"barcode": "barcode57",
"type": "test_sample",
"alias": "40944667"
},
{
"barcode": "barcode83",
"type": "test_sample",
"alias": "G37-3_140223"
},
{
"barcode": "barcode64",
"type": "test_sample",
"alias": "41775697"
},
{
"barcode": "barcode70",
"type": "test_sample",
"alias": "H23-09370T"
},
{
"barcode": "barcode62",
"type": "test_sample",
"alias": "PER400116966"
},
{
"barcode": "barcode90",
"type": "test_sample",
"alias": "G57-2_100323"
},
{
"barcode": "barcode93",
"type": "test_sample",
"alias": "G60-1_060323"
},
{
"barcode": "barcode72",
"type": "test_sample",
"alias": "E73-1_200522"
},
{
"barcode": "barcode84",
"type": "test_sample",
"alias": "G41-2_210223"
},
{
"barcode": "barcode82",
"type": "test_sample",
"alias": "G15-3_020123"
},
{
"barcode": "barcode86",
"type": "test_sample",
"alias": "G52-1_100323"
},
{
"barcode": "barcode67",
"type": "test_sample",
"alias": "PER400273007"
},
{
"barcode": "barcode94",
"type": "test_sample",
"alias": "G64-1_030423"
},
{
"barcode": "barcode87",
"type": "test_sample",
"alias": "G52-2_180323"
},
{
"barcode": "barcode92",
"type": "test_sample",
"alias": "G60_OM"
},
{
"barcode": "barcode69",
"type": "test_sample",
"alias": "PER400106437"
},
{
"barcode": "barcode73",
"type": "test_sample",
"alias": "F96-2_131222"
},
{
"barcode": "barcode88",
"type": "test_sample",
"alias": "G53-1_060323"
}
]' > metadata.json
workflow-glue report consensus_status.txt wf-artic-report.html --pangolin pangolin.csv --nextclade nextclade.json --nextclade_errors consensus.errors.csv --revision master --commit 428300f74d76e4af18048799db37beb754ec6475 --min_len 150 --max_len 1200 --report_depth 100 --depths depth_stats/* --fastcat_stats per-read-stats.tsv --bcftools_stats vcf_stats/* --versions versions --params params.json --consensus_fasta consensus_fasta --metadata metadata.json
Command exit status:
1
Command output:
--pangolin pangolin.csv
--nextclade nextclade.json
Command error:
--pangolin pangolin.csv
--nextclade nextclade.json
/home/epi2melabs/conda/lib/python3.8/site-packages/pkg_resources/__init__.py:121: DeprecationWarning: pkg_resources is deprecated as an API
warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning)
/home/epi2melabs/conda/lib/python3.8/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
/home/epi2melabs/conda/lib/python3.8/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('mpl_toolkits')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
/home/epi2melabs/conda/lib/python3.8/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
[11:38:00 - workflow_glue] Starting entrypoint.
/home/nanouser/.nextflow/assets/epi2me-labs/wf-artic/bin/workflow_glue/report.py:96: DtypeWarning: Columns (2) have mixed types. Specify dtype option on import or set low_memory=False.
seq_summary = pd.read_csv(args.fastcat_stats, delimiter="\t")
Traceback (most recent call last):
File "/home/nanouser/.nextflow/assets/epi2me-labs/wf-artic/bin/workflow-glue", line 7, in <module>
Thanks for this @davidbante,
I think this is due to having a mixture of sample names that are integers and strings, e.g. 40944667 and G37-3_140223. Pandas is reporting that this results in a column with mixed data types.
We'll fix this, but in the meantime please add a prefix to the integer sample names, e.g. sample_40944667, and see if that solves your problem.
Thanks
Matt
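For readers hitting the same DtypeWarning, here is a minimal sketch of how a column can end up with mixed types. It is not the wf-artic fix. The column names (`read_id`, `filename`, `sample_name`) and the two in-memory "chunks" are assumptions made for illustration. pandas documents that `low_memory=True` parses a file in chunks and may infer types per chunk, so the `pd.concat` below only imitates that behaviour on a tiny input.

```python
import io

import pandas as pd

# Hypothetical stand-ins for two chunks of a per-read stats TSV.
# The column names are assumed, not taken from the workflow.
numeric_chunk = "read_id\tfilename\tsample_name\nr1\ta.fastq\t40944667\n"
text_chunk = "read_id\tfilename\tsample_name\nr2\tb.fastq\tG37-3_140223\n"


def read_chunk(text, **kwargs):
    """Parse one tab-delimited chunk from an in-memory string."""
    return pd.read_csv(io.StringIO(text), delimiter="\t", **kwargs)


# With inferred types, the first chunk parses as integers and the second
# as text, so the combined column holds values of both kinds.
mixed = pd.concat(
    [read_chunk(numeric_chunk), read_chunk(text_chunk)], ignore_index=True
)
mixed_is_str = [isinstance(v, str) for v in mixed["sample_name"]]

# Declaring the column as str up front keeps every value a string.
as_text = {"sample_name": str}
forced = pd.concat(
    [read_chunk(numeric_chunk, dtype=as_text), read_chunk(text_chunk, dtype=as_text)],
    ignore_index=True,
)
forced_is_str = [isinstance(v, str) for v in forced["sample_name"]]

print(mixed_is_str)
print(forced_is_str)
```

Passing `dtype` for the sample-name column (or `low_memory=False`, as the warning itself suggests) is one possible way to avoid the mixed column. Whether that is how the workflow will be fixed is not stated in this thread.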
Hi Matt,
Mixed integer and string sample names used to work in the past. Now that all sample names are strings, the workflow runs fine.
Thanks for your help!
David
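The workaround discussed above (prefixing purely numeric sample names) could be scripted roughly as follows. This is a sketch under assumptions: the sample sheet is taken to be a CSV with `barcode`, `alias` and `type` columns, matching the keys seen in metadata.json, and `prefix_numeric_aliases` is a hypothetical helper, not part of wf-artic.

```python
import csv
import io


def prefix_numeric_aliases(rows, column="alias", prefix="sample_"):
    """Return copies of the rows with purely numeric aliases prefixed."""
    fixed = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        if row[column].isdigit():
            row[column] = prefix + row[column]
        fixed.append(row)
    return fixed


# Assumed sample sheet layout; the real column names may differ.
sheet = """barcode,alias,type
barcode57,40944667,test_sample
barcode83,G37-3_140223,test_sample
"""

rows = list(csv.DictReader(io.StringIO(sheet)))
fixed = prefix_numeric_aliases(rows)

for row in fixed:
    print(row["barcode"], row["alias"], row["type"])
```

Only aliases made up entirely of digits are changed, so names such as G37-3_140223 pass through as they are.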
Thanks @davidbante,
We'll fix it! Thanks again. I'll close this for now.