Missing barcodes from sample input break the pipeline
ammaraziz opened this issue · 3 comments
Operating System
Ubuntu 22.04
Other Linux
No response
Workflow Version
v0.3.31-gcb0ad4a
Workflow Execution
Command line
EPI2ME Version
No response
CLI command run
nextflow run epi2me-labs/wf-artic -c covid.cfg --fastq data/ --out_dir results/ \
--custom_scheme .../custom_scheme/V1/ --sample_sheet data/barcodes.csv \
--update_data --normalise 400 --basecaller_cfg dna_r10.4.1_e8.2_400bps_hac \
--min_len 150 --max_len 1200
Sample sheet (example):
| barcode | sample_id | alias | type |
|---|---|---|---|
| barcode1 | X1 | X1 | test_sample |
| barcode9 | X2 | X2 | test_sample |
| barcode12 | X3 | X3 | test_sample |
| ... | | | |
Note that barcode1 should be barcode01.
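A minimal sketch of how the barcode column could be zero-padded before the sheet is handed to the workflow. It assumes, based only on the note above and the `barcodeNN` directory names in the log, that two-digit names are expected. `normalise_barcode` and `normalise_sheet` are hypothetical helpers, not part of wf-artic.

```python
import re

# Matches names such as "barcode1" or "barcode12"; anything else is left alone.
_BARCODE_RE = re.compile(r"^(barcode)(\d+)$")


def normalise_barcode(name: str, width: int = 2) -> str:
    """Zero-pad the numeric part of a barcode name (assumed width: 2 digits)."""
    match = _BARCODE_RE.match(name.strip())
    if match is None:
        # Not a recognisable barcode name: return it unchanged.
        return name
    prefix, digits = match.groups()
    return f"{prefix}{int(digits):0{width}d}"


def normalise_sheet(rows, column="barcode"):
    """Return copies of the sample sheet rows with the barcode column padded."""
    return [{**row, column: normalise_barcode(row[column])} for row in rows]
```

For example, `normalise_barcode("barcode1")` gives `"barcode01"`, while `"barcode12"` is returned unchanged.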
Workflow Execution - CLI Execution Profile
None
What happened?
When running the pipeline with custom primers and a malformed sample sheet that is missing barcode entries, the pipeline fails. The missing barcode entries seem to trigger the error at this point in the code:
https://github.com/epi2me-labs/wf-artic/blob/master/main.nf#L446C17-L446C103
Relevant log output
Checking fastq input.
executor > local (124)
[01/340b8e] process > validate_sample_sheet [100%] 1 of 1 ✔
[14/e71e19] process > fastcat (86) [100%] 87 of 87 ✔
[7d/bbfcfe] process > pipeline:getVersions [100%] 1 of 1 ✔
[25/e6be23] process > pipeline:getParams [100%] 1 of 1 ✔
[4e/944042] process > pipeline:lookup_medaka_variant_model (1) [100%] 1 of 1 ✔
[ec/189333] process > pipeline:runArtic (32) [ 0%] 0 of 87
[- ] process > pipeline:combineDepth -
[- ] process > pipeline:allConsensus -
[- ] process > pipeline:allVariants -
[68/736f76] process > pipeline:prep_nextclade [100%] 1 of 1 ✔
[- ] process > pipeline:nextclade -
[- ] process > pipeline:pangolin -
[- ] process > pipeline:report -
[- ] process > output -
WARN: Input directory 'barcode01' was found, but sample sheet 'data/tvr_barcodes.csv' has no such entry.
WARN: Input directory 'barcode06' was found, but sample sheet 'data/tvr_barcodes.csv' has no such entry.
WARN: Input directory 'barcode03' was found, but sample sheet 'data/tvr_barcodes.csv' has no such entry.
WARN: Input directory 'barcode05' was found, but sample sheet 'data/tvr_barcodes.csv' has no such entry.
WARN: Input directory 'barcode02' was found, but sample sheet 'data/tvr_barcodes.csv' has no such entry.
WARN: Input directory 'barcode08' was found, but sample sheet 'data/tvr_barcodes.csv' has no such entry.
WARN: Input directory 'barcode07' was found, but sample sheet 'data/tvr_barcodes.csv' has no such entry.
WARN: Input directory 'barcode04' was found, but sample sheet 'data/tvr_barcodes.csv' has no such entry.
WARN: Input directory 'barcode09' was found, but sample sheet 'data/tvr_barcodes.csv' has no such entry.
ERROR ~ Cannot invoke method resolve() on null object
-- Check script '~epi2me-labs/wf-artic/main.nf' at line: 446 or see '.nextflow.log' file for more details
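The warnings above suggest a mismatch between the barcode directories on disk and the names in the sample sheet. A pre-flight check along the following lines could surface that before launching the workflow. This is a sketch under assumptions: the input directory contains one `barcode*` sub-directory per sample, the sheet is a CSV with a `barcode` column, and `compare_barcodes` is a hypothetical helper rather than anything wf-artic provides.

```python
import csv
from pathlib import Path


def compare_barcodes(fastq_dir, sample_sheet):
    """Compare barcode directories on disk with the sample sheet entries.

    Returns two sorted lists: directories with no sheet entry, and
    sheet entries with no matching directory.
    """
    # Barcode sub-directories present in the input directory.
    found = {
        p.name
        for p in Path(fastq_dir).iterdir()
        if p.is_dir() and p.name.startswith("barcode")
    }
    # Barcode names listed in the sample sheet (assumed column name: "barcode").
    with open(sample_sheet, newline="") as handle:
        listed = {
            row["barcode"].strip()
            for row in csv.DictReader(handle)
            if row.get("barcode")
        }
    return sorted(found - listed), sorted(listed - found)
```

With directories `barcode01`, `barcode09`, `barcode12` and a sheet listing `barcode1`, `barcode9`, `barcode12`, the first list would contain `barcode01` and `barcode09`, and the second `barcode1` and `barcode9`.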
Application activity log entry
No response
Related to this, the example sample_sheet.csv has the wrong header:
https://github.com/epi2me-labs/wf-artic/blob/master/test_data/sample_sheet.csv
sample_id should be sample_name.
It's possible this is actually causing the above issue. I'm running the pipeline now to confirm.
Thanks for this - we'll take a look at both the issues you describe above.
Matt
I think this issue is not related to the barcode.csv but to fastcat failing. I will investigate and open an issue for fastcat.
Thanks for your support Matt.