nf-core/hlatyping

No cat_fastq? Samples with multiple runs are not handled


### Description of the bug

I just noticed that when the pipeline is run with data from multiple libraries/runs of the same sample, it produces results for each library/lane separately, which is somewhat odd behavior. As you may know, such scenarios are handled in other pipelines such as rnaseq, rnavar, and even sarek.

Looks like the cat_fastq feature is not implemented in this pipeline.

Any specific reason for not including this feature?

### Command used and terminal output

**Input sample_sheet.csv**


sample,fastq_1,fastq_2,seq_type
124127_265,/storage1/124127_265_S206_L002_R1_001.fastq.gz,/storage1/124127_265_S206_L002_R2_001.fastq.gz,rna
124127_265,/storage1/124127_265_S206_L003_R1_001.fastq.gz,/storage1/124127_265_S206_L003_R2_001.fastq.gz,rna
124127_265,/storage1/124127_265_S148_L004_R1_001.fastq.gz,/storage1/124127_265_S148_L004_R2_001.fastq.gz,rna
124127_265,/storage1/124127_265_S148_L003_R1_001.fastq.gz,/storage1/124127_265_S148_L003_R2_001.fastq.gz,rna

Output:

│   ├── 124127_265_T1
│   │   ├── 124127_265_T1_coverage_plot.pdf
│   │   └── 124127_265_T1_result.tsv
│   ├── 124127_265_T2
│   │   ├── 124127_265_T2_coverage_plot.pdf
│   │   └── 124127_265_T2_result.tsv
│   ├── 124127_265_T3
│   │   ├── 124127_265_T3_coverage_plot.pdf
│   │   └── 124127_265_T3_result.tsv
│   ├── 124127_265_T4
│   │   ├── 124127_265_T4_coverage_plot.pdf
│   │   └── 124127_265_T4_result.tsv


### Relevant files

_No response_

### System information

_No response_

Hi @praveenraj2018, thanks for reporting this. To me this is not a bug but the expected behaviour of the HLA genotyping pipeline. It is of course worth discussing whether that behaviour should be changed, also to make it more consistent with other nf-core pipelines.

Hi @christopher-mohr, when will this enhancement be available? I'm also interested in using cat_fastq for hlatyping. Thank you!
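
Until the pipeline handles this itself, one possible workaround is to merge the per-lane FASTQ files before running it. The sketch below is not the nf-core implementation. It is a hypothetical pre-processing helper (`merge_samplesheet` is an invented name) that assumes the four-column sample sheet shown in the issue (`sample,fastq_1,fastq_2,seq_type`), gzip-compressed inputs, and that all rows of a sample share one `seq_type`. It relies on the fact that gzip files can be appended byte for byte and still form a valid gzip file.

```python
import csv
import shutil
from collections import defaultdict
from pathlib import Path


def merge_samplesheet(samplesheet, outdir, merged_sheet):
    """Concatenate the FASTQ files of rows that share a sample ID and write
    a sample sheet with one row per sample.

    Hypothetical helper for illustration; not part of nf-core/hlatyping.
    """
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)

    # Group rows by sample ID, keeping the row order within each sample so
    # that R1 and R2 are concatenated in the same lane order.
    groups = defaultdict(list)
    with open(samplesheet, newline="") as handle:
        for row in csv.DictReader(handle):
            groups[row["sample"]].append(row)

    merged_rows = []
    for sample, rows in groups.items():
        seq_types = {row["seq_type"] for row in rows}
        if len(seq_types) != 1:
            raise ValueError(f"{sample}: mixed seq_type values {sorted(seq_types)}")

        merged = {"sample": sample, "seq_type": rows[0]["seq_type"]}
        for column, read in (("fastq_1", "1"), ("fastq_2", "2")):
            paths = [row[column] for row in rows if row.get(column)]
            if not paths:
                # Single-end data: leave the column empty.
                merged[column] = ""
                continue
            target = outdir / f"{sample}_merged_{read}.fastq.gz"
            # Append the compressed files as-is; no decompression needed.
            with open(target, "wb") as out:
                for path in paths:
                    with open(path, "rb") as src:
                        shutil.copyfileobj(src, out)
            merged[column] = str(target)
        merged_rows.append(merged)

    with open(merged_sheet, "w", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=["sample", "fastq_1", "fastq_2", "seq_type"]
        )
        writer.writeheader()
        writer.writerows(merged_rows)
    return merged_rows
```

For example, `merge_samplesheet("sample_sheet.csv", "merged_fastq", "sample_sheet.merged.csv")` would collapse the four `124127_265` rows above into a single row, and the merged sheet could then be given to the pipeline in place of the original. Whether merging lanes is appropriate for a given experiment is a judgment the user still has to make.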