UMCUGenetics/NextflowModules

Feedback

  1. Add a "how to update" section to the documentation that also explains how to update the submodules.

  2. The documentation refers to the HPC wiki for information about using singularity with slurm, but the wiki page does not mention singularity.

  3. Does not accept *.fq.gz as input files, only *.fastq.gz.

  4. Requires "R1" to be present in the fastq filename, even for single-end data.

  5. Part of the input fastq filename is no longer present in the output filename. The documentation should state the expected format of the input fastq filename.

  6. featureCounts does not assign any reads for single-end data. All reads have the status "Unassigned_Read_Type". This might be because the strandedness of my data does not match the default setting. I will test this and update with the results.

Hi Yano, thanks for the feedback. I will try to answer each point below.

  1. Updating submodules can be done according to the coding guidelines in the readme: https://github.com/UMCUGenetics/NextflowModules#contributing

  2. The HPC wiki explains singularity on this page: https://wiki.bioinformatics.umcutrecht.nl/bin/view/HPC/SoftwareInstallation#Making_software_available_us_AN2. Singularity is independent of slurm or sge.

  3. Correct, the utils/fastq.nf module only supports fastq.gz for now; feel free to submit a pull request. I think we can use something like f*q.gz to support both file types.

  4. extractFastqFromDir in utils/fastq.nf supports single-end data.

  5. This depends on which tools and workflow are used to process your data. But we could indeed explain what the processes in utils/fastq.nf need in terms of input file names.

  6. This seems to be workflow-specific. Please open an issue in the repository of the workflow.
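Regarding point 3, here is a rough sketch of how a pattern like f*q.gz would behave. It uses Python's `fnmatch` only as a stand-in for shell-style globbing; the full pattern `*.f*q.gz` and the sample filenames are assumptions for illustration, not what utils/fastq.nf currently uses, and Nextflow's own glob matching may differ in details.

```python
from fnmatch import fnmatchcase

# Assumed full form of the suggested "f*q.gz" glob (hypothetical, for illustration only).
PATTERN = "*.f*q.gz"

# Hypothetical filenames covering both extensions plus two edge cases.
names = [
    "sample_R1.fastq.gz",  # currently supported extension
    "sample_R1.fq.gz",     # extension requested in the issue
    "sample_R1.fasta.gz",  # should not match: does not end in "q.gz"
    "sample_R1.fooq.gz",   # matches too, because "*" accepts any characters
]

# Case-sensitive glob match of each name against the pattern.
matches = {name: fnmatchcase(name, PATTERN) for name in names}

for name, matched in matches.items():
    print(f"{name}: {'match' if matched else 'no match'}")
```

The pattern picks up both .fastq.gz and .fq.gz, but it is loose: anything of the form .f<something>q.gz also matches, so a stricter pattern might be preferable if such files can appear in the input directory.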

I am closing the issue now, feel free to reopen it if you have more questions.