OpenOmics/genome-seek

Calling only -> no alignment

Hi,

I wanted to use your package, but I do not want to realign my data. Would it be possible to add a shortcut that skips alignment?

Or is there something special done during alignment that the rest of the pipeline builds on?

If so, I had the impression that my multi-lane FASTQ files were not accepted properly.

Could you add support for file names matching *L{X}*R{1,2}.fastq.gz?

Cheers!

Hey @dansteiert,

The pipeline currently accepts FastQ files as input. This is to ensure the data has been properly aligned/recalibrated against the supported reference genome. This also acts as a safeguard to ensure it is compatible with any downstream steps.

With that being said, trimming and alignment will add some extra processing time but it's worth it. We cannot guarantee the accuracy of the results if you start with your own bam files.

If your current fastq files are split by lane (due to multiplexing across multiple flowcell lanes), please merge them prior to running the pipeline.
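For reference, here is a minimal sketch of one way to do that merge. It assumes Illumina-style file names such as `<sample>_L001_R1_001.fastq.gz`. The helper names (`group_by_sample_and_read`, `merge_lanes`) and the `<sample>.R1.fastq.gz` output naming are hypothetical, so please check the pipeline documentation for the names it actually expects. It relies on the fact that gzip files can be concatenated byte-for-byte into a valid multi-member gzip file, and it merges R1 and R2 in the same lane order so that read pairs stay in sync.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed naming: <sample>_L<lane>_R<1|2>[_<chunk>].fastq.gz
LANE_PATTERN = re.compile(
    r"^(?P<sample>.+?)_L(?P<lane>\d+)_(?P<read>R[12])(?:_\d+)?\.fastq\.gz$"
)


def group_by_sample_and_read(names):
    """Map (sample, read) to its files, ordered by lane number."""
    groups = defaultdict(list)
    for name in names:
        match = LANE_PATTERN.match(Path(name).name)
        if match is None:
            continue  # skip files that do not follow the assumed naming
        key = (match["sample"], match["read"])
        groups[key].append((int(match["lane"]), name))
    return {key: [n for _, n in sorted(files)] for key, files in groups.items()}


def merge_lanes(input_dir, output_dir):
    """Concatenate per-lane files into one file per sample and read."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    names = [str(p) for p in input_dir.glob("*.fastq.gz")]
    merged = []
    for (sample, read), files in sorted(group_by_sample_and_read(names).items()):
        # Hypothetical output naming; adjust to what the pipeline expects.
        target = output_dir / f"{sample}.{read}.fastq.gz"
        with open(target, "wb") as out:
            for name in files:
                with open(name, "rb") as src:
                    # Raw byte copy: concatenated gzip members stay valid gzip.
                    shutil.copyfileobj(src, out)
        merged.append(target)
    return merged
```

Calling `merge_lanes("raw_fastq", "merged_fastq")` (both directory names are placeholders) would write one merged file per sample and read into `merged_fastq`, leaving the originals untouched.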

I hope this helps.

Best regards,
@skchronicles

Hi @skchronicles,
thank you for your input.

I understand the wish to keep everything within a single workflow, so that there are no confounding factors in the results.

Nevertheless, I think it might be useful for some users to be able to proceed with their own BAM files.
You could simply ask them to preprocess their data along the lines of a GATK Best Practices workflow.
With both new and old technologies, there may always be extra steps that need to be performed, e.g. handling UMI-tagged reads.

Just something to consider ;)
Cheers!