mbhall88/rasusa

Compress output when requested

mbhall88 opened this issue · 5 comments

Hey @natir.

Yes, I only just switched the parsing to needletail, so I probably won't get around to switching it again anytime soon. Also, compile time isn't a major concern for me, especially since I distribute pre-compiled binaries and a bunch of other methods that mean users don't need to compile the project. I'm happy to review a PR with updated benchmarks, though.

Regarding niffler, you've made me realise somewhere along the line I have lost the compressed output functionality of this tool... Originally rasusa would infer the desired output compression from the path. I'll have to fix that.
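For anyone following along, inferring the output compression from the path could look roughly like the sketch below. It is only an illustration using the standard library: the `Format` enum and `format_from_path` function are hypothetical names, not rasusa's or niffler's actual API, and the set of recognised extensions is an assumption.

```rust
use std::path::Path;

/// Compression formats recognised by this sketch.
/// Hypothetical type for illustration; not taken from rasusa or niffler.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Format {
    Gzip,
    Bzip2,
    Xz,
    None,
}

/// Guess the desired output compression from the final file extension.
/// The comparison is case-insensitive; unknown or missing extensions
/// fall back to uncompressed output.
pub fn format_from_path(path: &Path) -> Format {
    match path
        .extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| ext.to_ascii_lowercase())
        .as_deref()
    {
        Some("gz") => Format::Gzip,
        Some("bz2") => Format::Bzip2,
        Some("xz") => Format::Xz,
        _ => Format::None,
    }
}
```

With this sketch, a path such as sample_out_R1.fastq.gz maps to `Format::Gzip`, while a plain .fastq path maps to `Format::None`.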

Originally posted by @mbhall88 in #25 (comment)

It might also be a good idea to add a flag that lets the user set the compression level.
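As a rough sketch of what validating such a flag's value might involve: the function name `parse_level` and the accepted range of 1 to 9 (the usual gzip-style range) are assumptions for illustration, not a description of how rasusa handles it.

```rust
/// Parse a user-supplied compression level.
/// Assumption for this sketch: valid levels are 1 (fastest) through 9 (smallest).
pub fn parse_level(s: &str) -> Result<u8, String> {
    // Reject anything that is not a small non-negative integer.
    let level: u8 = s
        .trim()
        .parse()
        .map_err(|_| format!("'{}' is not a valid compression level", s))?;

    // Reject integers outside the assumed 1-9 range.
    if (1..=9).contains(&level) {
        Ok(level)
    } else {
        Err(format!("compression level {} is outside the range 1-9", level))
    }
}
```

Returning a `Result` with a message keeps the check usable as a value parser in whichever CLI library is in use, so a bad level is reported before any reads are processed.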

natir commented

I think keeping the same compression format as the input is a good default behavior for the user, but adding an option to choose another one is important too.
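A minimal sketch of that default, assuming the input format is detected from its leading magic bytes and an explicit user choice takes priority. The names `Compression`, `sniff`, and `choose_output` are hypothetical and are not niffler's or rasusa's API.

```rust
/// Compression formats recognised by this sketch (hypothetical type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Compression {
    Gzip,
    Bzip2,
    Xz,
    None,
}

/// Detect the compression format from the first bytes of a file.
pub fn sniff(header: &[u8]) -> Compression {
    if header.starts_with(&[0x1f, 0x8b]) {
        // gzip magic number (bgzip output starts the same way)
        Compression::Gzip
    } else if header.starts_with(b"BZh") {
        // bzip2 magic number
        Compression::Bzip2
    } else if header.starts_with(&[0xfd, b'7', b'z', b'X', b'Z', 0x00]) {
        // xz magic number
        Compression::Xz
    } else {
        Compression::None
    }
}

/// Use the user's explicit choice if there is one; otherwise mirror the input.
pub fn choose_output(user_choice: Option<Compression>, input_header: &[u8]) -> Compression {
    user_choice.unwrap_or_else(|| sniff(input_header))
}
```

Here `choose_output(None, header)` keeps the input's format, while passing `Some(...)` overrides it, which covers both the default and the option described above.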

Hello @mbhall88, thanks for developing and maintaining rasusa.

Sorry for stepping in, I am not sure whether this is the right thread or my issue is related to what is being discussed here.
According to the documentation, the output is automatically compressed if the output path includes a .gz extension. However, the resulting paired-end fastq files are uncompressed. Am I missing an argument or flag?

This is the code I am using:

audald/software/miniconda3/bin/rasusa -i sample_R1.fastq.gz sample_R2.fastq.gz --coverage 0.25 --genome-size 2715853792b -o sample_out_R1.fastq.gz sample_out_R2.fastq.gz -s 189

Thanks in advance!

Hey @Audald, yes, this is the exact problem this issue describes. I don't know how I removed this functionality. I will get a fix out ASAP, sorry.

Thanks for your prompt answer, @mbhall88. I can bypass the issue by adding a bgzip step in my pipeline. Therefore, there is no urgency from my end. Best regards and thanks again for the great work.

This should now be fixed in version 0.5.0. I've also added some new compression CLI options.