cancerit/PCAP-core

bam_stats - passthrough (CRAM)

keiranmraine opened this issue · 2 comments

If CRAM is set as the output format, we currently have to run bam_stats as a separate process after the CRAM file is written to disk (scramble is used because it makes tuning CRAM compression easy). For BAM files, bam_stats is able to read from tee'ed output.

This isn't possible for CRAM because of the combination of commands used. However, if bam_stats could pass its input data directly to stdout when an output file for the BAS data is provided and a passthrough flag is enabled, we could insert it into the pipeline and save a disk read.
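To make the passthrough idea concrete, here is a minimal Python sketch of a filter that forwards its input unchanged while collecting counts on the side. It is an illustration only, not bam_stats itself: the function names, the counters, and the stats file format are hypothetical, and it assumes the stream is uncompressed SAM text rather than BAM or CRAM.

```python
import io
import sys


def passthrough_with_stats(src, dst):
    """Copy SAM text from src to dst unchanged, counting lines on the way.

    src and dst are binary file-like objects. The counters are illustrative
    and are not the BAS metrics that bam_stats produces.
    """
    stats = {"header_lines": 0, "records": 0}
    for line in src:
        dst.write(line)  # forward the data untouched
        if line.startswith(b"@"):
            # SAM header lines start with '@'; read names never do
            stats["header_lines"] += 1
        elif line.strip():
            stats["records"] += 1
    return stats


def main(argv):
    """Hypothetical entry point: read stdin, write stdout, save counts to argv[1]."""
    stats = passthrough_with_stats(sys.stdin.buffer, sys.stdout.buffer)
    with open(argv[1], "w") as fh:
        for key, value in stats.items():
            fh.write(f"{key}\t{value}\n")
```

Wired into a script that calls `main(sys.argv)`, a filter like this would sit between the aligner output and the CRAM writer, so the statistics are gathered from the stream that is already flowing rather than from a second read of the file on disk.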

With CRAM, reading and processing from disk is pretty heavyweight compared to having bam_stats read from an uncompressed stream.

Possibly worth adding a thread pool for decompression threads as part of this.

... looks like it's possible to work around this with the example provided by Rob:

samtools/samtools#774 (comment)

This has been handled for bwa_mem.pl (which is what this issue was referring to).