Adding bgzip integration?
jelber2 opened this issue · 3 comments
Hi,
Not really an issue, more of a request?
`bgzip` can decompress in parallel (assuming the .gz file was compressed with `bgzip`). I don't know of many projects that take advantage of this, but first checking whether a .gz file was compressed with `bgzip`, then calling the `bgzip` binary if present, or perhaps a library somehow included with Seq (see, for example, something similar done with Python: https://pypi.org/project/bgzip/), might be very interesting. BBTools/BBMap (https://sourceforge.net/projects/bbmap/) can take advantage of systems where `bgzip` is installed, and I have seen quite a big performance increase when using `bgzip` on bgzipped files.
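To make the "check first, then call the binary" idea concrete, here is a rough Python sketch (not Seq, and not anything that exists in the project). It assumes the BGZF layout from the SAM/BAM specification: a gzip member with the FEXTRA flag set and a `BC` extra subfield of length 2. The function names `is_bgzf` and `open_decompressed`, and the default of 4 threads, are made up for illustration.

```python
import gzip
import shutil
import struct
import subprocess


def is_bgzf(path):
    """Return True if the file starts with a BGZF (bgzip) block header."""
    with open(path, "rb") as fh:
        header = fh.read(12)
        if len(header) < 12:
            return False
        # gzip magic bytes (0x1f 0x8b) and the DEFLATE method (8).
        if (header[0], header[1], header[2]) != (0x1F, 0x8B, 8):
            return False
        # FLG bit 2 (FEXTRA) must be set for an extra field to exist.
        if not header[3] & 0x04:
            return False
        (xlen,) = struct.unpack("<H", header[10:12])
        extra = fh.read(xlen)
    # Walk the extra subfields looking for 'B','C' with a 2-byte payload.
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2, slen = struct.unpack_from("<BBH", extra, pos)
        if si1 == ord("B") and si2 == ord("C") and slen == 2:
            return True
        pos += 4 + slen
    return False


def open_decompressed(path, threads=4):
    """Hypothetical helper: stream decompressed bytes, preferring bgzip."""
    if is_bgzf(path) and shutil.which("bgzip"):
        # -d decompress, -c write to stdout, -@ number of threads.
        proc = subprocess.Popen(
            ["bgzip", "-dc", "-@", str(threads), path],
            stdout=subprocess.PIPE,
        )
        return proc.stdout
    # Fall back to single-threaded gzip for ordinary .gz files.
    return gzip.open(path, "rb")
```

Only the first block is inspected, so a file that merely begins with a BGZF block would also pass; a real integration would probably want stricter checks and proper cleanup of the child process.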
Thanks for the suggestion!
In case you need it urgently, you can probably dynamically load the C bgzip library via `cimport` and then use the underlying C API.
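For a feel of what "load the C library and call its API" looks like, here is the same idea sketched in Python with `ctypes` rather than Seq's `cimport`. It assumes the C library in question is htslib (which ships the BGZF routines behind `bgzip`) and that it is installed as a shared library. The helper names `load_htslib` and `read_bgzf`, the thread count, and the chunk size are illustrative assumptions; `256` is just the value conventionally passed as the third argument of `bgzf_mt`.

```python
import ctypes
import ctypes.util


def load_htslib():
    """Return a ctypes handle to htslib with BGZF prototypes set, or None."""
    name = ctypes.util.find_library("hts")
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name)
    except OSError:
        return None
    # BGZF *bgzf_open(const char *path, const char *mode)
    lib.bgzf_open.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
    lib.bgzf_open.restype = ctypes.c_void_p
    # int bgzf_mt(BGZF *fp, int n_threads, int n_sub_blks)
    lib.bgzf_mt.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_int]
    lib.bgzf_mt.restype = ctypes.c_int
    # ssize_t bgzf_read(BGZF *fp, void *data, size_t length)
    lib.bgzf_read.argtypes = [ctypes.c_void_p, ctypes.c_void_p, ctypes.c_size_t]
    lib.bgzf_read.restype = ctypes.c_ssize_t
    # int bgzf_close(BGZF *fp)
    lib.bgzf_close.argtypes = [ctypes.c_void_p]
    lib.bgzf_close.restype = ctypes.c_int
    return lib


def read_bgzf(lib, path, threads=4, chunk_size=1 << 16):
    """Decompress a whole BGZF file into bytes using htslib worker threads."""
    fp = lib.bgzf_open(path.encode(), b"r")
    if not fp:
        raise OSError(f"bgzf_open failed for {path}")
    try:
        lib.bgzf_mt(fp, threads, 256)  # enable multi-threaded decompression
        buf = ctypes.create_string_buffer(chunk_size)
        chunks = []
        while True:
            n = lib.bgzf_read(fp, buf, chunk_size)
            if n < 0:
                raise OSError(f"bgzf_read failed for {path}")
            if n == 0:  # end of file
                break
            chunks.append(buf.raw[:n])
        return b"".join(chunks)
    finally:
        lib.bgzf_close(fp)
```

If `load_htslib()` returns `None`, the library is not available and the caller would fall back to the regular gzip path.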
Thanks! I might give that a try.
FYI, https://github.com/seq-lang/seq/blob/master/stdlib/core/file.seq contains a standard file / gzip implementation. We also use the underlying C API directly there.