muellan/metacache

Support for gzipped reads

donovan-h-parks opened this issue · 12 comments

Hi.

Are there plans to support reads in compressed gzipped format (i.e. my_reads.fq.gz)? This would be a major help for incorporating MetaCache into workflows.

Cheers,
Donovan

Yes, it's on my list for the next major version, but we are currently ironing out some bugs and also working on some improvements. So, it may take some time.

Hi. Thank you for the quick response. In my testing, MetaCache is certainly among the best performing classifiers available. Are any of the upcoming bug fixes critical?

There's currently a bug that was introduced in the last version. It leads to unnecessarily high memory consumption during database builds. The fix is already implemented and will be released shortly.
There are some other minor things, nothing that would affect the classification results.

Thanks. I'm currently using v0.9.0 so perhaps have avoided these issues.

Yes, gzip fastq compatibility would be very useful for me as well.

Just to let you all know: the next version of MetaCache supports reading gzipped sequence files.
Since it also contains a large portion of new code for accelerating builds and queries, it might take a few weeks until we release it.

Nice, thanks!

Is there any ETA for the new release with gzip support?

It would be important for incorporating MetaCache into my workflows as well.

We currently have a paper under review. We will make the code of the latest version, which also supports reading gzipped files (and many more capabilities), available as soon as the paper is accepted (fingers crossed). Unfortunately, we don't have the time to back-port the reading of compressed files to an older version at the moment. So it will likely take a few weeks until we can make the newest version public.

No problem, good to know the paper is under review! Good luck, and looking forward to it!

Reading gzipped files is now supported in the latest release!

Woohoo thank you!