EMBL-PKU/BASALT

Question: Any plans to make the tool available at Biocontainers?


This would make it easier to integrate into pipelines, e.g. nf-core.

https://biocontainers.pro/ https://github.com/BioContainers

Edit: When following the installation instructions, I notice a few other things:

  1. The fixed location for the trained model (~/.cache/BASALT) will be difficult to work with. Wouldn't it be better to make this configurable?
  2. It would also be good if the locations of other databases, e.g. CheckM, could be specified with a parameter.
  3. BASALT is a pipeline in itself. It feels like this could lead to programs, e.g. binners but also CheckM and maybe others, being run more than once. Is there a convenient way around this?
  4. Does BASALT work only with unzipped input files?
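To illustrate points 1 and 2, here is a minimal sketch of how a configurable location could be resolved. It is not BASALT's code: the function name `resolve_model_dir` and the environment variable `BASALT_MODEL_DIR` are hypothetical, and only the default path `~/.cache/BASALT` comes from the installation instructions. An explicit parameter wins, then the environment variable, then the current default.

```python
import os
from pathlib import Path

# Default location taken from the installation instructions.
DEFAULT_MODEL_DIR = Path.home() / ".cache" / "BASALT"


def resolve_model_dir(cli_value=None, env=None):
    """Pick the model directory: explicit value, then env var, then default.

    `cli_value` stands in for a hypothetical command-line parameter;
    `BASALT_MODEL_DIR` is a hypothetical environment variable name.
    """
    env = os.environ if env is None else env
    if cli_value:
        return Path(cli_value).expanduser()
    env_value = env.get("BASALT_MODEL_DIR")
    if env_value:
        return Path(env_value).expanduser()
    return DEFAULT_MODEL_DIR
```

The same pattern would work for other databases such as CheckM, with one parameter and one variable per database.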

Thanks very much for your suggestions! We will definitely be happy to make it available at Biocontainers. We will try to integrate it in the next version.

As per your questions/suggestions:

  1. We have also noticed that this could be an issue for users to handle. Currently, running with containers such as Singularity can address this issue, but we will try to make this configurable in the next version.
  2. Sure, we can add this option.
  3. Frankly, this is one of our main tasks to optimise, especially CheckM, as it takes a long time to run.
  4. Yes, BASALT works with both zipped and unzipped files.
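For readers wiring this into their own pipeline steps, a minimal sketch of reading either kind of input is shown here. It assumes "zipped" means gzip-compressed and is not BASALT's own implementation; the helper name `open_text` is hypothetical. It checks the two-byte gzip magic number instead of trusting the file extension.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading as text."""
    # Gzip streams always start with the bytes 0x1f 0x8b.
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "rt")
```

Used as `with open_text("reads.fastq.gz") as fh: ...`, the calling code does not need to know whether the file was compressed (the file name here is only an example).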

Thanks again for your suggestions. Your experience is very important to us. We will try our best to make BASALT better and more user-friendly.

Cheers,
BASALT team

Thanks for your reply!

Regarding CheckM taking long to run -- CheckM2 is much faster and, according to their paper, better!