elsasserlab/minute

Optimize minute environment setup

cnluzon opened this issue · 3 comments

New users seem to run into a variety of problems with conda when setting up minute. I am trying to think of the best way to circumvent this. One possible workaround is to provide a Singularity container with a valid environment that minute can run in (either conda or "native" on the container); however, this is somewhat difficult to deploy and perhaps not so user-friendly in the end either.

Alternatively, a conda-lock version of the environment could be provided as a fallback for when a recent package update breaks the conda setup.

Do you have some examples of these problems that users have? If there are problems with the dependencies in the recipe, then I’d suggest trying to fix those (presumably by pinning them to single versions). If installation through Conda doesn’t work as it should, this needs to be taken care of anyway unless that installation method becomes deprecated.

An alternative to Singularity (now renamed to Apptainer) would be using something like PyInstaller to distribute self-contained executables.

Yes, it's usually some issue related to the conda environment, although it does not always originate on our side. On our side, as you mention:

  • Resolving the environment to a different set of versions that somehow does not work.

I wonder whether one could maintain a fully pinned conda environment in parallel with an up-to-date one with more recent versions, so that if the latter does not work, one can always fall back on the former.
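A rough sketch of what such a fallback could look like, in case it helps the discussion. The file names `environment.yml` and `environment-pinned.yml` and the helper names are hypothetical and not taken from the repository; the only real command used is `conda env create` with its `--name` and `--file` options.

```python
import subprocess
from typing import Callable, Optional, Sequence


def conda_env_create(env_file: str, env_name: str) -> bool:
    """Try to create a conda environment from a file; return True on success."""
    result = subprocess.run(
        ["conda", "env", "create", "--name", env_name, "--file", env_file],
        check=False,  # a failed solve is an expected outcome here, not an error
    )
    return result.returncode == 0


def create_with_fallback(
    env_files: Sequence[str],
    env_name: str = "minute",
    create: Callable[[str, str], bool] = conda_env_create,
) -> Optional[str]:
    """Try each environment file in order; return the first one that works.

    Returns None if none of the files could be used.
    """
    for env_file in env_files:
        if create(env_file, env_name):
            return env_file
    return None
```

Calling `create_with_fallback(["environment.yml", "environment-pinned.yml"])` would try the up-to-date file first and the pinned one only if that fails. One caveat: if the first attempt fails partway through installation rather than during solving, a partial environment may be left behind and would probably need to be removed before retrying.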

The Singularity/Apptainer alternative is also quite crafty and I am uncertain that it would be of help for users with less experience. I will look into the PyInstaller option, in case it might be helpful for a broader audience.

Other issues:

  • Macs running recent versions of macOS on M2 processors seem to have trouble detecting the processor architecture automatically - https://stackoverflow.com/questions/76879889/conda-package-not-found-how-to-install-conda-packages-on-apple-m1-m2-chips-whi (I have experienced this myself and it is a bit tricky to get working, but I don't know how to improve the situation from our side, other than documenting the problem and its workaround).
  • Some people have experienced an odd situation that I was not able to reproduce: they seem to create the conda environment successfully, but when running minute, the cutadapt version used was their native one (not the one installed in the conda environment), which was older, and that made the workflow crash. I don't know whether this comes down to not activating the environment properly; they seemed to fix it, but the interesting thing is that the same thing happened to two different users.
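To help diagnose the cutadapt case, something like the following could tell a user whether a tool found on `PATH` actually lives inside the active conda environment. This is only a sketch: `check_tool_location` and its status strings are made-up names, and it assumes a POSIX-like system where `conda activate` has set `CONDA_PREFIX`.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def check_tool_location(
    tool: str,
    env_prefix: Optional[str] = None,
    path: Optional[str] = None,
) -> str:
    """Report where `tool` resolves relative to a conda environment.

    Returns one of:
      "no-env"      - no environment prefix given and CONDA_PREFIX is unset
      "missing"     - the tool was not found on the search path
      "ok"          - the tool resolves to a file inside the environment
      "outside-env" - the tool resolves to a file outside the environment
    """
    prefix = env_prefix if env_prefix is not None else os.environ.get("CONDA_PREFIX")
    if not prefix:
        return "no-env"

    # path=None makes shutil.which use the current PATH
    found = shutil.which(tool, path=path)
    if found is None:
        return "missing"

    prefix_path = Path(os.path.abspath(prefix))
    found_path = Path(os.path.abspath(found))
    return "ok" if prefix_path in found_path.parents else "outside-env"
```

Running `check_tool_location("cutadapt")` inside the activated environment and getting `"outside-env"` would point to a `PATH` ordering problem. Note that this only checks which executable is found; it would not detect Python packages being shadowed by a user-level install.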
> Some people have experienced an odd situation that I was not able to reproduce: where they seem to create the conda environment successfully, but upon running minute, somehow cutadapt version was their native (not the one installed in the conda environment), which was an older one, and that made the workflow crash.

I’ve seen this before and guess this is because the user installed Cutadapt with pip install --user cutadapt. Even within an activated Conda environment, packages installed this way will take precedence over the packages in the Conda environment. There are many Conda bug reports about this, see for example conda/conda#8770. Until the Conda devs can be convinced to change this behavior, the workaround is to export PYTHONNOUSERSITE=1. Maybe an idea would be to add os.environ['PYTHONNOUSERSITE'] = "1" to src/minute/cli/run.py.
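A minimal sketch of the `PYTHONNOUSERSITE` idea, under some assumptions: the helper names are hypothetical, and how minute actually launches its tools is not shown here. One thing worth keeping in mind is that setting the variable inside an already running Python process does not change that process's own `sys.path`; it only affects interpreters started afterwards, such as tools launched as child processes.

```python
import os
import subprocess
import sys
from typing import Dict, Mapping, Optional


def env_without_user_site(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of an environment mapping with PYTHONNOUSERSITE=1 set.

    The current process environment is used when `base` is None and is
    left unmodified.
    """
    env = dict(os.environ if base is None else base)
    env["PYTHONNOUSERSITE"] = "1"
    return env


def user_site_enabled_in_child(env: Mapping[str, str]) -> str:
    """Ask a freshly started interpreter whether it enables the user site dir."""
    result = subprocess.run(
        [sys.executable, "-c", "import site; print(site.ENABLE_USER_SITE)"],
        env=dict(env),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With this, `user_site_enabled_in_child(env_without_user_site())` reports whether packages from `pip install --user` would still be picked up by a child interpreter. Assigning `os.environ['PYTHONNOUSERSITE'] = "1"` early in the command-line entry point, as suggested above, should have the same effect on any subprocesses started later, since they inherit the modified environment.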