bcbio/bcbio_docker

Current docker image not functional

gabeng opened this issue · 10 comments

The current docker image 1.2.4-9722272 is not functional, i.e. bcbio cannot be executed; see also bcbio/bcbio-nextgen#3240.
It would be useful to incorporate a small smoke test (does bcbio_nextgen.py --version return an error?) into the docker build job.
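
A minimal sketch of what such a smoke test could look like, assuming it is run inside the freshly built image as a final build step. The helper name `smoke_test` and the timeout are illustrative assumptions, not part of bcbio or of the existing build job; the only thing taken from this issue is the command `bcbio_nextgen.py --version`.

```python
import subprocess
import sys


def smoke_test(cmd, timeout=120):
    """Return True if `cmd` runs and exits with status 0, False otherwise."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Covers a missing executable as well as a command that hangs.
        print(f"smoke test could not run {cmd!r}: {exc}", file=sys.stderr)
        return False
    if result.returncode != 0:
        # Surface the error output so the build log shows why it failed.
        print(result.stderr.decode(errors="replace"), file=sys.stderr)
    return result.returncode == 0


# Hypothetical use in a build script, run inside the image:
#   ok = smoke_test(["bcbio_nextgen.py", "--version"])
#   sys.exit(0 if ok else 1)
```

A non-zero exit status from the script would then fail the build before a broken image gets published.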

Looks like it's the same issue as bcbio/bcbio-nextgen#3318:

root@07f9269088d4:/# samtools --version
samtools: error while loading shared libraries: libcrypto.so.1.0.0: cannot open shared object file: No such file or directory

samtools is pinned to 1.9 in https://github.com/chapmanb/cloudbiolinux/blob/master/contrib/flavor/ngs_pipeline_minimal/packages-conda.yaml#L136 but not in https://github.com/bcbio/bcbio_docker/blob/master/packages/bcbio-vc.yaml#L54 or in https://github.com/bcbio/bcbio_docker/blob/master/packages/bcbio-rnaseq.yaml#L25
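
To catch this kind of drift between the package lists, something like the following could flag entries that carry no version. This is only a sketch: it assumes the packages files list one plain item per line in the form `- name` or `- name=version`, and the helper `unpinned` is a made-up name. Anything more complex than that simple form is ignored by the pattern.

```python
import re

# Matches a simple YAML list item such as "- samtools" or "  - samtools=1.9".
# Assumption: one package per line, optionally followed by a version spec.
ITEM = re.compile(r"^\s*-\s*([A-Za-z0-9_.\-]+)\s*([<>=!]=?.*)?$")


def unpinned(yaml_text, packages):
    """Return the names from `packages` that are listed without a version spec."""
    found = []
    for line in yaml_text.splitlines():
        line = line.split("#", 1)[0].rstrip()  # drop trailing comments
        match = ITEM.match(line)
        if match and match.group(1) in packages and not match.group(2):
            found.append(match.group(1))
    return found
```

For example, calling `unpinned(open("packages/bcbio-vc.yaml").read(), {"samtools"})` from a checkout of this repository would return `["samtools"]` if the entry has no version attached, under the format assumption above.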

roryk commented

Thanks for figuring this out! Can you pin them to 1.9?

The source of the other error is a bit more difficult to track down:

File "/usr/local/bin/bcbio_nextgen.py", line 34, in <module>
    from bcbio.setpath import prepend_bcbiopath
ModuleNotFoundError: No module named 'bcbio'

@hackdna,

This is really tricky. I can probably not be of much help here, except to offer some observations:

I can do without atropos and deepvariant at the moment and would be happy to test the other tools under python 3.7.
How can an installation process that first installs bcbio under a python 3.7 path and subsequently removes python 3.7 work at all? It must have worked for you at some point, but I cannot get it to work in a docker image.
I am confused because all this should mean that nobody would be able to install or update bcbio at this moment. But somehow it is limited to the docker image?

I can offer to set up a build job that generates the bcbio-vc docker image and performs a simple smoke test on a weekly basis to provide you with regular feedback, if that helps. I am afraid this is something that can break again at any time.

Regards,

Ben

Thank you for the additional information. The bcbio installation process is definitely very complex. The tests that run as part of the bcbio CI actually do use Docker images (https://github.com/bcbio/bcbio-nextgen/blob/master/.travis.yml#L44), so that provides a smoke test (https://travis-ci.org/github/bcbio/bcbio-nextgen/builds) but it might make sense to add some testing to the Docker image CI too. One puzzling thing is that the bcbio installation works correctly outside Docker, so I hope this will help with troubleshooting of this issue.

Thanks for the link to the CI. It looks like the last successful tests (#4078) were run with the previous image from 5 months ago. In that docker image, bcbio was installed with python 3.6. Is there a way to tie bcbio-base to python 3.6 during installation?

When I perform a bcbio installation outside docker without --tools, bcbio is installed with python 3.7:

sudo python bcbio_nextgen_install.py /usr/local/share/bcbio-nextgen --isolate --minimize-disk --nodata -u development

If you then try to perform tools installation, this will lead to a broken installation because everything is installed with python 3.6. This two-step process is exactly what happens in the dockerfile.
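
One way to see this mismatch in an existing installation is to check which `lib/pythonX.Y` directory actually holds the bcbio package. The sketch below assumes bcbio ends up as a `bcbio` directory inside `site-packages` (it may not if it is installed as an egg or in development mode), and `bcbio_site_packages` is a hypothetical helper name.

```python
from pathlib import Path


def bcbio_site_packages(prefix):
    """Map each lib/python3.* directory under `prefix` to whether it holds bcbio."""
    lib = Path(prefix, "lib")
    if not lib.is_dir():
        return {}
    result = {}
    for libdir in sorted(lib.glob("python3.*")):
        if libdir.is_dir():
            # Assumption: an installed bcbio shows up as site-packages/bcbio.
            result[libdir.name] = (libdir / "site-packages" / "bcbio").is_dir()
    return result
```

Run against the anaconda prefix of the installation (for the paths in this thread, `/usr/local/share/bcbio-nextgen/anaconda`), a result where bcbio only shows up under `python3.7` while the environment's interpreter is python 3.6 would be consistent with the `ModuleNotFoundError` reported above.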

I've tried building bcbio-base image with https://repo.anaconda.com/miniconda/Miniconda3-4.4.10-Linux-x86_64.sh which is the latest version of Miniconda based on Python 3.6 and the build failed with this error:

ERROR conda.core.link:_execute(481): An error occurred while installing package 'conda-forge::idna-2.10-pyh9f0ad1d_0'.
FileNotFoundError(2, "No such file or directory: '/usr/local/share/bcbio-nextgen/anaconda/bin/python3.9'")

Since Python 3.6 is the oldest supported version (end of support: 2021-12-23), it looks like the correct fix is to update Cloudbiolinux to Python 3.7 (https://github.com/chapmanb/cloudbiolinux/blob/030234475142c9517652774779d17e289929ce88/cloudbio/package/conda.py#L13).

roryk commented

Sorry for taking so long fixing this, I think I've got it functional now. There were a couple of problems. 1) The docker images blew up to 2x the size, due to some changes in bioconda dependencies. I did some cleaning out of unnecessary files and for the time being disabled deepvariant from the bcbio-vc container to get it down to a reasonable size. I can click it back on if anyone is using it though.

I added tests to do as you suggested @gabeng, just see if bcbio works at all. This should catch problems like this one in the future.

Thx @roryk, I'll check it out right away!

I created a docker image from the current bcbio_docker, but the issue persists. Were you able to build a functional docker image?