imsb-uke/scGAN

Dockerfile installs different versions of Anndata and Numpy, and a non-working TensorFlow

AJnsm opened this issue · 5 comments

AJnsm commented

I cloned the repo today using git clone https://github.com/imsb-uke/scGAN.

I want to run an interactive session using the provided Docker container. I copy the files main.py, parameters.json, requirements.txt, __init__.py, and the directories estimators/ and preprocessing/ into the dockerfile/ directory, so everything is in the build context there. I also add the pbmc data in a separate folder, and add the following to the Dockerfile:

COPY pbmc_data scGAN/pbmc_data/
COPY preprocessing scGAN/preprocessing/
COPY estimators scGAN/estimators/
COPY main.py scGAN/main.py
COPY __init__.py scGAN/__init__.py
COPY parameters.json scGAN/parameters.json

I cd into dockerfile/ and run docker build -t scgan_container ., then start an interactive session. Inside the container, pip list reports numpy 1.15.0 and anndata 0.6.18, which differ from the requirements listed in the readme.
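Roughly, the steps I follow are the ones below (the docker run flags are just what I use to get an interactive shell; they are not prescribed by the repo):

cd dockerfile
docker build -t scgan_container .
docker run -it scgan_container /bin/bash
# inside the container, check the installed versions
pip list | grep -iE 'numpy|anndata|tensorflow'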

Furthermore, when I run a Python REPL and run import tensorflow, I get the following error:

# python
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.5/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.5/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.5/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.5/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.

Running python main.py --param parameters.json --process gives the same error, since it imports tensorflow as well.

Any ideas on what is going wrong? I am on a 2018 MacBook, so I do not have an Nvidia GPU, but that should not be an issue, right? I do see in pip list that only the tensorflow-gpu package has been installed, and not the regular tensorflow.

AJnsm commented

full pip list report:

# pip list
DEPRECATION: Python 3.5 reached the end of its life on September 13th, 2020. Please upgrade your Python as Python 3.5 is no longer maintained. pip 21.0 will drop support for Python 3.5 in January 2021. pip 21.0 will remove support for this functionality.
Package            Version
------------------ ----------------------
absl-py            0.2.0
anndata            0.6.18
astor              0.6.2
backcall           0.1.0
bleach             1.5.0
cffi               1.14.3
cmake              3.18.2.post1
cycler             0.10.0
decorator          4.3.0
entrypoints        0.2.3
gast               0.2.0
grpcio             1.11.0
h5py               2.7.1
html5lib           0.9999999
ipykernel          4.8.2
ipython            6.3.1
ipython-genutils   0.2.0
ipywidgets         7.2.1
jedi               0.12.0
Jinja2             2.10
joblib             0.14.1
jsonschema         2.6.0
jupyter            1.0.0
jupyter-client     5.2.3
jupyter-console    5.2.0
jupyter-core       4.4.0
kiwisolver         1.0.1
llvmlite           0.26.0
louvain            0.6.1
Markdown           2.6.11
MarkupSafe         1.0
matplotlib         2.2.2
mistune            0.8.3
MulticoreTSNE      0.1
natsort            7.0.1
nbconvert          5.3.1
nbformat           4.4.0
networkx           2.4
notebook           5.4.1
numba              0.35.0
numexpr            2.7.1
numpy              1.15.0
pandas             0.22.0
pandocfilters      1.4.2
parso              0.2.0
patsy              0.5.1
pexpect            4.5.0
pickleshare        0.7.4
Pillow             5.1.0
pip                20.3.4
prompt-toolkit     1.0.15
protobuf           3.5.2.post1
ptyprocess         0.5.2
pycparser          2.20
pycurl             7.43.0
Pygments           2.2.0
pygobject          3.20.0
pyparsing          2.2.0
python-apt         1.1.0b1+ubuntu0.16.4.1
python-dateutil    2.7.2
python-igraph      0.8.2
pytz               2018.4
pyzmq              17.0.0
qtconsole          4.3.1
scanpy             1.2.2
scikit-learn       0.19.1
scipy              1.1.0
seaborn            0.9.1
Send2Trash         1.5.0
setuptools         50.3.2
simplegeneric      0.8.1
six                1.11.0
sklearn            0.0
statsmodels        0.9.0
tables             3.6.1
tensorboard        1.8.0
tensorflow-gpu     1.8.0
termcolor          1.1.0
terminado          0.8.1
testpath           0.3.1
texttable          1.6.3
tornado            5.0.2
traitlets          4.3.2
wcwidth            0.1.7
webencodings       0.5.1
Werkzeug           0.14.1
wheel              0.37.0
widgetsnbextension 3.2.1

These are two different things.

The mismatched versions were due to the requirements file and the readme listing different versions. That should be fixed here: https://github.com/imsb-uke/scGAN/blob/fix_%2313_readme/README.md
Sorry for this mistake. I'll merge this to master soon.
I also added a statement on how to build and use the Docker image. There is no need to copy all those files into the Docker image; you can instead use the --volume flag to mount the corresponding folder, e.g. docker run --volume PATH_TO_SCGAN:/scgan. That avoids creating unnecessary copies of your data when building Docker images; see here for more information on Docker volumes.
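A full invocation would look something like the sketch below (the image tag and mount point are only examples; on a machine with an Nvidia GPU you would typically also need the NVIDIA container runtime to expose the GPU):

docker run -it --volume /path/to/scGAN:/scgan scgan_container /bin/bash
# inside the container
cd /scgan
python main.py --param parameters.json --process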

The missing libcuda is because, as you mentioned, you don't have an Nvidia GPU and therefore don't have the CUDA drivers installed. As far as I know, this TensorFlow version (1.8.0) has no support for other GPUs.
Also, the current implementation of scGAN depends on using a GPU and was never tested on CPU only.
Thus the current code does not allow for CPU training. Training deep learning models such as scGAN on CPU only is not recommended, since they require a large amount of computation. If you still want to try, I can assist by pointing out the changes needed to maybe get CPU-based training to work.
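As a starting point only, one could swap the GPU build of TensorFlow for the CPU-only build inside the container, roughly as below. This is a hypothetical sketch; the scGAN code itself would still need the CPU-specific changes mentioned above:

# inside the running container (sketch; does not make scGAN CPU-ready by itself)
pip uninstall -y tensorflow-gpu
pip install tensorflow==1.8.0
python -c "import tensorflow as tf; print(tf.__version__)"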

AJnsm commented

Ok, thanks for the quick response, that makes sense. Since the readme mentioned that GPU training was recommended, I assumed CPU should also work, but good to know it doesn't. I'll make sure to run it on some other hardware then. Thanks!
PS. Is the Docker container hosted somewhere on Docker Hub? AFAIK, most compute clusters don't support Docker and instead want you to use e.g. Singularity to pull images from Docker Hub.
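(For reference, pulling and running a Docker Hub image with Singularity looks roughly like the lines below; the image name is only a placeholder, since the actual location isn't stated here:)

singularity pull scgan.sif docker://SOME_NAMESPACE/scgan:latest
singularity exec scgan.sif python main.py --param parameters.json --process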

So far it is not hosted on Docker Hub, but thanks for the hint. I'll discuss this with the team.

A Docker image is now available here. I will update the readme accordingly.