matthewfeickert/cvmfs-venv

CVMFS arch is not compatible with modern TensorFlow wheels

matthewfeickert opened this issue · 1 comments

CVMFS LCG views have an architecture that is not necessarily compatible with modern machine learning library wheels. For example, the CVMFS view LCG_98python3 x86_64-centos7-gcc8-opt is compatible with tensorflow v2.1.0 but not tensorflow v2.8.0.

Example

$ ssh uchicago
[17:38] login02.af.uchicago.edu:~ $ mkdir debug && cd debug
[17:38] login02.af.uchicago.edu:~/debug $ curl -sLO https://raw.githubusercontent.com/matthewfeickert/cvmfs-venv/2a6831069b4164925736efc9e4f25549ae831b4a/atlas_setup.sh
[17:38] login02.af.uchicago.edu:~/debug $ . atlas_setup.sh debug
(debug) [17:39] login02.af.uchicago.edu:~/debug $ deactivate
[17:39] login02.af.uchicago.edu:~/debug $ python -m pip show tensorflow
[17:39] login02.af.uchicago.edu:~/debug $ python -m pip show tensorflow-cpu
Name: tensorflow-cpu
Version: 2.1.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: /cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib/python3.7/site-packages
Requires: grpcio, wrapt, opt-einsum, six, gast, wheel, scipy, astor, google-pasta, tensorflow-estimator, tensorboard, keras-preprocessing, protobuf, termcolor, numpy, absl-py, keras-applications
Required-by:
[17:39] login02.af.uchicago.edu:~/debug $ . debug/bin/activate
(debug) [17:39] login02.af.uchicago.edu:~/debug $ python -m pip install --upgrade 'tensorflow==2.1.0'
(debug) [17:39] login02.af.uchicago.edu:~/debug $ python -m pip show tensorflow
Name: tensorflow
Version: 2.1.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: /home/feickert/debug/debug/lib/python3.7/site-packages
Requires: absl-py, astor, gast, google-pasta, grpcio, keras-applications, keras-preprocessing, numpy, opt-einsum, protobuf, scipy, six, tensorboard, tensorflow-estimator, termcolor, wheel, wrapt
Required-by: 
(debug) [17:39] login02.af.uchicago.edu:~/debug $ python -c 'import tensorflow as tf; import keras'
2022-02-10 17:36:11.112165: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /cvmfs/sft.cern.ch/lcg/releases/MCGenerators/thepeg/2.2.1-c1b37/x86_64-centos7-gcc8-opt/lib/ThePEG:/cvmfs/sft.cern.ch/lcg/releases/MCGenerators/herwig++/7.2.1-71099/x86_64-centos7-gcc8-opt/lib/Herwig:/cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib/python3.7/site-packages/torch/lib:/cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib/python3.7/site-packages/tensorflow_core:/cvmfs/sft.cern.ch/lcg/releases/java/8u222-884d8/x86_64-centos7-gcc8-opt/jre/lib/amd64:/cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib64:/cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib:/cvmfs/sft.cern.ch/lcg/releases/gcc/8.3.0-cebb0/x86_64-centos7/lib:/cvmfs/sft.cern.ch/lcg/releases/gcc/8.3.0-cebb0/x86_64-centos7/lib64:/cvmfs/sft.cern.ch/lcg/releases/binutils/2.30-e5b21/x86_64-centos7/lib:/cvmfs/sft.cern.ch/lcg/releases/R/3.6.3-2dabd/x86_64-centos7-gcc8-opt/lib64/R/library/readr/rcon
2022-02-10 17:36:11.112273: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /cvmfs/sft.cern.ch/lcg/releases/MCGenerators/thepeg/2.2.1-c1b37/x86_64-centos7-gcc8-opt/lib/ThePEG:/cvmfs/sft.cern.ch/lcg/releases/MCGenerators/herwig++/7.2.1-71099/x86_64-centos7-gcc8-opt/lib/Herwig:/cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib/python3.7/site-packages/torch/lib:/cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib/python3.7/site-packages/tensorflow_core:/cvmfs/sft.cern.ch/lcg/releases/java/8u222-884d8/x86_64-centos7-gcc8-opt/jre/lib/amd64:/cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib64:/cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib:/cvmfs/sft.cern.ch/lcg/releases/gcc/8.3.0-cebb0/x86_64-centos7/lib:/cvmfs/sft.cern.ch/lcg/releases/gcc/8.3.0-cebb0/x86_64-centos7/lib64:/cvmfs/sft.cern.ch/lcg/releases/binutils/2.30-e5b21/x86_64-centos7/lib:/cvmfs/sft.cern.ch/lcg/releases/R/3.6.3-2dabd/x86_64-centos7-gcc8-opt/lib64/R/library/readr/rcon
2022-02-10 17:36:11.112286: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
Using TensorFlow backend.
(debug) $ python -m pip install --upgrade tensorflow
(debug) $ python -m pip show tensorflow
Name: tensorflow
Version: 2.8.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: /home/feickert/debug/debug/lib/python3.7/site-packages
Requires: absl-py, astunparse, flatbuffers, gast, google-pasta, grpcio, h5py, keras, keras-preprocessing, libclang, numpy, opt-einsum, protobuf, setuptools, six, tensorboard, tensorflow-io-gcs-filesystem, termcolor, tf-estimator-nightly, typing-extensions, wrapt
Required-by: 
(debug) [17:40] login02.af.uchicago.edu:~/debug $ python -c 'import tensorflow as tf; import keras'
Traceback (most recent call last):
  File "/home/feickert/debug/debug/lib/python3.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 60, in <module>
    from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: /home/feickert/debug/debug/lib/python3.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so: undefined symbol: _ZNK10tensorflow8OpKernel11TraceStringERKNS_15OpKernelContextEb

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/feickert/debug/debug/lib/python3.7/site-packages/tensorflow/__init__.py", line 37, in <module>
    from tensorflow.python.tools import module_util as _module_util
  File "/home/feickert/debug/debug/lib/python3.7/site-packages/tensorflow/python/__init__.py", line 36, in <module>
    from tensorflow.python import pywrap_tensorflow as _pywrap_tensorflow
  File "/home/feickert/debug/debug/lib/python3.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 76, in <module>
    f'{traceback.format_exc()}'
ImportError: Traceback (most recent call last):
  File "/home/feickert/debug/debug/lib/python3.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 60, in <module>
    from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: /home/feickert/debug/debug/lib/python3.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so: undefined symbol: _ZNK10tensorflow8OpKernel11TraceStringERKNS_15OpKernelContextEb


Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/errors for some common causes and solutions.
If you need help, create an issue at https://github.com/tensorflow/tensorflow/issues and include the entire stack trace above this error message.
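
One thing visible in the warnings from the working v2.1.0 import is that `LD_LIBRARY_PATH` still contains the LCG view's `site-packages/tensorflow_core` directory while the virtual environment is active. Whether that is related to the undefined symbol error is not established here. The sketch below only lists such entries so they can be inspected; the function name `tensorflow_dirs_on_library_path` is hypothetical and not part of cvmfs-venv.

```python
import os


def tensorflow_dirs_on_library_path(library_path=None):
    """Return LD_LIBRARY_PATH entries whose path mentions 'tensorflow'.

    Hypothetical helper for inspection only: it does not modify the
    environment and does not prove that an entry causes a problem.
    """
    if library_path is None:
        library_path = os.environ.get("LD_LIBRARY_PATH", "")
    # LD_LIBRARY_PATH is colon-separated on Linux; skip empty entries.
    entries = [entry for entry in library_path.split(":") if entry]
    return [entry for entry in entries if "tensorflow" in entry.lower()]


if __name__ == "__main__":
    for entry in tensorflow_dirs_on_library_path():
        print(entry)
```

Run inside the activated virtual environment, this prints any directories on the library search path that belong to a TensorFlow installation, which can then be compared against the `Location` reported by `python -m pip show tensorflow`.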

For instance, I'm able to get an example script (example.tar.gz) to run in a virtual environment on a CVMFS machine with

(debug) $ python -m pip install --upgrade --force-reinstall 'tensorflow==2.1.0'

and

(debug) $ python -m pip install --upgrade --force-reinstall 'tensorflow==2.5.0'

but the tensorflow>=2.6.0 wheels fail with errors that suggest they are incompatible with the arch.
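
To repeat the check above for several TensorFlow versions without reading each traceback by hand, the import can be attempted in a fresh interpreter and the result recorded. This is a minimal sketch, assuming only the standard library; `import_succeeds` is a hypothetical helper name and not part of cvmfs-venv.

```python
import subprocess
import sys


def import_succeeds(module_name, python=sys.executable):
    """Try `import module_name` in a fresh interpreter.

    Returns a tuple (ok, stderr): ok is True when the interpreter exits
    with status 0, and stderr holds any warnings or traceback text.
    """
    result = subprocess.run(
        [python, "-c", f"import {module_name}"],
        capture_output=True,  # available from Python 3.7
        text=True,
    )
    return result.returncode == 0, result.stderr
```

After each `python -m pip install --upgrade --force-reinstall 'tensorflow==<version>'` in the virtual environment, calling `import_succeeds("tensorflow")` with that environment's interpreter reports whether the wheel loads and, if it does not, returns the traceback text for inspection.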