Failed to find function dmlab_connect in library
kaustabpal opened this issue · 2 comments
kaustabpal commented
After building the Docker image using the Dockerfile for scalable_agent and running sudo docker run --name scalable_agent kaustab/scalable_agent, I am getting the following error:
2020-07-07 00:27:46.528660: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-07-07 00:27:46.533306: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job local -> {0 -> localhost:41863}
2020-07-07 00:27:46.534586: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:334] Started server with target: grpc://localhost:41863
INFO:tensorflow:Using dynamic batching.
INFO:tensorflow:Creating actor 0 with level explore_goal_locations_small
INFO:tensorflow:Creating actor 1 with level explore_goal_locations_small
INFO:tensorflow:Creating actor 2 with level explore_goal_locations_small
INFO:tensorflow:Creating actor 3 with level explore_goal_locations_small
INFO:tensorflow:Creating MonitoredSession, is_chief True
INFO:tensorflow:Create CheckpointSaverHook.
WARNING:tensorflow:Issue encountered when serializing py_process_processes.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
'PyProcess' object has no attribute 'name'
INFO:tensorflow:Starting all processes.
Failed to find function dmlab_connect in library!
Failed to find function dmlab_connect in library!
Traceback (most recent call last):
  File "experiment.py", line 700, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
Failed to find function dmlab_connect in library!
  File "experiment.py", line 694, in main
    train(action_set, level_names)
  File "experiment.py", line 587, in train
    hooks=[py_process.PyProcessHook()]) as session:
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 415, in MonitoredTrainingSession
    stop_grace_period_secs=stop_grace_period_secs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 826, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 542, in __init__
    h.begin()
  File "/scalable_agent/py_process.py", line 192, in begin
    tp.map(lambda p: p.start(), tf.get_collection(PyProcess.COLLECTION))
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 253, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 572, in get
    raise self._value
RuntimeError: Failed to connect RL API
Failed to find function dmlab_connect in library!
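For context, a "Failed to find function ... in library" message of this kind is what a failed dlsym()-style symbol lookup looks like: the DeepMind Lab shared object was loaded, but the dmlab_connect entry point could not be resolved in it. A minimal sketch of the same failure mode from Python, using libc purely as a stand-in for libdmlab.so (only the dmlab_connect name is taken from the error above; nothing here touches DeepMind Lab itself):

```python
import ctypes
import ctypes.util

# Load libc as a stand-in shared library (on the real container you would
# load the deepmind_lab extension / libdmlab.so instead).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# A symbol the library exports resolves fine:
print(hasattr(libc, "printf"))          # True

# A symbol the library does not export fails to resolve -- the same
# condition the "Failed to find function dmlab_connect" message reports:
print(hasattr(libc, "dmlab_connect"))   # False
```

If the lookup fails in the container, the usual suspects are a stale or mismatched DeepMind Lab wheel being picked up instead of the freshly built one.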
Below are the contents of the Dockerfile I used to build the image:
FROM ubuntu:18.04
# Install dependencies.
# g++ (v. 5.4) does not work: https://github.com/tensorflow/tensorflow/issues/13308
RUN apt-get update && apt-get install -y \
    curl \
    wget \
    zip \
    unzip \
    software-properties-common \
    pkg-config \
    g++-4.8 \
    zlib1g-dev \
    python \
    lua5.1 \
    liblua5.1-0-dev \
    libffi-dev \
    gettext \
    freeglut3 \
    libsdl2-dev \
    libosmesa6-dev \
    libglu1-mesa \
    libglu1-mesa-dev \
    python-dev \
    build-essential \
    git \
    gnupg \
    python-setuptools \
    python-pip \
    libjpeg-dev
# Install bazel
RUN echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | \
tee /etc/apt/sources.list.d/bazel.list && \
curl https://bazel.build/bazel-release.pub.gpg | \
apt-key add - && \
apt-get update && apt-get install -y bazel
# Install TensorFlow and other dependencies
RUN pip install tensorflow==1.9.0 dm-sonnet==1.23
# Build and install DeepMind Lab pip package.
# We explicitly set the Numpy path as shown here:
# https://github.com/deepmind/lab/blob/master/docs/users/build.md
RUN NP_INC="$(python -c 'import numpy as np; print(np.get_include())[5:]')" && \
git clone https://github.com/deepmind/lab.git --branch release-2019-02-04 && \
cd lab && \
sed -i 's@hdrs = glob(\[@hdrs = glob(["'"$NP_INC"'/\*\*/*.h", @g' python.BUILD && \
sed -i 's@includes = \[@includes = ["'"$NP_INC"'", @g' python.BUILD && \
bazel build -c opt python/pip_package:build_pip_package && \
pip install wheel && \
./bazel-bin/python/pip_package/build_pip_package /tmp/dmlab_pkg && \
pip install /tmp/dmlab_pkg/DeepMind_Lab-1.0-py2-none-any.whl --force-reinstall
# Install dataset (from https://github.com/deepmind/lab/tree/master/data/brady_konkle_oliva2008)
RUN mkdir dataset && \
cd dataset && \
pip install Pillow && \
curl -sS https://raw.githubusercontent.com/deepmind/lab/master/data/brady_konkle_oliva2008/README.md | \
tr '\n' '\r' | \
sed -e 's/.*```sh\(.*\)```.*/\1/' | \
tr '\r' '\n' | \
bash
# Clone.
RUN git clone https://github.com/deepmind/scalable_agent.git
WORKDIR scalable_agent
# Build dynamic batching module.
RUN TF_INC="$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')" && \
TF_LIB="$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')" && \
g++-4.8 -std=c++11 -shared batcher.cc -o batcher.so -fPIC -I $TF_INC -O2 -D_GLIBCXX_USE_CXX11_ABI=0 -L$TF_LIB -ltensorflow_framework
# Run tests.
RUN python py_process_test.py
RUN python dynamic_batching_test.py
RUN python vtrace_test.py
# Run.
CMD ["sh", "-c", "python experiment.py --total_environment_frames=10000 --dataset_path=../dataset && python experiment.py --mode=test --test_num_episodes=5"]
# Docker commands:
# docker rm scalable_agent -v
# docker build -t scalable_agent .
# docker run --name scalable_agent scalable_agent
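One Dockerfile detail worth double-checking is the NP_INC line: under Python 2, print(np.get_include())[5:] parses as a print statement whose argument is the already-sliced string, so it emits the NumPy include path with its first five characters ("/usr/") removed, which is the /usr-relative form the sed commands splice into python.BUILD. A small sketch of just that slicing (the sample path is an assumption for illustration, not output captured from the container):

```python
# Sample NumPy include path as it might appear on the Ubuntu 18.04 image
# (an assumption; the real value comes from np.get_include()).
np_include = "/usr/lib/python2.7/dist-packages/numpy/core/include"

# Stripping the leading five characters ("/usr/") yields the /usr-relative
# path that the sed commands patch into python.BUILD for Bazel.
np_inc = np_include[5:]
print(np_inc)  # lib/python2.7/dist-packages/numpy/core/include
```

If NP_INC ends up empty or wrong (for example, when NumPy lives somewhere other than under /usr), the Bazel build can silently produce a deepmind_lab package whose native pieces are broken, which is consistent with the symbol-lookup failure above.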
How do I resolve this?
tkoeppe commented
Hello -- which Dockerfile are you referring to? As far as I'm aware, we're not shipping (and hence not supporting) any Dockerfiles?
tkoeppe commented
Please reopen this issue if this is still a problem and you have more information.