RobotLocomotion/spartan

Drake bazel-bin issues

manuelli opened this issue · 8 comments

@jamiesnape @jwnimmer-tri

We are again having issues with the drake/.bazel-bin folder being symlinked to a secret location. Previously we were very careful to make sure that this folder persisted when the docker container was destroyed. Specifically, as outlined in #228, we added

echo "startup --output_base=/path/to/secret/folder" >> ~/.bazelrc

to our docker build. Now, after merging @EricCousineau-TRI's PR #257, we again have a problem. After exiting a docker container and then re-entering, all the spartan/drake/.bazel-bin symlinks are broken. Specifically, it now looks like they are symlinked to ~/.cache/bazel/.....

manuelli@ffd23a102ace:~/.spartan-build/drake-build$ ls -l install 
lrwxrwxrwx 1 manuelli manuelli 84 Dec 15 16:01 install -> /home/manuelli/.cache/bazel/_bazel_manuelli/install/697791709cf5cd5dd10a151e49e24677

I noticed during the docker build that bazel was updated to 0.9 from 0.6. Did this behavior change somehow? Is there a way to avoid these types of issues in the future?

Yes, that .bazelrc is probably not getting read anymore. RobotLocomotion/drake#8353 should help you.

I am having trouble following the details of that PR. All we want is to be able to control where the bazel-bin folder gets symlinked to. Previously we passed an arg to the bazel build, then we switched to the cmake build and added the aforementioned lines to the .bazelrc. Is there a stable API for doing this? If not, we are going to have to wait on updating drake until there is, because each time we bump the drake sha it takes us a non-trivial amount of time to fix these issues.

Sorry about that! I had tried to run through Greg's simulation CI test; are there any additional tests that can be run before I (or anyone else) submits a downstream PR to spartan?

Yes, that .bazelrc is probably not getting read anymore. [...]

@jamiesnape Can you expand on that? Why would ~/.bazelrc not be getting read?
Also, from a quick look at RobotLocomotion/drake#8353, it looks like it may break the ability to specify /path/to/secret/folder. (Could that be added as a CMake configuration variable?)

EDIT: Please disregard. Figured out why.

@manuelli I tried out the following patch on Drake master's (drake 91cf850) Docker build:

diff --git a/setup/docker/Dockerfile.ubuntu16.04.opensource b/setup/docker/Dockerfile.ubuntu16.04.opensource
index c216d20..f925ed1 100644
--- a/setup/docker/Dockerfile.ubuntu16.04.opensource
+++ b/setup/docker/Dockerfile.ubuntu16.04.opensource
@@ -5,5 +5,7 @@ RUN apt-get update && yes "Y" \
       | /drake/setup/ubuntu/16.04/install_prereqs.sh \
       && rm -rf /var/lib/apt/lists/* \
       && apt-get clean all
-RUN cd /drake && bazel build //tools:drake_visualizer && bazel build //examples/acrobot:acrobot_run_passive
-ENTRYPOINT ["/drake/setup/docker/entrypoint.sh"]
+RUN echo "startup --output_base=/tmp/bazel" > ~/.bazelrc
+RUN cd /drake && bazel build //tools:drake_visualizer # && bazel build //examples/acrobot:acrobot_run_passive
+ENTRYPOINT ["bash"]
+#ENTRYPOINT ["/drake/setup/docker/entrypoint.sh"]

When I then run the container, and do ls -l, I see the symlinks respecting the ~/.bazelrc:

# cd /drake
# ls -l bazel-*
lrwxrwxrwx 1 root root 46 Mar 13 23:13 bazel-bin -> /tmp/bazel/execroot/drake/bazel-out/k8-opt/bin
lrwxrwxrwx 1 root root 25 Mar 13 23:13 bazel-drake -> /tmp/bazel/execroot/drake
lrwxrwxrwx 1 root root 51 Mar 13 23:13 bazel-genfiles -> /tmp/bazel/execroot/drake/bazel-out/k8-opt/genfiles
lrwxrwxrwx 1 root root 35 Mar 13 23:13 bazel-out -> /tmp/bazel/execroot/drake/bazel-out
lrwxrwxrwx 1 root root 51 Mar 13 23:13 bazel-testlogs -> /tmp/bazel/execroot/drake/bazel-out/k8-opt/testlogs

Is there a step or detail that I am missing?
One additional difference here is that this is using Bazel 0.10.1.

Ah, sorry, I see now; it's being overridden by the --bazelrc being specified in the CMake build.

BTW: Confirmed that we cannot chain .bazelrc files (would've been a nice workaround):
https://github.com/EricCousineau-TRI/repro/blob/4cd868df39c2af74b8ccae600a5c899c6a7d5919/bazel/bazel_rc_chain/repro.sh

@EricCousineau-TRI you didn't do anything wrong. Our tests were just insufficient. This issue only arises if you

  1. build docker image
  2. run the docker container and build spartan inside it.
  3. exit the container (at which point it gets destroyed)
  4. create a new container. At this point the spartan/drake/bazel-bin folder will be a broken symlink.
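
A quick way to check for the failure in step 4 is to list the symlinks whose targets no longer exist. The following is only a sketch: find_broken_links is a hypothetical helper, not part of spartan or Drake, and the directory passed to it is whatever folder holds the bazel-* links in your checkout.

```shell
# Hypothetical helper: print every symlink directly under DIR whose target
# does not exist (for example a bazel-bin link pointing into a cache
# directory that was destroyed together with the old container).
# The body runs in a subshell, so its variables do not leak into the caller.
find_broken_links() (
  dir="$1"
  for link in "$dir"/* "$dir"/.[!.]*; do
    # -L is true for a symlink; ! -e is true when its target cannot be resolved.
    if [ -L "$link" ] && [ ! -e "$link" ]; then
      printf '%s -> %s\n' "$link" "$(readlink "$link")"
    fi
  done
)
```

Running something like find_broken_links ~/spartan/drake in the new container (the path is an assumption) would print one line per dangling link and nothing if all links still resolve.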

According to the manual (which is admittedly sometimes wrong), the file /etc/bazel.bazelrc would still be in effect even with Drake's CMake build passing --bazelrc= on the command line. The command-line switch causes us to ignore $HOME/.bazelrc but not the system-level one. Given the use case of an entire Docker image dedicated for a build, where only certain image-wide folders are preserved, configuring the output_base via /etc seems like the durable option.
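
As a rough sketch of that suggestion, the docker build could append the startup option to the system-level rc file instead of ~/.bazelrc. The helper name add_output_base is made up for illustration, /path/to/secret/folder is the placeholder used earlier in this thread, and whether Bazel honors /etc/bazel.bazelrc alongside --bazelrc is taken from the manual as described above, not verified here.

```shell
# Hypothetical helper: append "startup --output_base=DIR" to RCFILE,
# but only if that exact line is not already present, so repeated
# docker build steps do not duplicate it.
add_output_base() (
  rcfile="$1"
  output_base="$2"
  line="startup --output_base=$output_base"
  # -x matches whole lines, -F treats the pattern as a fixed string.
  grep -qxF -- "$line" "$rcfile" 2>/dev/null || printf '%s\n' "$line" >> "$rcfile"
)

# In the docker build (running as root) the call might look like:
#   add_output_base /etc/bazel.bazelrc /path/to/secret/folder
```

Keeping the helper idempotent is a design choice for image layers that may be rebuilt; a plain echo with >> would also work for a one-off build.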

I believe the change in RobotLocomotion/drake#8353 would have also helped here, if it were already merged. It places the --output_user_root in the ${PROJECT_BINARY_DIR} which is presumably also being persisted for other CMake-compiled artifacts.

This is fixed in current master.