eclipse/adore

ERROR exporting to image

Themaksiest opened this issue · 6 comments

Hello.

I have encountered an issue while trying to pull and build adore.

Script approach

So far I have tried running the following line from here
bash <(curl -sSL https://raw.githubusercontent.com/DLR-TS/adore_tools/master/tools/adore_setup.sh)
This results in the "ERROR exporting to image" message being printed in the following part of the output:

=> CACHED [plotlabserver_builder 3/3] RUN cd "/tmp/plotlabserver/plotlabserver" &&  bash build.sh                                                                                                            0.0s
 => ERROR exporting to image                                                                                                                                                                                  0.0s
 => => exporting layers                                                                                                                                                                                       0.0s
------
 > exporting to image:
------
ERROR: failed to solve: layer does not exist
make[5]: *** [Makefile:82: build] Error 1
make[4]: *** [Makefile:66: build_fast] Error 2
make[3]: *** [plotlabserver.mk:42: build_fast_plotlabserver] Error 2
make[2]: *** [Makefile:23: build] Error 2
make[1]: *** [adore_cli.mk:100: build_adore_cli] Error 2
make: *** [adore_cli/adore_cli.mk:91: build_fast_adore_cli] Error 2

However, this is followed by the "ADORe was setup successfully!" message at the end of the script execution.
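Because the final success message is printed even when make reports errors, the last line of the output is not a reliable indicator of a successful build. A minimal sketch, assuming the installer output has been saved to a file (the file name setup.log and the helper log_has_build_error are hypothetical, not part of ADORe), that searches the log for the failure markers quoted above:

```shell
# Hypothetical helper: returns 0 if the saved log contains a failed build step.
# The patterns are the literal markers from the output quoted above:
# the BuildKit "failed to solve" line and the make "*** ... Error N" lines.
log_has_build_error() {
    log_file="$1"
    grep -q -e 'ERROR: failed to solve' \
            -e '^make.*\*\*\*.*Error [0-9]' "$log_file"
}

# Usage (not executed here):
#   bash <(curl -sSL https://raw.githubusercontent.com/DLR-TS/adore_tools/master/tools/adore_setup.sh) 2>&1 | tee setup.log
#   log_has_build_error setup.log && echo "build failed despite the success message"
```

The helper only inspects text that is already in the log; it does not change how the installer itself reports its status.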

After running cd adore and make cli, the same error message is printed and the build fails:

 => ERROR exporting to image                                                                                                                                                                                  0.0s
 => => exporting layers                                                                                                                                                                                       0.0s
------
 > exporting to image:
------
ERROR: failed to solve: layer does not exist
make[2]: *** [Makefile:82: build] Error 1
make[1]: *** [Makefile:66: build_fast] Error 2
make: *** [/<path-to-adore>/adore/plotlabserver/plotlabserver.mk:42: build_fast_plotlabserver] Error 2

I have also tried make clean after this, which for some reason does not remove some of the Docker containers created during the build process:

docker container ls -a
CONTAINER ID   IMAGE                   COMMAND                  CREATED          STATUS                      PORTS                                       NAMES
c0f45a9d7639   dfdced2c4acc            "/bin/bash"              19 minutes ago   Created                                                                 suspicious_wilson
766c5d801f2c   dfdced2c4acc            "/bin/bash"              20 minutes ago   Created                                                                 brave_snyder
e758860f6aea   dfdced2c4acc            "/bin/bash"              23 minutes ago   Created                                                                 jolly_gates
d79e9e888db5   dfdced2c4acc            "/bin/bash"              23 minutes ago   Created                                                                 reverent_gauss
1554259f848f   dfdced2c4acc            "/bin/bash"              24 minutes ago   Created                                                                 mystifying_mahavira
a9fcb7d5993a   dfdced2c4acc            "/bin/bash"              24 minutes ago   Created                                                                 elastic_haibt
6c28a848f536   apt-cacher-ng:latest    "/bin/sh -c 'chmod 7…"   25 minutes ago   Up 25 minutes               0.0.0.0:3142->3142/tcp, :::3142->3142/tcp   apt-cacher-ng
ae29e95afb26   hello-world             "/hello"                 25 minutes ago   Exited (0) 25 minutes ago                                               youthful_visvesvaraya

The apt-cacher-ng container is also left running.

docker container ls
CONTAINER ID   IMAGE                  COMMAND                  CREATED          STATUS          PORTS                                       NAMES
6c28a848f536   apt-cacher-ng:latest   "/bin/sh -c 'chmod 7…"   28 minutes ago   Up 28 minutes   0.0.0.0:3142->3142/tcp, :::3142->3142/tcp   apt-cacher-ng
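
The leftover containers in the first listing above are all in the Created state. A minimal sketch for removing only those, assuming the standard status filter of docker ps is enough to identify them (the helper name remove_created_containers is hypothetical):

```shell
# Hypothetical helper: remove containers that were created but never started.
# Uses the "status" filter of `docker ps`; running containers are not touched.
remove_created_containers() {
    ids=$(docker ps -a -q --filter status=created)
    if [ -z "$ids" ]; then
        echo "no containers in Created state"
        return 0
    fi
    # Word splitting of $ids is intentional: one argument per container ID.
    docker rm $ids
}

# Usage (not executed here):
#   remove_created_containers
```

The apt-cacher-ng container that is still running is not affected by this helper; it can be stopped separately with docker stop apt-cacher-ng.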

Github approach

After killing the running containers, removing the leftover containers, and deleting the leftover images, I tried the approach described in the getting started guide.

After making sure that the requirements are met:

cat /etc/os-release | grep "VERSION=" | cut -d"=" -f2
"22.04.3 LTS (Jammy Jellyfish)"

df -h . | awk 'NR==2 {print "Available Free Space:", $4}'
Available Free Space: 60G

make --version
GNU Make 4.3
Built for x86_64-pc-linux-gnu

I cloned the git repo, and updated the submodules

git clone git@github.com:eclipse/adore.git
cd adore
git submodule update --init

I followed this with make cli, which produced the same error:

=> CACHED [plotlabserver_builder 3/3] RUN cd "/tmp/plotlabserver/plotlabserver" &&  bash build.sh                                                                                                            0.0s
 => ERROR exporting to image                                                                                                                                                                                  0.0s
 => => exporting layers                                                                                                                                                                                       0.0s
------
 > exporting to image:
------
ERROR: failed to solve: layer does not exist
make[2]: *** [Makefile:82: build] Error 1
make[1]: *** [Makefile:66: build_fast] Error 2
make: *** [/<path-to-adore>/adore/plotlabserver/plotlabserver.mk:42: build_fast_plotlabserver] Error 2

Has anyone encountered something similar? I would gladly accept any suggestions on troubleshooting this.

My personal observations

  • Yesterday I successfully built everything using the GitHub approach and by running make build_all. However, when running make cli afterward and executing cd adore_scenarios followed by roslaunch baseline_test.launch, the example crashed after a while because the adore_if_ros_msg definition navigationgoal was missing.
  • I had a separate repository in which I had built adore some 5 months ago, just to test the concept of adore, and in it I could successfully launch baseline_test.launch. The only difference I found using git submodule status --recursive in both directories was that in the latest version adore_cli was on 6110ae7 instead of heads/master and adore_cli/plotlabserver was on remotes/origin/fix-memory-leak-14-g67ec8a4 instead of heads/master.
  • However, since then I ran make clean, and the builds in both directories now fail with the same error. So I am stuck, unable to run the examples and with no idea what changed during this time.
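
The submodule comparison described in the second observation can be sketched as follows. This assumes the output of git submodule status --recursive from each checkout was saved to a file; the helper name submodule_commit_diff and the file names in the usage comment are hypothetical.

```shell
# Hypothetical helper: compare two saved `git submodule status --recursive`
# outputs and print every submodule path whose commit differs.
# Output format: <path> <old commit> -> <new commit>
submodule_commit_diff() {
    old_status="$1"
    new_status="$2"
    awk '
        # git may prefix the SHA with a one-character state marker (-, + or U).
        { sha = $1; sub(/^[-+U]/, "", sha) }
        # First file: remember the commit recorded for each path.
        NR == FNR { old[$2] = sha; next }
        # Second file: report paths whose commit changed.
        ($2 in old) && old[$2] != sha { print $2, old[$2], "->", sha }
    ' "$old_status" "$new_status"
}

# Usage (not executed here):
#   git -C <old-checkout> submodule status --recursive > old.txt
#   git -C <new-checkout> submodule status --recursive > new.txt
#   submodule_commit_diff old.txt new.txt
```

Submodules that exist in only one of the two checkouts are not reported by this sketch.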

After manually running:

cd plotlabserver/
git submodule update --init --recursive
make build

And after trying make cli again, I now run into a different error:

 => CACHED [adore_if_ros_builder 1/2] WORKDIR /tmp/adore_if_ros/adore_if_ros/build                                                                                                                            0.0s
 => CACHED [adore_if_ros_builder 2/2] RUN source /opt/ros/noetic/setup.bash &&     cmake ..              -DCMAKE_EXPORT_COMPILE_COMMANDS=ON              -DCMAKE_BUILD_TYPE=Release              -DCMAKE_INS  0.0s
 => exporting to image                                                                                                                                                                                        0.0s
 => => exporting layers                                                                                                                                                                                       0.0s
 => => writing image sha256:7d46c024a579a38fd0a5109835d901d5c77ccf5ea94d9be7aa241bbd145f7631                                                                                                                  0.0s
 => => naming to docker.io/library/adore_if_ros:3d04171                                                                                                                                                       0.0s
rm -rf "adore_if_ros/build"
docker cp $(docker create --rm adore_if_ros:3d04171):/tmp/adore_if_ros/adore_if_ros/build adore_if_ros
Successfully copied 463MB to /<path-to-adore>/adore/adore_if_ros/adore_if_ros
cd /<path-to-adore>/adore/apt_cacher_ng_docker && make up
Apt-Cacher NG already running statistics dashboard is located at: http://127.0.0.1:3142/acng-report.html
cd /<path-to-adore>/adore/plotlabserver && make build_fast_plotlabserver
cd "/<path-to-adore>/adore/plotlabserver" && make build_fast
Docker image: plotlabserver_build:0fad7e5 already build, skipping build.
Docker image: plotlabserver:0fad7e5 already build, skipping build.
docker cp $(docker create --rm plotlabserver_build:0fad7e5):/tmp/plotlabserver/plotlabserver/build "/<path-to-adore>/adore/plotlabserver/plotlabserver"
Successfully copied 5.91MB to /<path-to-adore>/adore/plotlabserver/plotlabserver
mkdir -p adore_cli/build
cd "/<path-to-adore>/adore/adore_cli" && \
    docker compose -f /<path-to-adore>/adore/docker-compose.yaml build adore_cli \
                         --build-arg ADORE_CLI_PROJECT=adore_cli \
                         --build-arg ADORE_CLI_PROJECT_X11_DISPLAY=adore_cli_x11_display \
                         --build-arg UID=1000 \
                         --build-arg GID=1000 \
                         --build-arg DOCKER_GID=999 \
                         --build-arg ADORE_IF_ROS_TAG=3d04171 && \
    docker compose -f /<path-to-adore>/adore/docker-compose.yaml build adore_cli_x11_display \
                         --build-arg ADORE_CLI_PROJECT=adore_cli \
                         --build-arg ADORE_CLI_PROJECT_X11_DISPLAY=adore_cli_x11_display \
                         --build-arg UID=1000 \
                         --build-arg GID=1000 \
                         --build-arg DOCKER_GID=999 \
                         --build-arg ADORE_CLI_TAG=6110ae7
service "adore_cli" can't be used with `extends` as it declare `depends_on`
make[2]: *** [Makefile:25: build] Error 15
make[1]: *** [adore_cli.mk:100: build_adore_cli] Error 2
make: *** [adore_cli/adore_cli.mk:91: build_fast_adore_cli] Error 2

P.S. I changed the actual path to the adore repo to <path-to-adore>; hopefully this does not cause any confusion or issues.

@Themaksiest I am looking into the first issue to see if I can replicate it.

The second issue is related to an update in Docker (see: docker/compose#11544). A PR will be coming for this. A temporary fix is to downgrade Docker as suggested in the linked issue. The ADORe installation script installs the latest version of Docker.
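
To compare a local setup with the versions discussed in the linked issue, it helps to know which Docker Engine and Compose versions are installed. A minimal sketch (the helper name show_docker_versions is hypothetical; it makes no claim about which versions are affected):

```shell
# Hypothetical helper: print the installed Docker Engine and Compose plugin versions.
show_docker_versions() {
    echo "engine:  $(docker version --format '{{.Server.Version}}')"
    echo "compose: $(docker compose version --short)"
}

# Usage (not executed here):
#   show_docker_versions
```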

I will push a fix for this pronto. Finally, @Themaksiest thank you for taking the time to document this and helping us improve the software!

@Themaksiest A follow-up. I was unable to replicate the issue running the ADORe installer in a clean environment. You said you ran this many months ago. The Docker cache can become stale, which can cause non-deterministic failures. What I recommend is to clean your Docker cache and run the installer again.

⚠️ WARNING: Destructive Operation

Executing the following commands will stop all Docker containers, remove all containers and images, and clean up the Docker build and system cache. This can result in loss of important data if not used carefully. Proceed with caution.


Step-by-Step Guide to Clean Docker Environment

  1. Stop all running Docker containers:
docker stop $(docker ps -a -q)
  2. Remove all Docker containers and images:
docker rm $(docker ps -a -q)
docker rmi $(docker images -a -q)
  3. Remove all Docker build and system cache:
docker builder prune --all --force
docker system prune --all --force
  4. Restart the Docker service (optional; restarting the service often resolves a lot of issues):
systemctl restart docker
  5. Verify the Docker system cache (optional):
docker system df
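
The cleanup steps above can be combined into one function. The sketch below is an assumption-laden convenience wrapper (the name docker_clean_all is hypothetical): it stops and removes containers with docker stop and docker rm, then removes images and prunes the caches, in line with the warning above. It also skips the stop and remove commands when there is nothing to act on, since those commands fail when called without arguments.

```shell
# Hypothetical wrapper around the cleanup steps above.
# DESTRUCTIVE: removes every container and image on the host.
docker_clean_all() {
    containers=$(docker ps -a -q)
    if [ -n "$containers" ]; then
        # Word splitting is intentional: one argument per ID.
        docker stop $containers
        docker rm $containers
    fi
    images=$(docker images -a -q)
    if [ -n "$images" ]; then
        docker rmi $images
    fi
    docker builder prune --all --force
    docker system prune --all --force
}

# Usage (not executed here):
#   docker_clean_all && docker system df
```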

Finally, after the docker system cache has been purged you can try rerunning the ADORe installer.

bash <(curl -sSL https://raw.githubusercontent.com/DLR-TS/adore_tools/master/tools/adore_setup.sh)

Let me know if any of this works for you; otherwise we can keep troubleshooting.

@Themaksiest I have pushed an update (DLR-TS/adore_cli@e784ece) to the adore_cli (see: https://github.com/DLR-TS/adore_cli) submodule addressing the issue introduced by an update to docker (see: docker/compose#11544). This will be included in a PR that will follow shortly after this post; I will also keep this issue open until the PR is accepted. In the meantime you can manually pull the updates:

cd adore/adore_cli
git checkout master
git pull
cd ..
make cli

After this the baseline scenario should run as you previously proposed with:

make cli
cd adore_scenarios
roslaunch baseline_test.launch

Also before starting the cli make sure all docker containers are stopped:

make stop_adore_cli
make cli

I'm here if you have questions or need further support.

@akoerner1 Thank you for such a swift response!

On a clean install, after checking out master in adore_cli as you suggested in your comment, I still got ERROR exporting to image in the plotlabserver submodule build process.

However I did try

cd plotlabserver/
git submodule update --init --recursive

And afterwards I was able to make cli and run the baseline test.
Thanks again for the swift response and the support.
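
The build succeeding after a recursive submodule update inside plotlabserver suggests, though the thread does not confirm it, that nested submodules had not been checked out. A minimal sketch for spotting that situation, relying on the "-" prefix that git submodule status prints for submodules that are not initialized (the helper name uninitialized_submodules is hypothetical):

```shell
# Hypothetical helper: read `git submodule status --recursive` output on stdin
# and print the paths of submodules that are not initialized ("-" prefix).
uninitialized_submodules() {
    awk '/^-/ { print $2 }'
}

# Usage (not executed here), from the top of the adore checkout:
#   git submodule status --recursive | uninitialized_submodules
```

An empty result means every submodule that git can see has been initialized; it says nothing about whether the checked-out commits are the expected ones.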

P.S. It seems to me that, when running baseline_test in a new cli after closing it, 2 plotlabserver_build:0fad7e5 containers and 1 adore_if_ros:3d04171 container are created and not cleaned up. I guess it's only an inconvenience, but I now have around 5 of those container triplets lying around. Or is this expected behavior because no one should run that test so often?

@Themaksiest

Or is this an expected behavior because no one should run that test so often?...

This is definitely a bug. Depending on how the process terminates in the context of docker compose, containers can be left orphaned, especially when starting them in detached mode with the docker compose up --detach flag.

This problem is magnified significantly with the previous issue introduced by the docker update (see: docker/compose#11544) making composition of services a real challenge. The adore cli and plotlabserver now need to be started in two separate docker compose contexts instead of one context - it feels like playing whack-a-mole. I will try to address this issue. Thank you for providing feedback on your experience; again it is greatly appreciated!
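
One possible contributing factor, visible in the build output quoted earlier in this thread, is the docker cp $(docker create --rm ...) pattern: the --rm flag removes a container when it exits, and a container that is only created and never started does not exit. This is an observation from the quoted output, not a confirmed diagnosis. A minimal sketch of an alternative that removes the temporary container explicitly (the helper name copy_from_image is hypothetical and not part of the ADORe makefiles):

```shell
# Hypothetical helper: copy a path out of an image via a temporary container,
# then remove that container explicitly instead of relying on --rm.
copy_from_image() {
    image="$1"
    src_path="$2"
    dest="$3"
    cid=$(docker create "$image") || return 1
    docker cp "$cid:$src_path" "$dest"
    status=$?
    # The container was never started, so it has to be removed by hand.
    docker rm "$cid" >/dev/null
    return "$status"
}

# Usage (not executed here), mirroring the copy step from the build log:
#   copy_from_image adore_if_ros:3d04171 /tmp/adore_if_ros/adore_if_ros/build adore_if_ros
```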