gravitational/gravity

Docker client does not respect environment variables upon instantiation for tele build

pmikulskis opened this issue · 1 comment

Description

We are attempting to use Gravity to package a Kubernetes application on CircleCI, with tele build running in a container that uses CircleCI's remote Docker setup.
When CircleCI spins up the remote Docker environment, the environment variables needed to connect to the Docker daemon are made available, so that docker CLI commands can reach the daemon from the build container.
The environment variables are described in Docker's docs; briefly, they are:

DOCKER_HOST
DOCKER_MACHINE_NAME
DOCKER_TLS_VERIFY
NO_PROXY

Once we copy the certificates into the container running the Docker CLI (to verify TLS to the Docker daemon) and set the environment variables accordingly, we can run a command like docker ps and get output: the connection is established, and the Docker CLI talks TLS over TCP instead of using /var/run/docker.sock, since DOCKER_TLS_VERIFY is set to 1.

However, when we run tele build, we get an error saying the client cannot connect to the daemon.
We were initially unsure whether this was an environment issue with the container or a networking problem, but the Go code for creating a Docker client indicates that the client should honor these variables.
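For reference, this is roughly the behaviour we expected. A minimal sketch using the upstream github.com/docker/docker/client package (illustrative only, not Gravity's actual code): client.FromEnv configures the client from DOCKER_HOST, DOCKER_TLS_VERIFY, DOCKER_CERT_PATH and DOCKER_API_VERSION, falling back to the local socket only when DOCKER_HOST is unset.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/docker/docker/client"
)

func main() {
	// Build the client from the DOCKER_* environment variables,
	// the same way the docker CLI does in CircleCI's remote Docker setup.
	cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// Ping the daemon over whatever endpoint the environment points at.
	if _, err := cli.Ping(context.Background()); err != nil {
		log.Fatalf("cannot reach daemon at %s: %v", cli.DaemonHost(), err)
	}
	fmt.Println("connected to daemon at", cli.DaemonHost())
}
```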

A team member found that the Docker daemon communication channel is actually hard-coded as /var/run/docker.sock.

This prevents us from running tele build in a containerized fashion, as presented in the docs, in an environment where the Docker CLI and daemon are separated.
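As far as we can tell, the difference comes down to how the Docker client is constructed. A hedged sketch of the two constructions, again using the upstream Go client rather than Gravity's actual code:

```go
package main

import (
	"fmt"
	"log"

	"github.com/docker/docker/client"
)

func main() {
	// Hard-coded: always targets the local unix socket, regardless of DOCKER_HOST.
	// In CircleCI's remote Docker setup there is no local socket, so requests made
	// through this client fail with "dial unix ...: no such file or directory",
	// matching the log output below.
	hardcoded, err := client.NewClientWithOpts(client.WithHost("unix:///var/run/docker.sock"))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("hard-coded client targets:", hardcoded.DaemonHost())

	// Environment-aware: honors DOCKER_HOST, DOCKER_TLS_VERIFY and DOCKER_CERT_PATH,
	// so it targets the remote tcp:// endpoint that docker ps already uses.
	fromEnv, err := client.NewClientWithOpts(client.FromEnv)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("env-aware client targets:", fromEnv.DaemonHost())
}
```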

What happened:
When we run tele build, the process exits immediately with an error reporting either that it cannot access the Docker daemon on /var/run/docker.sock or that Docker is not installed on the machine.

What you expected to happen:
tele build should honor the same DOCKER_* environment variables that the docker CLI does and connect to the daemon over TCP, rather than assuming /var/run/docker.sock.

How to reproduce it (as minimally and precisely as possible):
Run tele build on any manifest, with the Docker daemon reachable over TCP rather than via /var/run/docker.sock. Even when the correct environment variables are set, and even when commands like docker ps work on the same system, tele does not build its Docker client from the environment and still attempts to talk to the daemon on /var/run/docker.sock.
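If it helps, a quick way to check the expected behaviour outside of tele (a hypothetical verification program, assuming DOCKER_HOST is exported the way CircleCI sets it): a client built from the environment should report the tcp:// endpoint, not the unix socket.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"strings"

	"github.com/docker/docker/client"
)

func main() {
	fmt.Println("DOCKER_HOST =", os.Getenv("DOCKER_HOST"))

	cli, err := client.NewClientWithOpts(client.FromEnv)
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// With DOCKER_HOST pointing at a tcp:// address, an environment-aware
	// client should never fall back to the local socket.
	host := cli.DaemonHost()
	fmt.Println("client would dial:", host)
	if strings.HasPrefix(host, "unix://") {
		log.Fatal("client fell back to the local socket despite DOCKER_HOST being set")
	}
}
```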

Environment

  • Gravity version: 6.1.47
  • OS: debian-stretch:slim
  • Platform: CircleCI

Relevant Debug Logs If Applicable
The builder is not using the correct Docker client:

INFO [BUILDER]   Using package cache from /tmp/tele-cache. builder/builder.go:417
DEBU [LOCAL]     Creating local environment. args:{/tmp/tele-cache /tmp/tele-cache false false false 0s 0s <nil> {[] 0} false} localenv/localenv.go:153
ERRO             Get http://unix.sock/version: dial unix /run/docker.sock: connect: no such file or directory builder/build.go:131
ERRO             Command failed. error:[
ERROR REPORT:
Original Error: *trace.BadParameterError docker is not running on this machine, please install it (https://docs.docker.com/engine/installation/) and make sure it can be used by a non-root user (https://docs.docker.com/engine/installation/linux/linux-postinstall/)
Stack Trace:
        /gopath/src/github.com/gravitational/gravity/lib/builder/build.go:132 github.com/gravitational/gravity/lib/builder.checkBuildEnv
        /gopath/src/github.com/gravitational/gravity/lib/builder/build.go:37 github.com/gravitational/gravity/lib/builder.Build
        /gopath/src/github.com/gravitational/gravity/tool/tele/cli/build.go:67 github.com/gravitational/gravity/tool/tele/cli.build
        /gopath/src/github.com/gravitational/gravity/tool/tele/cli/run.go:54 github.com/gravitational/gravity/tool/tele/cli.Run
        /gopath/src/github.com/gravitational/gravity/tool/tele/main.go:44 main.run
        /gopath/src/github.com/gravitational/gravity/tool/tele/main.go:35 main.main
        /go/src/runtime/proc.go:200 runtime.main
        /go/src/runtime/asm_amd64.s:1337 runtime.goexit

Are you restricted to using 6.1.x? If not, 9.0.0-beta.1 should have a fix for this.