
bug: bootstrap failed with timeout 5s: cannot connect to buildkitd in version 0.3.36

Are you using the envd server?

  • Yes, I am using the envd server.
  • No, I am not using the envd server.

Describe the bug

Bootstrap fails with the same error as the closed issue #709, now on the latest version (0.3.36). I've tried

docker rm envd_buildkitd

and retried the bootstrap, but was told that the container doesn't exist.

To Reproduce

rika@gult:~$ envd --debug bootstrap --dockerhub-mirror https://docker.mirrors.sjtug.sjtu.edu.cn > ~/envd.bootstrap.fail.log
DEBU[2023-08-03T16:43:51+08:00] /home/rika/.config/envd/id_rsa_envd.pub already present
DEBU[2023-08-03T16:43:51+08:00] /home/rika/.config/envd/id_rsa_envd already present
DEBU[2023-08-03T16:43:51+08:00] home manager initialized cache-dir=/home/rika/.cache/envd cache-map="map[oh-my-zsh:true]" cache-status=/home/rika/.cache/envd/cache.status config-file=/home/rika/.config/envd/config.envd context="{default [{default docker-container envd_buildkitd docker }]}" context-file=/home/rika/.config/envd/contexts
DEBU[2023-08-03T16:43:51+08:00] telemetry initialization UID=b3cd4db1-1ff1-4cfd-829b-fdcc99f6b5b8
DEBU[2023-08-03T16:43:51+08:00] sending telemetry
INFO[2023-08-03T16:43:51+08:00] [1/5] Bootstrap SSH Key
DEBU[2023-08-03T16:43:51+08:00] /home/rika/.config/envd/id_rsa_envd.pub already present
DEBU[2023-08-03T16:43:51+08:00] /home/rika/.config/envd/id_rsa_envd already present
INFO[2023-08-03T16:43:51+08:00] [2/5] Bootstrap registry CA keypair
INFO[2023-08-03T16:43:51+08:00] [3/5] Bootstrap registry json config
INFO[2023-08-03T16:43:51+08:00] [4/5] Bootstrap autocomplete
INFO[2023-08-03T16:43:51+08:00] Install bash autocompletion
WARN[2023-08-03T16:43:51+08:00] Warning: failed writing to /usr/share/bash-completion/completions/envd: open /usr/share/bash-completion/completions/envd: permission denied
INFO[2023-08-03T16:43:51+08:00] You may have to restart your shell for autocomplete to get initialized (e.g. run "exec $SHELL")
INFO[2023-08-03T16:43:51+08:00] [5/5] Bootstrap buildkit
DEBU[2023-08-03T16:43:51+08:00] bootstrap the buildkitd container
DEBU[2023-08-03T16:43:51+08:00] commandconn: starting docker with [exec -i envd_buildkitd buildctl dial-stdio]
DEBU[2023-08-03T16:43:51+08:00] starting buildkitd buildkit-config="&{[{docker.io false https://docker.mirrors.sjtug.sjtu.edu.cn}]}" container=envd_buildkitd tag="docker.io/moby/buildkit:v0.10.6"
DEBU[2023-08-03T16:43:51+08:00] commandconn (docker):Error response from daemon: No such container: envd_buildkitd
DEBU[2023-08-03T16:43:52+08:00] container is running, check if it's ready at docker-container://envd_buildkitd... container=envd_buildkitd driver=docker-container image="docker.io/moby/buildkit:v0.10.6" socket=envd_buildkitd
DEBU[2023-08-03T16:43:52+08:00] waiting to connect to buildkitd container=envd_buildkitd driver=docker-container image="docker.io/moby/buildkit:v0.10.6" socket=envd_buildkitd
DEBU[2023-08-03T16:43:53+08:00] commandconn: starting docker with [exec -i envd_buildkitd buildctl dial-stdio]
DEBU[2023-08-03T16:43:53+08:00] commandconn (docker):Error response from daemon: No such container: envd_buildkitd
DEBU[2023-08-03T16:43:53+08:00] failed to connect to buildkitd: failed to list workers: Unavailable: connection error: desc = "error reading server preface: command [docker exec -i envd_buildkitd buildctl dial-stdio] has exited with exit status 1, please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=Error response from daemon: No such container: envd_buildkitd\n"
DEBU[2023-08-03T16:43:54+08:00] failed to connect to buildkitd: failed to list workers: Unavailable: connection error: desc = "error reading server preface: command [docker exec -i envd_buildkitd buildctl dial-stdio] has exited with exit status 1, please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=Error response from daemon: No such container: envd_buildkitd\n"
DEBU[2023-08-03T16:43:55+08:00] commandconn: starting docker with [exec -i envd_buildkitd buildctl dial-stdio]
DEBU[2023-08-03T16:43:55+08:00] commandconn (docker):Error response from daemon: No such container: envd_buildkitd
DEBU[2023-08-03T16:43:55+08:00] failed to connect to buildkitd: failed to list workers: Unavailable: connection error: desc = "error reading server preface: command [docker exec -i envd_buildkitd buildctl dial-stdio] has exited with exit status 1, please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=Error response from daemon: No such container: envd_buildkitd\n"
DEBU[2023-08-03T16:43:56+08:00] failed to connect to buildkitd: failed to list workers: Unavailable: connection error: desc = "error reading server preface: command [docker exec -i envd_buildkitd buildctl dial-stdio] has exited with exit status 1, please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=Error response from daemon: No such container: envd_buildkitd\n"
error: failed to create buildkit client: failed to bootstrap the buildkitd: failed to connect to buildkitd docker-container://envd_buildkitd: timeout 5s: cannot connect to buildkitd
(1) attached stack trace
-- stack trace:
| github.com/tensorchord/envd/pkg/app.buildkit
| /home/runner/work/envd/envd/pkg/app/bootstrap.go:417
| github.com/tensorchord/envd/pkg/app.bootstrap
| /home/runner/work/envd/envd/pkg/app/bootstrap.go:114
| github.com/urfave/cli/v2.(*Command).Run
| /home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/command.go:274
| github.com/urfave/cli/v2.(*Command).Run
| /home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/command.go:267
| github.com/urfave/cli/v2.(*App).RunContext
| /home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/app.go:332
| github.com/urfave/cli/v2.(*App).Run
| /home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/app.go:309
| main.run
| /home/runner/work/envd/envd/cmd/envd/main.go:39
| main.main
| /home/runner/work/envd/envd/cmd/envd/main.go:67
| runtime.main
| /opt/hostedtoolcache/go/1.19.10/x64/src/runtime/proc.go:250
Wraps: (2) failed to create buildkit client
Wraps: (3) attached stack trace
-- stack trace:
| github.com/tensorchord/envd/pkg/buildkitd.NewClient
| /home/runner/work/envd/envd/pkg/buildkitd/buildkitd.go:135
| github.com/tensorchord/envd/pkg/app.buildkit
| /home/runner/work/envd/envd/pkg/app/bootstrap.go:414
| github.com/tensorchord/envd/pkg/app.bootstrap
| /home/runner/work/envd/envd/pkg/app/bootstrap.go:114
| github.com/urfave/cli/v2.(*Command).Run
| /home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/command.go:274
| github.com/urfave/cli/v2.(*Command).Run
| /home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/command.go:267
| github.com/urfave/cli/v2.(*App).RunContext
| /home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/app.go:332
| github.com/urfave/cli/v2.(*App).Run
| /home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/app.go:309
| main.run
| /home/runner/work/envd/envd/cmd/envd/main.go:39
| main.main
| /home/runner/work/envd/envd/cmd/envd/main.go:67
| runtime.main
| /opt/hostedtoolcache/go/1.19.10/x64/src/runtime/proc.go:250
Wraps: (4) failed to bootstrap the buildkitd
Wraps: (5) attached stack trace
-- stack trace:
| github.com/tensorchord/envd/pkg/buildkitd.(*generalClient).maybeStart
| /home/runner/work/envd/envd/pkg/buildkitd/buildkitd.go:182
| [...repeated from below...]
Wraps: (6) failed to connect to buildkitd docker-container://envd_buildkitd
Wraps: (7) attached stack trace
-- stack trace:
| github.com/tensorchord/envd/pkg/buildkitd.generalClient.waitUntilConnected
| /home/runner/work/envd/envd/pkg/buildkitd/buildkitd.go:213
| github.com/tensorchord/envd/pkg/buildkitd.(*generalClient).maybeStart
| /home/runner/work/envd/envd/pkg/buildkitd/buildkitd.go:181
| github.com/tensorchord/envd/pkg/buildkitd.(*generalClient).Bootstrap
| /home/runner/work/envd/envd/pkg/buildkitd/buildkitd.go:142
| github.com/tensorchord/envd/pkg/buildkitd.NewClient
| /home/runner/work/envd/envd/pkg/buildkitd/buildkitd.go:134
| github.com/tensorchord/envd/pkg/app.buildkit
| /home/runner/work/envd/envd/pkg/app/bootstrap.go:414
| github.com/tensorchord/envd/pkg/app.bootstrap
| /home/runner/work/envd/envd/pkg/app/bootstrap.go:114
| github.com/urfave/cli/v2.(*Command).Run
| /home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/command.go:274
| github.com/urfave/cli/v2.(*Command).Run
| /home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/command.go:267
| github.com/urfave/cli/v2.(*App).RunContext
| /home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/app.go:332
| github.com/urfave/cli/v2.(*App).Run
| /home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/app.go:309
| main.run
| /home/runner/work/envd/envd/cmd/envd/main.go:39
| main.main
| /home/runner/work/envd/envd/cmd/envd/main.go:67
| runtime.main
| /opt/hostedtoolcache/go/1.19.10/x64/src/runtime/proc.go:250
| runtime.goexit
| /opt/hostedtoolcache/go/1.19.10/x64/src/runtime/asm_amd64.s:1594
Wraps: (8) timeout 5s: cannot connect to buildkitd
Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.withPrefix (7) *withstack.withStack (8) *errutil.leafError
error: timeout 5s: cannot connect to buildkitd

Expected behavior

The docker info output

rika@gult:~$ docker info
Client: Docker Engine - Community
Version: 24.0.2
Context: desktop-linux
Debug Mode: false
buildx: Docker Buildx (Docker Inc.)
Version: v0.10.5
Path: /usr/libexec/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.13.0
Path: /usr/lib/docker/cli-plugins/docker-compose
dev: Docker Dev Environments (Docker Inc.)
Version: v0.0.5
Path: /usr/lib/docker/cli-plugins/docker-dev
extension: Manages Docker extensions (Docker Inc.)
Version: v0.2.16
Path: /usr/lib/docker/cli-plugins/docker-extension
sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
Version: 0.6.0
Path: /usr/lib/docker/cli-plugins/docker-sbom
scan: Docker Scan (Docker Inc.)
Version: v0.22.0
Path: /usr/lib/docker/cli-plugins/docker-scan

Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 1
Server Version: 20.10.21
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 2
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc io.containerd.runc.v2 io.containerd.runtime.v1.linux
Default Runtime: runc
Init Binary: docker-init
containerd version: 770bd0108c32f3fb5c73ae1264f7e503fe7b2661
runc version: v1.1.4-0-g5fd4c4d
init version: de40ad0
Security Options:
Profile: default
Kernel Version: 5.15.49-linuxkit
Operating System: Docker Desktop
OSType: linux
Architecture: x86_64
CPUs: 24
Total Memory: 7.67GiB
Name: docker-desktop
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
No Proxy: hubproxy.docker.internal
Experimental: false
Insecure Registries:
Live Restore Enabled: false

The envd version output

rika@gult:~$ envd version --detail
envd: v0.3.36
BuildDate: 2023-07-18T15:36:15Z
GitCommit: 854798c
GitTreeState: clean
GitTag: v0.3.36
GoVersion: go1.19.10
Compiler: gc
Platform: linux/amd64
OSType: linux
OSVersion: 22.04
KernelVersion: 5.19.0-50-generic
DockerHostVersion: 24.0.2
ContainerRuntimes: [io.containerd.runc.v2,runc]
DefaultRuntime: runc

Additional context

/assign @kemingy

Is it related to the SJTUG mirror?

Nope, I ran envd bootstrap without any extra arguments on the first attempt and got the same error.

I'm not able to reproduce the error. Can you check the log in the buildkitd container?

Not sure if it's related to the HTTP proxy config.

No container can be found in my Docker Desktop. I'm afraid envd failed to create the envd_buildkitd container. Below is the output of docker ps; the only container on this PC is the one envd created 7 months ago when I first tried to install it. By the way, how can I check whether this is related to the HTTP proxy config?

a116f6b27925 envd-quick-start:dev "horust" 7 months ago Up 22 hours>2222/tcp,>8888/tcp envd-quick-start

The current buildkitd container is started with something like:

docker run --rm --name envd_buildkitd --privileged -v $HOME/.config/envd:/etc/registry docker.io/moby/buildkit:v0.10.6 --config /etc/registry/buildkitd.toml

The $HOME/.config/envd/buildkitd.toml is generated when you run envd bootstrap.

Can you try to run this command directly?
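To tell "the container never starts" apart from "the container starts but buildkitd never answers", a readiness check along these lines may help. This is only a sketch, not envd's actual implementation: wait_for_buildkitd is a made-up helper name, the 5-second default simply mirrors the timeout in the error message, and it assumes the buildctl binary inside the moby/buildkit image (the same binary the debug log shows envd calling through docker exec).

```shell
#!/bin/sh
# Sketch: poll until buildkitd inside the container answers a "list workers" request.
# CONTAINER is the name from the logs; wait_for_buildkitd is a hypothetical helper.
CONTAINER="${CONTAINER:-envd_buildkitd}"

wait_for_buildkitd() {
    timeout_s="${1:-5}"   # seconds to wait, defaulting to the 5s seen in the error
    elapsed=0
    while [ "$elapsed" -lt "$timeout_s" ]; do
        # "buildctl debug workers" only succeeds once buildkitd is up and reachable.
        if docker exec "$CONTAINER" buildctl debug workers >/dev/null 2>&1; then
            echo "buildkitd is ready after ${elapsed}s"
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "timeout ${timeout_s}s: cannot connect to buildkitd" >&2
    # Show the tail of the container log, if the container exists at all.
    docker logs --tail 20 "$CONTAINER" >&2 || true
    return 1
}
```

With the container started in the background (the docker run command above with -d added), running wait_for_buildkitd 5 either confirms that buildkitd is reachable or prints the last lines of the container log, which is the log asked for earlier in this thread.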

rika@gult:~$ sudo docker run --rm --name envd_buildkitd --privileged -v $HOME/.config/envd:/etc/registry docker.io/moby/buildkit:v0.10.6 --config /etc/registry/buildkitd.toml

docker: Error response from daemon: Conflict. The container name "/envd_buildkitd" is already in use by container "3323cd6af59ac1e9a1df40d98e9ac0304b4ea67108fe115aee337077ef81812f". You have to remove (or rename) that container to be able to reuse that name.

Run docker rm envd_buildkitd to remove the old container. Use docker ps -a (plain docker ps lists only running containers) to check whether there are any other legacy buildkitd containers.
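One detail in the outputs above may matter for this cleanup, although nothing in this thread confirms it: docker info reports Context: desktop-linux with server version 20.10.21, while envd version reports DockerHostVersion 24.0.2, and the earlier docker rm was run without sudo while the conflicting docker run used sudo. If two Docker engines are in play, the stale container may live in a different context from the one plain docker talks to. The sketch below looks for the container in every context known to the CLI; the helper names are made up for this example.

```shell
#!/bin/sh
# Sketch: find and remove a leftover buildkitd container in every Docker context.
# CONTAINER is the name from the logs; find_stale and remove_stale_everywhere
# are hypothetical helper names.
CONTAINER="${CONTAINER:-envd_buildkitd}"

# Print "ID STATUS" for containers whose name contains $CONTAINER, running or stopped.
find_stale() {
    # $1: docker context name
    docker --context "$1" ps -a --filter "name=${CONTAINER}" --format '{{.ID}} {{.Status}}'
}

# Walk all contexts and force-remove the container wherever it is found.
remove_stale_everywhere() {
    for ctx in $(docker context ls -q); do
        found=$(find_stale "$ctx")
        if [ -n "$found" ]; then
            echo "context $ctx: removing $CONTAINER ($found)"
            docker --context "$ctx" rm -f "$CONTAINER"
        else
            echo "context $ctx: no $CONTAINER container"
        fi
    done
}
```

Calling remove_stale_everywhere and then rerunning envd bootstrap would be the natural next step. Note that sudo docker reads root's configuration rather than the user's, so the same check may need to be repeated under sudo if the default context is only reachable as root.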