Issues running a Rancher container inside a Sysbox container
sanzenwin opened this issue · 10 comments
docker run --runtime=sysbox-runc --name=test -it nestybox/ubuntu-bionic-systemd-docker:latest
docker exec -it test bash
# Inside the Sysbox container
docker run --privileged rancher/rancher
ERROR: Rancher must be ran with the --privileged flag when running outside of Kubernetes
@sanzenwin, unfortunately Sysbox doesn't support Rancher/K3s yet, but this is something that we will be adding fairly soon.
Can you please let me know what's the use-case that you have in mind for Sysbox? Maybe we can offer you a workaround.
@rodnymolina, Rancher is easier to deploy than K8s, so I'm trying to test it on a single machine. I will focus on your other project:
https://github.com/nestybox/kindbox
Thanks @sanzenwin for reporting the issue.
docker run --privileged rancher/rancher
ERROR: Rancher must be ran with the --privileged flag when running outside of Kubernetes
It seems the rancher container entrypoint is looking for the presence of /dev/kmsg and it's not finding it:
if [ ! -e /run/secrets/kubernetes.io/serviceaccount ] && [ ! -e /dev/kmsg ]; then
    echo "ERROR: Rancher must be ran with the --privileged flag when running outside of Kubernetes"
    exit 1
fi
It's strange because /dev/kmsg is exposed inside the parent Sysbox container:
sysbox-container: /# ls -l /dev/kmsg
crw-rw-rw- 1 nobody nogroup 1, 3 Apr 21 19:25 /dev/kmsg
Thus we would expect a privileged container launched inside the Sysbox container to also expose that device, but it does not:
sysbox-container: /# docker run --privileged ubuntu:18.04 ls -l /dev/kmsg
ls: cannot access '/dev/kmsg': No such file or directory
We need to dig into why that is the case. I suspect the Docker instance running inside the Sysbox container did not like the "nobody:nogroup" on /dev/kmsg and as a result did not pass it into the inner Rancher container.
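For what it's worth, a quick way to confirm where the "nobody:nogroup" comes from (a minimal sketch, assuming a shell inside the Sysbox container): host IDs that fall outside the container's user-namespace mapping show up as the overflow IDs, i.e., nobody:nogroup.
sysbox-container: /# ls -ln /dev/kmsg                            # numeric IDs; 65534:65534 is the overflow "nobody:nogroup"
sysbox-container: /# cat /proc/self/uid_map /proc/self/gid_map   # the container's user-namespace ID mappings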
Fortunately, it's easy to work around this by passing the device into the container explicitly with --device:
sysbox-container: /# docker run --privileged --device /dev/kmsg:/dev/kmsg -it rancher/rancher
That causes the rancher container to initialize. I am not familiar with Rancher (yet) so I can't tell if it initializes correctly, but it appears it did.
Please give that a try and let us know.
Thanks!
Good idea!
I see the k3s control-plane coming up, but there are a few errors being dumped by rancher, so I'm not sure how reliable this will be until we fully test it in our setups.
@streamnsight, let us know how it goes with Cesar's workaround.
root@441df534ab82:/var/lib/rancher# k3s kubectl get all --all-namespaces
NAMESPACE       NAME                                    READY   STATUS      RESTARTS   AGE
cattle-system   pod/helm-operation-7trtw                0/2     Completed   0          17m
cattle-system   pod/helm-operation-wlbm2                0/2     Completed   0          16m
fleet-system    pod/fleet-agent-66c54576c6-5gtqh        1/1     Running     0          12m
fleet-system    pod/fleet-controller-78b7d7d9cf-rddlw   1/1     Running     0          15m
fleet-system    pod/gitjob-6d5565ffb-jthn5              1/1     Running     0          15m
kube-system     pod/coredns-7944c66d8d-bf284            1/1     Running     0          17m

NAMESPACE      NAME                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                  AGE
default        service/kubernetes   ClusterIP   10.43.0.1      <none>        443/TCP                  17m
fleet-system   service/gitjob       ClusterIP   10.43.229.23   <none>        80/TCP                   15m
kube-system    service/kube-dns     ClusterIP   10.43.0.10     <none>        53/UDP,53/TCP,9153/TCP   17m

NAMESPACE      NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
fleet-system   deployment.apps/fleet-agent        1/1     1            1           14m
fleet-system   deployment.apps/fleet-controller   1/1     1            1           15m
fleet-system   deployment.apps/gitjob             1/1     1            1           15m
kube-system    deployment.apps/coredns            1/1     1            1           17m

NAMESPACE      NAME                                          DESIRED   CURRENT   READY   AGE
fleet-system   replicaset.apps/fleet-agent-5f8bc46697        0         0         0       14m
fleet-system   replicaset.apps/fleet-agent-66c54576c6        1         1         1       12m
fleet-system   replicaset.apps/fleet-controller-78b7d7d9cf   1         1         1       15m
fleet-system   replicaset.apps/gitjob-6d5565ffb              1         1         1       15m
kube-system    replicaset.apps/coredns-7944c66d8d            1         1         1       17m
root@441df534ab82:/var/lib/rancher#
#sysbox-container:
docker run --privileged -p 80:80 -p 443:443 --device /dev/kmsg:/dev/kmsg -it rancher/rancher
#host machine:
curl 192.168.11.101 # the IP of host machine
#curl: (7) Failed to connect to 192.168.11.101 port 80: Connection refused
#docker:dind container:
docker run --privileged -p 80:80 -p 443:443 -it rancher/rancher
#host machine:
curl 192.168.11.101 # the IP of host machine
#<a href="https://192.168.11.101/">Found</a>.
So there is a networking issue here: the published ports are reachable from the host under docker:dind, but not under Sysbox.
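One thing worth checking (an assumption on my part, not verified here): the inner -p 80:80 only publishes the ports on the Sysbox container's own network namespace, so for the host to reach Rancher, the outer Sysbox container has to publish those ports as well:
#host machine: a sketch, publishing the ports on the outer container too
docker run --runtime=sysbox-runc --name=test -d -p 80:80 -p 443:443 nestybox/ubuntu-bionic-systemd-docker:latest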
I am trying to deploy Rancher on a single node, build cluster environments for a dev server, a testing server, and so on, and finally deploy it to production (single-node or multi-node). docker:dind is suggested for test environments only, so I want to use Sysbox and deploy it to production.
My steps:
- Build a sysbox-container as master, run Rancher in it; Rancher will build a default cluster.
- Build a sysbox-container as node, run Rancher-Agent in it, and build a new cluster (see the sketch after this list).
- Repeat the second step to build more clusters.
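A rough sketch of those steps (all names and tokens below are placeholders, not a verified recipe; the real agent registration command is generated by the Rancher UI):
#host machine: one Sysbox container per Rancher node
docker run --runtime=sysbox-runc -d --name=rancher-master nestybox/ubuntu-bionic-systemd-docker:latest
docker run --runtime=sysbox-runc -d --name=rancher-node1 nestybox/ubuntu-bionic-systemd-docker:latest
#inside rancher-master: run the Rancher server with the /dev/kmsg workaround
docker run --privileged --device /dev/kmsg:/dev/kmsg -p 80:80 -p 443:443 -d rancher/rancher
#inside rancher-node1: register an agent (<server-url> and <token> are placeholders)
docker run --privileged -d rancher/rancher-agent --server https://<server-url> --token <token> --worker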
Can you offer me a workaround?
We also would love to have support for this. Is there any progress on this issue?
Also, we found that the rancher image fails to extract under the Docker instance running inside a Sysbox container:
inside sysbox container > docker run --privileged --device /dev/kmsg:/dev/kmsg -it rancher/rancher:v2.6-head
Unable to find image 'rancher/rancher:v2.6-head' locally
v2.6-head: Pulling from rancher/rancher
fa7b56d5c338: Pull complete
831c06a19f1c: Pull complete
9b07d273a2f4: Pull complete
5d7ac9c67454: Pull complete
4fede13eeff9: Pull complete
1ead93fe9b8f: Pull complete
06d4f82e466f: Pull complete
c545b5ac0e22: Pull complete
32c21992ee2f: Pull complete
b00b142b2e37: Pull complete
1a688fd93915: Pull complete
da7f4e0805f8: Extracting [==================================================>] 10.4MB/10.4MB
b860e69a3c7f: Download complete
3fb82d2cef36: Download complete
2810732276c3: Download complete
da305018a6f1: Download complete
8ccee01d29a6: Download complete
209521d1a443: Download complete
a14c7eef703a: Download complete
58be7072bca2: Download complete
docker: failed to register layer: ApplyLayer exit status 1 stdout: stderr: lchown /usr/bin/etcdctl: invalid argument.
See 'docker run --help'.
It pulls and starts rancher with a dind setup.
Some context: we're running a Sysbox 0.5.x setup. All files within /var/lib/docker are accessible and all belong to normal users:
inside sysbox container > sudo find /var/lib/docker | xargs -n 128 sudo ls -la | grep nobody | less
# empty
inside sysbox container > sudo find /var/lib/docker -user 65534
# empty
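One plausible explanation for the lchown failure (an assumption, not confirmed): inside a user namespace, lchown returns EINVAL when a layer contains a file whose UID/GID falls outside the container's ID mapping, which the nobody/65534 checks above wouldn't catch. The mapping size can be inspected with:
inside sysbox container > cat /proc/self/uid_map
# e.g. "0 165536 65536" maps only 65536 IDs; a layer that assigns a higher
# UID/GID to /usr/bin/etcdctl would make lchown fail with "invalid argument"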
@aisbaa, I don't remember seeing this error in this context (I just reproduced it), so it must be something new that we will need to look into.
Having said that, what's the use-case that you have in mind? Do you need the rancher-server to operate within a Sysbox container, or would it suffice to have any of its components (e.g., k3s, rke, rke2)? I'm asking because the latter should work fine.
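For example, k3s can be installed directly inside the Sysbox container with the upstream installer (a minimal sketch, assuming the standard get.k3s.io script):
sysbox-container: /# curl -sfL https://get.k3s.io | sh -
sysbox-container: /# k3s kubectl get nodes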
Having said that, what's the use-case that you have in mind?
We're using Kubernetes pods as development environments for our engineers; we call those devpods. Currently we're using k3d as the development environment for Kubernetes.
Do you need the rancher-server to operate within a Sysbox container, or would it suffice to have any of its components (e.g., k3s, rke, rke2)? I'm asking because the latter should work fine.
The end goal is to find a working configuration for k3d or another tool that can run Kubernetes inside Docker. I tried running the default k3d configuration and it failed due to open /dev/kmsg: no such file or directory:
devpod> k3d version
k3d version v5.4.3
k3s version v1.23.6-k3s1 (default)
devpod> k3d cluster create mycluster
...
devpod> docker ps -a
CONTAINER ID   IMAGE                            COMMAND                  CREATED          STATUS                          PORTS                             NAMES
0c8675d3f3c9   ghcr.io/k3d-io/k3d-proxy:5.4.3   "/bin/sh -c nginx-pr…"   12 minutes ago   Up 12 minutes                   80/tcp, 0.0.0.0:45659->6443/tcp   k3d-mycluster-serverlb
422fc2cf02ab   rancher/k3s:v1.23.6-k3s1         "/bin/k3s server --t…"   12 minutes ago   Restarting (1) 12 seconds ago                                     k3d-mycluster-server-0
devpod> docker logs -f k3d-mycluster-server-0 2>&1 | tail
...
I0808 18:30:10.230478 32 apiserver.go:42] "Waiting for node sync before watching apiserver pods"
E0808 18:30:10.230535 32 kubelet.go:496] "Failed to create an oomWatcher (running in UserNS, Hint: enable KubeletInUserNamespace feature flag to ignore the error)" err="open /dev/kmsg: no such file or directory"
E0808 18:30:10.230556 32 server.go:298] "Failed to run kubelet" err="failed to run Kubelet: failed to create kubelet: open /dev/kmsg: no such file or directory"
E0808 18:30:10.230855 32 node.go:152] Failed to retrieve node info: nodes "k3d-mycluster-server-0" not found
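The kubelet log itself hints at a possible workaround; an untested sketch that sets the suggested feature gate via k3d's --k3s-arg flag (k3d v5 node-filter syntax):
devpod> k3d cluster create mycluster --k3s-arg '--kubelet-arg=feature-gates=KubeletInUserNamespace=true@server:*'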
Having said that, I've noticed that Sysbox should support k0s, which I don't recall whether we evaluated. So we might be able to swap k3d for k0s.
P.S. Sorry for conflating the docker pull issue with the /dev/kmsg one.
P.P.S. I've only tried the Community Edition of Sysbox.