kubernetes-retired/kube-mesos-framework

Empty ServiceAccount volumes when using *mesos* provider

kuba-- opened this issue · 47 comments

This issue is similar to: kubernetes/kubernetes#31062

PLATFORMS e.g. 'uname -a':
Linux 3.10.0-327.10.1.el7.x86_64 #1 SMP Tue Feb 16 17:03:50 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

DCOS 1.7

# docker version
Client:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        
 OS/Arch:      linux/amd64
# kubectl version
Client Version: version.Info{Major:"1", Minor:"5+", GitVersion:"v1.5.0-alpha.0.127+ab0b937c2efabe", GitCommit:"ab0b937c2efabedbb401753c8f232a14790af131", GitTreeState:"clean", BuildDate:"2016-09-05T13:40:17Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}

Server Version: version.Info{Major:"", Minor:"", GitVersion:"v0.0.0-master+$Format:%h$", GitCommit:"$Format:%H$", GitTreeState:"not a git tree", BuildDate:"1970-01-01T00:00:00Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}

COMMANDS OR DAEMONS: -- list all related components

km apiserver \
  --address=${KUBERNETES_MASTER_IP} \
  --etcd-servers=http://${KUBERNETES_MASTER_IP}:4001 \
  --service-cluster-ip-range=10.10.10.0/24 \
  --port=8888 \
  --cloud-provider=mesos \
  --cloud-config=mesos-cloud.conf \
  --secure-port=0 \
  --service-account-key-file=/tmp/ca.key \
  --client-ca-file=/tmp/ca.crt \
  --tls-private-key-file=/tmp/ca.crt \
  --admission-control=ServiceAccount,DefaultStorageClass,AlwaysAdmit \
  --v=1
./km controller-manager \
  --master=${KUBERNETES_MASTER_IP}:8888 \
  --cloud-provider=mesos \
  --cloud-config=mesos-cloud.conf  \
  --service-account-private-key-file=/tmp/ca.key \
  --root-ca-file=/tmp/ca.crt \
  --v=1
km scheduler \
  --address=${KUBERNETES_MASTER_IP} \
  --mesos-master=${MESOS_MASTER} \
  --etcd-servers=http://${KUBERNETES_MASTER_IP}:4001 \
  --mesos-user=root \
  --api-servers=${KUBERNETES_MASTER_IP}:8888 \
  --cluster-dns=${KUBERNETES_MASTER_IP} \
  --cluster-domain=cluster.local \
  --v=1

DESCRIPTION: -- symptom of the problem a customer would see
The directory `/var/run/secrets/kubernetes.io/serviceaccount/` inside a container is empty

even if docker inspect shows:

        "Mounts": [
            {
                "Source": "/var/lib/mesos/slave/slaves/daf2b55c-2f31-4bae-a702-9341b9b86b04-S0/frameworks/daf2b55c-2f31-4bae-a702-9341b9b86b04-0000/executors/ad6d20ef43cf1f6a_k8sm-executor/runs/26e93fbd-194e-4d1e-a6aa-54c9145bab8d/pods
/4493a04f-9463-11e6-ba22-0e6b70457fa1/volumes/kubernetes.io~secret/default-token-dunvp",
                "Destination": "/var/run/secrets/kubernetes.io/serviceaccount",
                "Mode": "ro",
                "RW": false,
                "Propagation": "rprivate"
            },

IMPACT: -- impact of problem in customer env (best/worse case scenarios)

Every pod which uses the k8s API will fail/crash, because InClusterConfig (./pkg/client/restclient/config.go) checks whether the token and ca.crt files exist:

// InClusterConfig returns a config object which uses the service account
// kubernetes gives to pods. It's intended for clients that expect to be
// running inside a pod running on kubernetes. It will return an error if
// called from a process not running in a kubernetes environment.
func InClusterConfig() (*Config, error)
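
To illustrate the failure mode, here is a minimal sketch (my own illustration, not the actual Kubernetes source) of the check that InClusterConfig effectively performs: it reads the token and CA files from the well-known mount path and errors out when they are missing, which is exactly what a pod hits when the serviceaccount volume is empty.

// Sketch only: mimics the in-cluster config precondition check.
package main

import (
	"fmt"
	"io/ioutil"
	"os"
)

const saDir = "/var/run/secrets/kubernetes.io/serviceaccount"

// inClusterCheck mirrors the effective check: the token must be readable and
// non-empty, and the CA certificate must exist, otherwise in-cluster config fails.
func inClusterCheck() error {
	token, err := ioutil.ReadFile(saDir + "/token")
	if err != nil || len(token) == 0 {
		return fmt.Errorf("unable to load in-cluster token: %v", err)
	}
	if _, err := os.Stat(saDir + "/ca.crt"); err != nil {
		return fmt.Errorf("unable to load in-cluster CA certificate: %v", err)
	}
	return nil
}

func main() {
	if err := inClusterCheck(); err != nil {
		fmt.Println("in-cluster config would fail:", err) // what pods see on an empty volume
		os.Exit(1)
	}
	fmt.Println("serviceaccount volume is populated")
}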

EXPECTED BEHAVIOR:
The volume /var/run/secrets/kubernetes.io/serviceaccount/ should be mounted inside the container and contain the secret files (ServiceAccountTokenKey, ServiceAccountRootCAKey).

HOW TO REPRODUCE/DETAILS OF PROBLEM: -- full details of the problem
Run Kubernetes as a Mesos framework in a DCOS 1.7+ environment (most likely it is also reproducible with pure Mesos).

Similar issue happened for openshift:
openshift/origin#10215

If I run everything on one machine (Mesos master/slave, etcd, k8s api server, controller manager, scheduler) then I can see the token and cert files on the machine, but inside the Docker container the volume /var/run/secrets/kubernetes.io/serviceaccount/ is empty.

here are more details:

# docker ps
CONTAINER ID        IMAGE                                      COMMAND                  CREATED             STATUS              PORTS                                                           NAMES
3c0942299925        nginx                                      "nginx -g 'daemon off"   7 minutes ago       Up 7 minutes                                                                        k8s_nginx.974a593a_nginx_default_13e8b3ab-9ad1-11e6-bd83-06d4c0c42259_a8ec1ceb
483bd6115dfb        gcr.io/google_containers/pause-amd64:3.0   "/pause"                 8 minutes ago       Up 7 minutes        0.0.0.0:31000->80/tcp                                           k8s_POD.16fd2d02_nginx_default_13e8b3ab-9ad1-11e6-bd83-06d4c0c42259_276b11cc
9fb3c083038c        quay.io/coreos/etcd:v2.2.1                 "/etcd --listen-clien"   11 minutes ago      Up 11 minutes       0.0.0.0:4001->4001/tcp, 2379-2380/tcp, 0.0.0.0:7001->7001/tcp   etcd

# docker exec -it 3c0942299925 ls /var/run/secrets/kubernetes.io/serviceaccount

# docker inspect 3c0942299925 | grep serviceaccount
                "/var/lib/mesos/slaves/1e6f5e13-416f-4e57-85e5-488b2f21745d-S0/frameworks/1e6f5e13-416f-4e57-85e5-488b2f21745d-0000/executors/42956d5dfd21f9f3_k8sm-executor/runs/6f229d11-f96e-4cab-b3ac-bff7e23a36d0/pods/13e8b3ab-9ad1-11e6-bd83-06d4c0c42259/volumes/kubernetes.io~secret/default-token-hgyqt:/var/run/secrets/kubernetes.io/serviceaccount:ro",
                "Destination": "/var/run/secrets/kubernetes.io/serviceaccount",

# ls -l /var/lib/mesos/slaves/1e6f5e13-416f-4e57-85e5-488b2f21745d-S0/frameworks/1e6f5e13-416f-4e57-85e5-488b2f21745d-0000/executors/42956d5dfd21f9f3_k8sm-executor/runs/6f229d11-f96e-4cab-b3ac-bff7e23a36d0/pods/13e8b3ab-9ad1-11e6-bd83-06d4c0c42259/volumes/kubernetes.io~secret/default-token-hgyqt
total 0
lrwxrwxrwx 1 root root 13 Oct 25 16:35 ca.crt -> ..data/ca.crt
lrwxrwxrwx 1 root root 16 Oct 25 16:35 namespace -> ..data/namespace
lrwxrwxrwx 1 root root 12 Oct 25 16:35 token -> ..data/token


 "Mounts": [
            {
                "Source": "/var/lib/mesos/slaves/1e6f5e13-416f-4e57-85e5-488b2f21745d-S0/frameworks/1e6f5e13-416f-4e57-85e5-488b2f21745d-0000/executors/42956d5dfd21f9f3_k8sm-executor/runs/6f229d11-f96e-4cab-b3ac-bff7e23a36d0/pods/e4ca7621-9ad3-11e6-ae2c-06d4c0c42259/volumes/kubernetes.io~secret/default-token-hgyqt",
                "Destination": "/var/run/secrets/kubernetes.io/serviceaccount",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            },
k82cn commented

@kuba-- , interesting finding :).

On my "all in one" machine - Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-45-generic x86_64), where I have:

  • Docker version 1.12.1, build 23cf638
  • mesos 1.0.1
  • etcd 3.0.9
  • zookeeper 3.4.8-1

with the following environment variables exported:

export KUBERNETES_MASTER_IP=$(hostname -i)
export KUBERNETES_MASTER=http://${KUBERNETES_MASTER_IP}:8888
export MESOS_MASTER=zk://$(hostname -i):2181/mesos
export PATH="$(pwd):$PATH"

I run:

docker run -d --hostname $(uname -n) --name etcd \
  -p 4001:4001 -p 7001:7001 "quay.io/coreos/etcd:v3.0.9" \
  etcd --listen-client-urls http://0.0.0.0:4001 \
  --advertise-client-urls http://${KUBERNETES_MASTER_IP}:4001

km apiserver \
  --address=${KUBERNETES_MASTER_IP} \
  --etcd-servers=http://${KUBERNETES_MASTER_IP}:4001 \
  --service-cluster-ip-range=10.10.10.0/24 \
  --port=8888 \
  --cloud-provider=mesos \
  --cloud-config=mesos-cloud.conf \
  --secure-port=0 \
  --service-account-key-file=/tmp/ca.key \
  --client-ca-file=/tmp/ca.crt \
  --tls-private-key-file=/tmp/ca.crt \
  --admission-control=ServiceAccount,DefaultStorageClass,AlwaysAdmit \
  --service-account-lookup=false \
  --v=1

km controller-manager \
  --master=${KUBERNETES_MASTER_IP}:8888 \
  --cloud-provider=mesos \
  --cloud-config=mesos-cloud.conf  \
  --service-account-private-key-file=/tmp/ca.key \
  --root-ca-file=/tmp/ca.crt \
  --v=1

km scheduler \
  --address=${KUBERNETES_MASTER_IP} \
  --mesos-master=${MESOS_MASTER} \
  --etcd-servers=http://${KUBERNETES_MASTER_IP}:4001 \
  --mesos-user=root \
  --api-servers=${KUBERNETES_MASTER_IP}:8888 \
  --cluster-dns=${KUBERNETES_MASTER_IP} \
  --cluster-domain=cluster.local \
  --v=1

everything looks fine!

I can see the shared secrets on the host and inside the running container:

:~# kubectl get all --show-all
NAME                 CLUSTER-IP     EXTERNAL-IP   PORT(S)     AGE
svc/k8sm-scheduler   10.10.10.122   <none>        10251/TCP   44m
svc/kubernetes       10.10.10.1     <none>        443/TCP     2h

NAME       READY     STATUS    RESTARTS   AGE
po/nginx   1/1       Running   0          3m

:~# docker ps
CONTAINER ID        IMAGE                                      COMMAND                  CREATED             STATUS              PORTS                                                           NAMES
60d64f4ec043        nginx                                      "nginx -g 'daemon off"   3 minutes ago       Up 3 minutes                                                                        k8s_nginx.1a4979b6_nginx_default_66b1b884-9f79-11e6-a266-063daa2fc083_e818564a
5faf08f6178d        gcr.io/google_containers/pause-amd64:3.0   "/pause"                 3 minutes ago       Up 3 minutes        0.0.0.0:31000->80/tcp                                           k8s_POD.1b2d2c7a_nginx_default_66b1b884-9f79-11e6-a266-063daa2fc083_7c70f987
187f1f959502        quay.io/coreos/etcd:v3.0.9                 "etcd --listen-client"   2 hours ago         Up 2 hours          0.0.0.0:4001->4001/tcp, 2379-2380/tcp, 0.0.0.0:7001->7001/tcp   etcd

:~# docker exec -it 60d64f4ec043 ls -la /var/run/secrets/kubernetes.io/serviceaccount
total 4
drwxrwxrwt 3 root root  140 Oct 31 14:50 .
drwxr-xr-x 3 root root 4096 Oct 31 14:50 ..
drwxr-xr-x 2 root root  100 Oct 31 14:50 ..109810_31_10_14_50_44.750139816
lrwxrwxrwx 1 root root   33 Oct 31 14:50 ..data -> ..109810_31_10_14_50_44.750139816
lrwxrwxrwx 1 root root   13 Oct 31 14:50 ca.crt -> ..data/ca.crt
lrwxrwxrwx 1 root root   16 Oct 31 14:50 namespace -> ..data/namespace
lrwxrwxrwx 1 root root   12 Oct 31 14:50 token -> ..data/token

I have a Mesos cluster that has this exact problem running Docker 1.12, Kubernetes 1.4 and Mesos 1.0.1 with both CentOS 7 and Ubuntu hosts. I can see that @kuba appears to find that everything is fine, but this is not the case at all on our cluster, and I cannot see what @kuba did to get the serviceaccount directory populated.

The Docker config.v2.json file shows that the serviceaccount ought to be populated, as are some other mounts, but while the other mounts are populated the serviceaccount is not. I wonder whether this is a permission issue, as the populated mounts are rw while the serviceaccount is ro - just a thought, as I have not looked at the source code.

Who is actually working on this? It looks like @k82cn, but even though @kuba seems to suggest it is working, I would suggest that this is still a P1 bug as it is not possible to use Dashboard, DNS etc.

@robin-hunter I also have a simple, tiny CentOS cluster on AWS. My instances are based on dcos-centos7-201607062103 (ami-05c9236a).
On my master I ran:

  • zookeeper:
/usr/lib/jvm/java-openjdk/bin/java -Dzookeeper.datadir.autocreate=false -Dzookeeper.log.dir=/var/log/zookeeper -Dzookeeper.root.logger=INFO,ROLLINGFILE -cp /usr/lib/zookeeper/bin/../build/classes:/usr/lib/zookeeper/bin/../build/lib/*.jar:/usr/lib/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/lib/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/lib/zookeeper/bin/../lib/netty-3.2.2.Final.jar:/usr/lib/zookeeper/bin/../lib/log4j-1.2.15.jar:/usr/lib/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/lib/zookeeper/bin/../zookeeper-3.4.5-cdh4.7.1.jar:/usr/lib/zookeeper/bin/../src/java/lib/*.jar:/etc/zookeeper/conf::/etc/zookeeper/conf:/usr/lib/zookeeper/*:/usr/lib/zookeeper/lib/* -Dzookeeper.log.threshold=INFO -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false org.apache.zookeeper.server.quorum.QuorumPeerMain /etc/zookeeper/conf/zoo.cfg
  • mesos-master
/usr/sbin/mesos-master --zk=zk://localhost:2181/mesos --port=5050 --log_dir=/var/log/mesos --quorum=1 --work_dir=/var/lib/mesos
  • etcd
docker run -d --hostname $(uname -n) --name etcd \
  -p 4001:4001 -p 7001:7001 "quay.io/coreos/etcd:v3.0.9" \
  etcd --listen-client-urls http://0.0.0.0:4001 \
  --advertise-client-urls http://${KUBERNETES_MASTER_IP}:4001
  • km-apiserver:
km apiserver \
  --address=${KUBERNETES_MASTER_IP} \
  --etcd-servers=http://${KUBERNETES_MASTER_IP}:4001 \
  --service-cluster-ip-range=10.10.10.0/24 \
  --port=8888 \
  --cloud-provider=mesos \
  --cloud-config=mesos-cloud.conf \
  --secure-port=0 \
  --service-account-key-file=ca.key \
  --client-ca-file=ca.crt \
  --tls-private-key-file=ca.crt \
  --admission-control=ServiceAccount,DefaultStorageClass,AlwaysAdmit \
  --service-account-lookup=false \
  --v=1
  • km-controller-manager:
km controller-manager \
  --master=${KUBERNETES_MASTER_IP}:8888 \
  --cloud-provider=mesos \
  --cloud-config=mesos-cloud.conf  \
  --service-account-private-key-file=ca.key \
  --root-ca-file=ca.crt \
  --v=1
  • km-scheduler:
km scheduler \
  --address=${KUBERNETES_MASTER_IP} \
  --mesos-master=${MESOS_MASTER} \
  --etcd-servers=http://${KUBERNETES_MASTER_IP}:4001 \
  --mesos-user=root \
  --api-servers=${KUBERNETES_MASTER_IP}:8888 \
  --cluster-dns=${KUBERNETES_MASTER_IP} \
  --cluster-domain=cluster.local \
  --v=1

where mesos-cloud.conf is:

[mesos-cloud]
        mesos-master        = zk://localhost:2181/mesos 

I also created the ca.crt and ca.key files and exported the following env vars:

export KUBERNETES_MASTER_IP=$(hostname -i)
export KUBERNETES_MASTER=http://${KUBERNETES_MASTER_IP}:8888
export MESOS_MASTER=zk://$(hostname -i):2181/mesos
export PATH="$(pwd):$PATH"

On slaves I just run:

# ${MESOS_MASTER} should be here private IP of the master
/usr/sbin/mesos-slave --master=${MESOS_MASTER} --log_dir=/var/log/mesos --work_dir=/var/lib/mesos

On all nodes I run docker service:

service docker start

Last but not least, my security group lets all instances in the same security group communicate over almost all ports.

hope that helps

I have no files at all in the serviceaccount directory, and what you do looks similar to what I do, which is:

  • zookeeper (version 3.4.6)
java -Dzookeeper.log.dir=. -Dzookeeper.root.logger=INFO,CONSOLE -cp /opt/zookeeper/bin/../build/classes:/opt/zookeeper/../build/lib/*.jar:/opt/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/opt/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/opt/zookeeper/bin/../lib/netty-3.7.0.Final.jar:/opt/zookeeper/bin/../lib/log4j-1.2.16.jar:/opt/zookeeper/bin/../lib/jline-0.9.94.jar:/opt/zookeeper/bin/../zookeeper-3.4.6.jar:/opt/zookeeper/bin/../src/java/lib/*.jar:/opt/zookeeper/bin/../conf: -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false org.apache.zookeeper.server.quorum.QuorumPeerMain /opt/zookeeper/bin/../conf/zoo.cfg
  • mesos-master (version 1.0.1)
/usr/sbin/mesos-master \
    --cluster=mesos-interkube \
    --log_dir=/var/log/mesos \
    --logging_level=INFO \
    --port=5050 \
    --quorum=1 \
    --work_dir=/tmp/mesos \
    --zk=zk://node6.crml.com:2181/mesos
  • etcd (version 2.0.12)
docker run -d --hostname $(uname -n) \
    --name etcd \
    -p 4001:4001 -p 7001:7001 "quay.io/coreos/etcd:v2.0.12" \
    /etcd --listen-client-urls http://0.0.0.0:4001 \
   --advertise-client-urls http://192.168.1.35:4001
  • km-apiserver
km apiserver \
    --insecure-bind-address=192.168.1.35 \
    --bind-address=192.168.1.35 \
    --etcd-servers=http://192.168.1.35:4001 \
    --service-cluster-ip-range=10.10.10.0/24 \
    --insecure-port=8888 \
    --client-ca-file=/opt/kubernetes/certs/ca.crt \
    --tls-cert-file=/opt/kubernetes/certs/server.crt \
    --tls-private-key-file=/opt/kubernetes/certs/server.key \
    --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \
    --cloud-provider=mesos \
    --cloud-config=/etc/kubernetes-mesos/mesos-cloud.conf
  • km-controller-manager
km controller-manager \
    --master=192.168.1.35:8888 \ 
    --cloud-provider=mesos \
    --cloud-config=/etc/kubernetes-mesos/mesos-cloud.conf \
    --cluster-signing-cert-file=/opt/kubernetes/certs/server.pem \
    --root-ca-file=/opt/kubernetes/certs/ca.crt \
    --service-account-private-key-file=/opt/kubernetes/certs/server.key
  • km-scheduler
km scheduler \
    --address=192.168.1.35 \
    --mesos-master=192.168.1.35:5050 \
    --etcd-servers=http://192.168.1.35:4001 \
    --mesos-user=root \
    --api-servers=192.168.1.35:8888 \
    --cluster-dns=10.10.10.10 \
    --cluster-domain=cluster.local

where mesos cloud config is:

[mesos-cloud]
        mesos-master        = 192.168.1.35:5050
  • kubectl version
Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.1+33cf7b9", GitCommit:"33cf7b9acbb2cb7c9c72a10d6636321fb180b159", GitTreeState:"not a git tree", BuildDate:"2016-10-20T18:37:49Z", GoVersion:"go1.7.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.1+33cf7b9", GitCommit:"33cf7b9acbb2cb7c9c72a10d6636321fb180b159", GitTreeState:"not a git tree", BuildDate:"2016-10-20T18:37:49Z", GoVersion:"go1.7.1", Compiler:"gc", Platform:"linux/amd64"}

On agent nodes mesos is run using:

/usr/sbin/mesos-slave \
    --cgroups_hierarchy=/sys/fs/cgroup \
    --containerizers=docker,mesos \
    --isolation=posix/cpu,posix/mem \
    --log_dir=/var/log/mesos \
    --logging_level=INFO \
    --master=zk://192.168.1.35:2181/mesos \
    --port=5051 \
    --recover=reconnect \
    --strict \
    --work_dir=/tmp/mesos
  • The Docker config.v2.json file for a test container (which does not manage to get the injected files) is:
{
  "State": {
    "Running": true,
    "Paused": false,
    "Restarting": false,
    "OOMKilled": false,
    "RemovalInProgress": false,
    "Dead": false,
    "Pid": 22642,
    "StartedAt": "2016-11-14T17:18:00.201172809Z",
    "FinishedAt": "0001-01-01T00:00:00Z",
    "Health": null
  },
  "ID": "c382156b214945192ec765e2c6d85704b887885adb9f2e488e6e7de904817f11",
  "Created": "2016-11-14T17:17:59.447781341Z",
  "Managed": false,
  "Path": "sleep",
  "Args": [
    "10000"
  ],
  "Config": {
    "Hostname": "test-701078429-di1pa",
    "Domainname": "",
    "User": "",
    "AttachStdin": false,
    "AttachStdout": false,
    "AttachStderr": false,
    "Tty": false,
    "OpenStdin": false,
    "StdinOnce": false,
    "Env": [
      "MESOS_EXECUTOR_CONTAINER_UUID=ed787748-3c84-4de4-9acc-e8407d4aa526",
      "K8SM_SCHEDULER_SERVICE_HOST=10.10.10.44",
      "KUBERNETES_PORT_443_TCP_PORT=443",
      "KUBERNETES_SERVICE_PORT_HTTPS=443",
      "KUBERNETES_PORT_443_TCP=tcp://10.10.10.1:443",
      "K8SM_SCHEDULER_PORT=tcp://10.10.10.44:10251",
      "K8SM_SCHEDULER_PORT_10251_TCP_PROTO=tcp",
      "K8SM_SCHEDULER_PORT_10251_TCP_PORT=10251",
      "KUBERNETES_SERVICE_HOST=10.10.10.1",
      "KUBERNETES_SERVICE_PORT=443",
      "KUBERNETES_PORT_443_TCP_PROTO=tcp",
      "KUBERNETES_PORT_443_TCP_ADDR=10.10.10.1",
      "K8SM_SCHEDULER_SERVICE_PORT=10251",
      "K8SM_SCHEDULER_PORT_10251_TCP=tcp://10.10.10.44:10251",
      "K8SM_SCHEDULER_PORT_10251_TCP_ADDR=10.10.10.44",
      "KUBERNETES_PORT=tcp://10.10.10.1:443",
      "HOME=/",
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
    ],
    "Cmd": [
      "sleep",
      "10000"
    ],
    "Image": "tutum/curl",
    "Volumes": null,
    "WorkingDir": "",
    "Entrypoint": null,
    "OnBuild": null,
    "Labels": {
      "io.kubernetes.container.hash": "5aff3a15",
      "io.kubernetes.container.name": "test",
      "io.kubernetes.container.restartCount": "0",
      "io.kubernetes.container.terminationMessagePath": "/dev/termination-log",
      "io.kubernetes.pod.name": "test-701078429-di1pa",
      "io.kubernetes.pod.namespace": "default",
      "io.kubernetes.pod.terminationGracePeriod": "30",
      "io.kubernetes.pod.uid": "8545e949-aa8e-11e6-a455-52540089405b"
    }
  },
  "Image": "sha256:01176385d84aeb1d40ed18c6d3f952abf40d2d2b4aa98fcf0a8a4b01010fb9a9",
  "NetworkSettings": {
    "Bridge": "",
    "SandboxID": "",
    "HairpinMode": false,
    "LinkLocalIPv6Address": "",
    "LinkLocalIPv6PrefixLen": 0,
    "Networks": null,
    "Service": null,
    "Ports": null,
    "SandboxKey": "",
    "SecondaryIPAddresses": null,
    "SecondaryIPv6Addresses": null,
    "IsAnonymousEndpoint": false
  },
  "LogPath": "/var/lib/docker/containers/c382156b214945192ec765e2c6d85704b887885adb9f2e488e6e7de904817f11/c382156b214945192ec765e2c6d85704b887885adb9f2e488e6e7de904817f11-json.log",
  "Name": "/k8s_test.5aff3a15_test-701078429-di1pa_default_8545e949-aa8e-11e6-a455-52540089405b_317b59b9",
  "Driver": "devicemapper",
  "MountLabel": "",
  "ProcessLabel": "",
  "RestartCount": 0,
  "HasBeenStartedBefore": false,
  "HasBeenManuallyStopped": false,
  "MountPoints": {
    "/dev/termination-log": {
      "Source": "/tmp/mesos/slaves/9ee5ada4-4891-4c40-876b-88ad39002465-S0/frameworks/4826e827-c905-44e2-a326-4428b21ea866-0000/executors/1869bc29f9302f40_k8sm-executor/runs/ed787748-3c84-4de4-9acc-e8407d4aa526/pods/8545e949-aa8e-11e6-a455-52540089405b/containers/test/317b59b9",
      "Destination": "/dev/termination-log",
      "RW": true,
      "Name": "",
      "Driver": "",
      "Relabel": "",
      "Propagation": "rprivate",
      "Named": false,
      "ID": ""
    },
    "/etc/hosts": {
      "Source": "/tmp/mesos/slaves/9ee5ada4-4891-4c40-876b-88ad39002465-S0/frameworks/4826e827-c905-44e2-a326-4428b21ea866-0000/executors/1869bc29f9302f40_k8sm-executor/runs/ed787748-3c84-4de4-9acc-e8407d4aa526/pods/8545e949-aa8e-11e6-a455-52540089405b/etc-hosts",
      "Destination": "/etc/hosts",
      "RW": true,
      "Name": "",
      "Driver": "",
      "Relabel": "",
      "Propagation": "rprivate",
      "Named": false,
      "ID": ""
    },
    "/var/run/secrets/kubernetes.io/serviceaccount": {
      "Source": "/tmp/mesos/slaves/9ee5ada4-4891-4c40-876b-88ad39002465-S0/frameworks/4826e827-c905-44e2-a326-4428b21ea866-0000/executors/1869bc29f9302f40_k8sm-executor/runs/ed787748-3c84-4de4-9acc-e8407d4aa526/pods/8545e949-aa8e-11e6-a455-52540089405b/volumes/kubernetes.io~secret/default-token-routl",
      "Destination": "/var/run/secrets/kubernetes.io/serviceaccount",
      "RW": false,
      "Name": "",
      "Driver": "",
      "Relabel": "ro",
      "Propagation": "rprivate",
      "Named": false,
      "ID": ""
    }
  },
  "AppArmorProfile": "",
  "HostnamePath": "/var/lib/docker/containers/ff94643e4eb2334ea59baebe3fbf7c02e2db7989bf32301a8d567eeaa8953b71/hostname",
  "HostsPath": "/tmp/mesos/slaves/9ee5ada4-4891-4c40-876b-88ad39002465-S0/frameworks/4826e827-c905-44e2-a326-4428b21ea866-0000/executors/1869bc29f9302f40_k8sm-executor/runs/ed787748-3c84-4de4-9acc-e8407d4aa526/pods/8545e949-aa8e-11e6-a455-52540089405b/etc-hosts",
  "ShmPath": "/var/lib/docker/containers/ff94643e4eb2334ea59baebe3fbf7c02e2db7989bf32301a8d567eeaa8953b71/shm",
  "ResolvConfPath": "/var/lib/docker/containers/ff94643e4eb2334ea59baebe3fbf7c02e2db7989bf32301a8d567eeaa8953b71/resolv.conf",
  "SeccompProfile": "unconfined",
  "NoNewPrivileges": false
}

This is all automated using Chef and the IP addresses have been substituted for env variables for the time being to make testing a bit easier

Does --cluster-dns=10.10.10.10 in km-scheduler work for you?
Can your slaves download the km executable?
Anyway, I used --cluster-dns=${KUBERNETES_MASTER_IP}.
Also I'm not sure whether any of the extra --admission-control options breaks anything here - I just used what I needed: ServiceAccount,DefaultStorageClass,AlwaysAdmit

I have not managed to get the cluster DNS working, as it requires the token in the serviceaccount directory, so I have not yet looked at this option. Why do you think this might work (I will give it a try)?

Yes, km is downloaded by all slaves with no problem.

I have changed the --admission-control options to those you use, but no joy.

#kubectl get pod
NAME                   READY     STATUS    RESTARTS   AGE
test-701078429-zgs5q   1/1       Running   0          2m
#kubectl exec test-701078429-zgs5q  ls /var/run/secrets/kubernetes.io/serviceaccount
#

My DNS didn't work properly (slaves could not download km), so I just switched to using the master as the DNS server, so the master.mesos name is resolvable.

Do you know what software actually injects the config into Docker and sets up the mounts when Kubernetes is running on Mesos? I've been looking through the git repository but have not yet landed on anything that I can inspect to allow me to understand what the process is.

maybe check vendor/k8s.io/kubernetes/pkg/kubelet/dockertools/docker_manager.go for permissions; also check the docker daemon options (storage driver etc. - I use overlay)
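
For orientation, the kubelet's dockertools code is what translates a pod's volume mounts into the bind mounts Docker sees. The sketch below is my own rough illustration of that shape (volumeMount and makeBinds are illustrative names, not the real docker_manager.go identifiers); the point is that the serviceaccount secret directory reaches Docker as an ordinary host-path bind, so a populated host directory that shows up empty in the container points at mount/namespace behaviour rather than at a missing bind.

// Illustrative sketch only - not the actual kubernetes dockertools code.
package main

import "fmt"

// volumeMount is a hypothetical stand-in for the kubelet's internal mount description.
type volumeMount struct {
	HostPath      string
	ContainerPath string
	ReadOnly      bool
}

// makeBinds builds Docker bind strings of the form "hostPath:containerPath[:ro]",
// which is the general shape of what the kubelet hands to the Docker API.
func makeBinds(mounts []volumeMount) []string {
	binds := make([]string, 0, len(mounts))
	for _, m := range mounts {
		bind := fmt.Sprintf("%s:%s", m.HostPath, m.ContainerPath)
		if m.ReadOnly {
			bind += ":ro"
		}
		binds = append(binds, bind)
	}
	return binds
}

func main() {
	// Hypothetical secret-volume mount, mirroring the docker inspect output above.
	binds := makeBinds([]volumeMount{{
		HostPath:      "/tmp/mesos/slaves/S0/pods/POD_UID/volumes/kubernetes.io~secret/default-token-xxxxx",
		ContainerPath: "/var/run/secrets/kubernetes.io/serviceaccount",
		ReadOnly:      true,
	}})
	fmt.Println(binds)
}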

I may have some understanding issues here but......

After creating a simple pod I can go to the node and inspect the config.v2.json file (which I think is created by Docker); it contains the following mount points:

  "MountPoints": {
    "/dev/termination-log": {
      "Source": "/tmp/mesos/slaves/9ee5ada4-4891-4c40-876b-88ad39002465-S0/frameworks/4826e827-c905-44e2-a326-4428b21ea866-0000/executors/1869bc29f9302f40_k8sm-executor/runs/2b1a6431-3f81-4fa0-a3d5-340ac30d3f93/pods/42983f58-ace5-11e6-8b52-52540089405b/containers/test/95a6fcc8",
      "Destination": "/dev/termination-log",
      "RW": true,
      "Name": "",
      "Driver": "",
      "Relabel": "",
      "Propagation": "rprivate",
      "Named": false,
      "ID": ""
    },
    "/etc/hosts": {
      "Source": "/tmp/mesos/slaves/9ee5ada4-4891-4c40-876b-88ad39002465-S0/frameworks/4826e827-c905-44e2-a326-4428b21ea866-0000/executors/1869bc29f9302f40_k8sm-executor/runs/2b1a6431-3f81-4fa0-a3d5-340ac30d3f93/pods/42983f58-ace5-11e6-8b52-52540089405b/etc-hosts",
      "Destination": "/etc/hosts",
      "RW": true,
      "Name": "",
      "Driver": "",
      "Relabel": "",
      "Propagation": "rprivate",
      "Named": false,
      "ID": ""
    },
    "/var/run/secrets/kubernetes.io/serviceaccount": {
      "Source": "/tmp/mesos/slaves/9ee5ada4-4891-4c40-876b-88ad39002465-S0/frameworks/4826e827-c905-44e2-a326-4428b21ea866-0000/executors/1869bc29f9302f40_k8sm-executor/runs/2b1a6431-3f81-4fa0-a3d5-340ac30d3f93/pods/42983f58-ace5-11e6-8b52-52540089405b/volumes/kubernetes.io~secret/default-token-routl",
      "Destination": "/var/run/secrets/kubernetes.io/serviceaccount",
      "RW": false,
      "Name": "",
      "Driver": "",
      "Relabel": "ro",
      "Propagation": "rprivate",
      "Named": false,
      "ID": ""
    }
  }

The /var/run/secrets/kubernetes.io/serviceaccount path is a MountPoint. If I look at the source it does have the files:

# ls -al /tmp/mesos/slaves/9ee5ada4-4891-4c40-876b-88ad39002465-S0/frameworks/4826e827-c905-44e2-a326-4428b21ea866-0000/executors/1869bc29f9302f40_k8sm-executor/runs/2b1a6431-3f81-4fa0-a3d5-340ac30d3f93/pods/42983f58-ace5-11e6-8b52-52540089405b/volumes/kubernetes.io~secret/default-token-routl
total 0
drwxrwxrwt 3 root root 140 Nov 17 16:44 .
drwxr-xr-x 3 root root  32 Nov 17 16:44 ..
drwxr-xr-x 2 root root 100 Nov 17 16:44 ..119811_17_11_16_44_15.931783157
lrwxrwxrwx 1 root root  13 Nov 17 16:44 ca.crt -> ..data/ca.crt
lrwxrwxrwx 1 root root  33 Nov 17 16:44 ..data -> ..119811_17_11_16_44_15.931783157
lrwxrwxrwx 1 root root  16 Nov 17 16:44 namespace -> ..data/namespace
lrwxrwxrwx 1 root root  12 Nov 17 16:44 token -> ..data/token

However these are not available in the destination of the mount.

yeah, docker inspect always shows the correct mount points, but the problem is in the synchronization (or an unmounted volume). btw, what's your docker version and linux kernel?

Docker version 1.12.1, build 23cf638

We have more than one type of Linux node, covering CentOS 7 and Ubuntu 14, and all do the same thing, so I doubt it's a Linux kernel related issue.

Do you use any public AMI? Maybe I can try to set it up and play a little bit.

Sorry but no

The pod executor.log is being filled up with the following message:

I1119 07:39:38.312297   31246 operation_executor.go:802] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/42983f58-ace5-11e6-8b52-52540089405b-default-token-routl" (spec.Name: "default-token-routl") pod "42983f58-ace5-11e6-8b52-52540089405b" (UID: "42983f58-ace5-11e6-8b52-52540089405b").

With this occurring at about 30 second intervals.

The km process started the executor with the following parameters

executor --api-servers=192.168.1.35:8888 --v=0 --allow-privileged=false --suicide-timeout=20m0s --mesos-launch-grace-period=5m0s --cadvisor-port=4194 --sync-frequency=10s --enable-debugging-handlers=true --cluster-dns=10.10.10.10 --cluster-domain=cluster.local --hostname-override=node1.crml.com --kubelet-cgroups= --cgroup-root=/mesos/2b1a6431-3f81-4fa0-a3d5-340ac30d3f93 --housekeeping_interval=10s --global_housekeeping_interval=1m0s

yeah, generally it mounts and unmounts...
but even when I changed the k8s code (not to remount) it didn't work for me on DCOS, while on my tiny cluster it works without any changes. btw, do you use kubernetes from this repo?

it has been built from the kubernetes github at version

Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.1+33cf7b9", GitCommit:"33cf7b9acbb2cb7c9c72a10d6636321fb180b159", GitTreeState:"not a git tree", BuildDate:"2016-10-20T18:37:49Z", GoVersion:"go1.7.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.1+33cf7b9", GitCommit:"33cf7b9acbb2cb7c9c72a10d6636321fb180b159", GitTreeState:"not a git tree", BuildDate:"2016-10-20T18:37:49Z", GoVersion:"go1.7.1", Compiler:"gc", Platform:"linux/amd64"}

I've switched to this repo. check it out

hmm that's weird, I've just set up golang and a simple make did the job.

On Nov 20, 2016 8:26 AM, "robin-hunter" notifications@github.com wrote:

The repo will not build as the make has too many errors that include

net(.text): direct call too far -8874093, should build with -gcflags -largemodel
os/user(.text): direct call too far -8874700, should build with -gcflags -largemodel
/usr/local/go_k8s_patched/pkg/tool/linux_amd64/link: too many errors

I have not yet looked into resolving this, but it ought to be corrected in the repo.



I used make release to get all the executables. With just make I get km, apiserver, controller and scheduler, but no kubectl or, perhaps more importantly, kubelet (hence why I tried make release).

I have used these four executables in place of the main kubernetes branch version, but there is no difference, which is perhaps what I might have expected without a new kubelet.

k82cn commented

I used make release to get all the executables. With just make I get km, apiserver, controller and scheduler, but no kubectl or, perhaps more importantly, kubelet (hence why I tried make release).

That's the expected result; kubectl can be obtained from upstream, and the kubelet is built into km.

I now have everything built and configured as intended, but the serviceaccount contents are still not being populated as expected.

so maybe check how your docker daemon was started?

service docker start

what parameters, storage driver, etc.:
ps auwx | grep docker

root       888  0.9  0.7 843916 57688 ?        Ssl  Oct01 718:37 dockerd --pidfile=/var/run/docker.pid
root      1001  0.0  0.1 334268 11144 ?        Ssl  Oct01  18:15 docker-containerd -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --shim docker-containerd-shim --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --runtime docker-runc

The executor log now contains an error, with these messages being written continuously:

E1120 18:32:18.839268   28182 pod_workers.go:184] Error syncing pod 2dff4d7d-ae59-11e6-8b52-52540089405b, skipping: failed to "StartContainer" for "kube-state-metrics" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-state-metrics pod=kube-state-metrics-deployment-830856031-yl5mm_monitoring(2dff4d7d-ae59-11e6-8b52-52540089405b)"
I1120 18:32:21.588285   28182 reconciler.go:299] MountVolume operation started for volume "kubernetes.io/configmap/2d384586-ae59-11e6-8b52-52540089405b-config-volume" (spec.Name: "config-volume") to pod "2d384586-ae59-11e6-8b52-52540089405b" (UID: "2d384586-ae59-11e6-8b52-52540089405b"). Volume is already mounted to pod, but remount was requested.
I1120 18:32:21.588419   28182 reconciler.go:299] MountVolume operation started for volume "kubernetes.io/secret/2d384586-ae59-11e6-8b52-52540089405b-default-token-zspt3" (spec.Name: "default-token-zspt3") to pod "2d384586-ae59-11e6-8b52-52540089405b" (UID: "2d384586-ae59-11e6-8b52-52540089405b"). Volume is already mounted to pod, but remount was requested.
I1120 18:32:21.594160   28182 operation_executor.go:802] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/2d384586-ae59-11e6-8b52-52540089405b-default-token-zspt3" (spec.Name: "default-token-zspt3") pod "2d384586-ae59-11e6-8b52-52540089405b" (UID: "2d384586-ae59-11e6-8b52-52540089405b").
I1120 18:32:21.624636   28182 operation_executor.go:802] MountVolume.SetUp succeeded for volume "kubernetes.io/configmap/2d384586-ae59-11e6-8b52-52540089405b-config-volume" (spec.Name: "config-volume") pod "2d384586-ae59-11e6-8b52-52540089405b" (UID: "2d384586-ae59-11e6-8b52-52540089405b").
I1120 18:32:24.593624   28182 reconciler.go:299] MountVolume operation started for volume "kubernetes.io/secret/a57866d1-af4e-11e6-8853-52540089405b-default-token-routl" (spec.Name: "default-token-routl") to pod "a57866d1-af4e-11e6-8853-52540089405b" (UID: "a57866d1-af4e-11e6-8853-52540089405b"). Volume is already mounted to pod, but remount was requested.
I1120 18:32:24.595629   28182 operation_executor.go:802] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/a57866d1-af4e-11e6-8853-52540089405b-default-token-routl" (spec.Name: "default-token-routl") pod "a57866d1-af4e-11e6-8853-52540089405b" (UID: "a57866d1-af4e-11e6-8853-52540089405b").
I1120 18:32:30.603537   28182 reconciler.go:299] MountVolume operation started for volume "kubernetes.io/secret/2dff4d7d-ae59-11e6-8b52-52540089405b-default-token-zspt3" (spec.Name: "default-token-zspt3") to pod "2dff4d7d-ae59-11e6-8b52-52540089405b" (UID: "2dff4d7d-ae59-11e6-8b52-52540089405b"). Volume is already mounted to pod, but remount was requested.
I1120 18:32:30.605496   28182 operation_executor.go:802] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/2dff4d7d-ae59-11e6-8b52-52540089405b-default-token-zspt3" (spec.Name: "default-token-zspt3") pod "2dff4d7d-ae59-11e6-8b52-52540089405b" (UID: "2dff4d7d-ae59-11e6-8b52-52540089405b").
I1120 18:32:30.839091   28182 docker_manager.go:2443] checking backoff for container "kube-state-metrics" in pod "kube-state-metrics-deployment-830856031-yl5mm"
I1120 18:32:30.839173   28182 docker_manager.go:2457] Back-off 5m0s restarting failed container=kube-state-metrics pod=kube-state-metrics-deployment-830856031-yl5mm_monitoring(2dff4d7d-ae59-11e6-8b52-52540089405b)

It does look like the establishment of the container is incorrect, but I have still not found where in the code the container is instantiated.

Yeah, this is normal. I saw that many times. Moreover, I've modified the k8s code to avoid remounts, but on DCOS it didn't help.

Maybe it's an issue with docker's storage driver + filesystem.


k82cn commented

@kuba-- / @robin-hunter , how about your output of kubectl get secrets?

# kubectl get secrets
NAME                  TYPE                                  DATA      AGE
default-token-routl   kubernetes.io/service-account-token   3         13d
# kubectl describe secrets
Name:           default-token-routl
Namespace:      default
Labels:         <none>
Annotations:    kubernetes.io/service-account.name=default
                kubernetes.io/service-account.uid=41459909-93cc-11e6-95e1-52540089405b

Type:   kubernetes.io/service-account-token

Data
====
ca.crt:         1094 bytes
namespace:      7 bytes
token:          eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJkZWZhdWx0Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6ImRlZmF1bHQtdG9rZW4tcm91dGwiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGVmYXVsdCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjQxNDU5OTA5LTkzY2MtMTFlNi05NWUxLTUyNTQwMDg5NDA1YiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpkZWZhdWx0OmRlZmF1bHQifQ.CFpaCH0ZThwaG8gcClZUVfTm-SCx3DLotv9sfYM-oy0fdPRi-IkRJVLb98RsJg8W11yFF1956gC3kFvA9YUi5swO9XMrdKdAn1mY6091zTE2gAPY5UwoLDT79QdWVm74CHQTh9sI0plF86zTxSUf1JLcqbOuarbvmbp-6voq8EfMZBCmfrUKMnANbiB09hBAVjp-yUlM3jn2Yyfpc_Bqi0yQKNlDI8oRYbavogz5Wfl4ef8GgfR4tIH4hzCOCSfTieVGCmRR_WCTjjRHJWb0xRaLGeWaabvKVVXgp-1dPAEVZdTdOySaPWoBgy-Zd_oUL1A_PIBudmQXGNqX2LXTpg

Of course, secrets are logical objects, so they will exist in memory even if you do not create any physical pod.

k82cn commented

@kuba-- / @robin-hunter , it seems to work in my test environment; I used a docker-compose environment, please refer here for the details.

root@a4d01210a8c2:/# docker ps
CONTAINER ID        IMAGE                                      COMMAND                  CREATED             STATUS              PORTS               NAMES
ebf27cf974f6        nginx                                      "nginx -g 'daemon off"   5 minutes ago       Up 5 minutes                            k8s_mynginx.ebb36d1_mynginx-1358273036-wki8i_default_4e86614c-b04c-11e6-add3-0242ac160005_ecacc0b9
865a7b8d1f9e        gcr.io/google_containers/pause-amd64:3.0   "/pause"                 6 minutes ago       Up 6 minutes                            k8s_POD.12240a74_mynginx-1358273036-wki8i_default_4e86614c-b04c-11e6-add3-0242ac160005_15c258c6
root@a4d01210a8c2:/# docker exec ebf27cf974f6 ls /var/run/secrets/kubernetes.io/serviceaccount
ca.crt
namespace
token
root@a4d01210a8c2:/#

@k82cn - do you have DCOS environment/configuration where it works?
@robin-hunter - does it finally work for you?

k82cn commented

@kuba-- , I do not have a DCOS env; let me have a try later.

Sorry, holidays - I can't get at the docker-compose setup (404) so I cannot try. This is not on DCOS but on Apache Mesos version 1.0.1.

@xinxian0458 (Jie Zhang) gave me a hint yesterday. In his environment (DCOS) he modified MESOS_ISOLATION in the /opt/mesosphere/etc/mesos-slave-common file.
Basically he removed filesystem/linux and restarted Mesos.
I didn't try it yet, but hopefully it can fix DC/OS.

I am not using DCOS (Mesosphere). The value being used currently is MESOS_ISOLATION=cgroups/cpu,cgroups/mem, with no filesystem/linux component.

For me personally it works on DC/OS 1.8 after I modified MESOS_ISOLATION (removing filesystem/linux) in /opt/mesosphere/etc/mesos-slave-common and /opt/mesosphere/etc/mesos-slave-common-extras.

I'm using Apache Mesos 1.0.1 and the filesystem/linux value is not included as part of the MESOS_ISOLATION option either as a command line argument or as part of a config file.

The fact that it works with DCOS and not Apache Mesos may suggest this is down to how the Docker container is configured, but I have still not located where this is being done so that I can look at the code. I don't have any debug environment, so I rely on reading the code. Does anybody have any suggestions as to where to look? This would allow me to modify my 'local' copy of the code base and add logging etc. to help isolate it.

I have been digging around in the Mesos 1.1.0 code and have found that a filesystem isolator is always forced on (defaulting to filesystem/posix when none is specified).

mesos.INFO log file output

I1214 06:23:12.064366  8160 containerizer.cpp:200] Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni

also the containerizer code

  // One and only one filesystem isolator is required. The filesystem
  // isolator is responsible for preparing the filesystems for
  // containers (e.g., prepare filesystem roots, volumes, etc.). If
  // the user does not specify one, 'filesystem/posix' (or
  // 'filesystem/windows' on Windows) will be used
  //
  // TODO(jieyu): Check that only one filesystem isolator is used.
  if (!strings::contains(flags_.isolation, "filesystem/")) {
#ifndef __WINDOWS__
    flags_.isolation += ",filesystem/posix";
#else
    flags_.isolation += ",filesystem/windows";
#endif // !__WINDOWS__
  }

This may be the underlying cause, but I am not sure what the filesystem isolation is doing or why it might be the cause of the problem.

k82cn commented

@robin-hunter , great job :). I think we can post an email to the Mesos community.

jdef commented

I have still not managed to isolate this problem and have looked at many aspects of the stack. In our environment we have both Ubuntu 15 and CentOS 7 nodes, although I have been mainly using CentOS 7, which is why I have only recently found that on Ubuntu 15 nodes we do get the serviceaccount files:

# kubectl get po
NAME                    READY     STATUS    RESTARTS   AGE
test-1288018797-tf9h9   1/1       Running   2          14h

# kubectl exec test-1288018797-tf9h9 ls /var/run/secrets/kubernetes.io/serviceaccount
ca.crt
namespace
token

These files are not populated on CentOS 7, and I am currently looking into Docker mount propagation (rprivate vs rshared), as a similar problem has been reported and not yet resolved on OpenShift, which incidentally will be running RHEL.
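
A quick way to see the propagation mode in practice is to look at /proc/self/mountinfo from inside the affected container: each line may carry optional fields such as shared:N or master:N, and a line with neither is private. The probe below is my own small sketch for that check (it is not part of Kubernetes or Mesos); it just prints the mountinfo entry for the serviceaccount mount.

// Sketch only: prints the /proc/self/mountinfo entry for the serviceaccount mount
// so its propagation flags (shared:N / master:N, or none for private) can be inspected.
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	f, err := os.Open("/proc/self/mountinfo")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read mountinfo:", err)
		os.Exit(1)
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		line := scanner.Text()
		if strings.Contains(line, "kubernetes.io/serviceaccount") {
			fmt.Println(line)
		}
	}
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}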

The main difference between Ubuntu and CentOS is the GraphDriver (used to map files), which is AUFS on Ubuntu and devicemapper on CentOS. These two mechanisms are fundamentally different.

The problem does appear to be tied to the underlying operating system that Docker is running on, and not the filesystem that Docker is using. I still get no files in the serviceaccount directory on CentOS 7 running btrfs as the GraphDriver:

        "GraphDriver": {
            "Name": "btrfs",
            "Data": null
        },
        "Mounts": [
            {
                "Source": "/tmp/mesos/slaves/54a9af5f-50ab-4b16-9671-deb970959c09-S0/frameworks/4826e827-c905-44e2-a326-4428b21ea866-0000/executors/1869bc29f9302f40_k8sm-executor/runs/541037c4-b673-474f-ad7a-402e6f3f8b5e/pods/7f5cea54-ce61-11e6-99c7-52540089405b/volumes/kubernetes.io~secret/default-token-routl",
                "Destination": "/var/run/secrets/kubernetes.io/serviceaccount",
                "Mode": "ro",
                "RW": false,
                "Propagation": "rprivate"
            },
            {
                "Source": "/tmp/mesos/slaves/54a9af5f-50ab-4b16-9671-deb970959c09-S0/frameworks/4826e827-c905-44e2-a326-4428b21ea866-0000/executors/1869bc29f9302f40_k8sm-executor/runs/541037c4-b673-474f-ad7a-402e6f3f8b5e/pods/7f5cea54-ce61-11e6-99c7-52540089405b/etc-hosts",
                "Destination": "/etc/hosts",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            },
            {
                "Source": "/tmp/mesos/slaves/54a9af5f-50ab-4b16-9671-deb970959c09-S0/frameworks/4826e827-c905-44e2-a326-4428b21ea866-0000/executors/1869bc29f9302f40_k8sm-executor/runs/541037c4-b673-474f-ad7a-402e6f3f8b5e/pods/7f5cea54-ce61-11e6-99c7-52540089405b/containers/test/c492c7b3",
                "Destination": "/dev/termination-log",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            }