rimusz/coreos-kubernetes-cluster-osx

error when trying to load fleet files

jchauncey opened this issue · 16 comments

Installing k8s files to master and nodes:
Warning: Permanently added '[127.0.0.1]:2222' (RSA) to the list of known hosts.
master.tgz                                                       100%   23MB  23.0MB/s   00:00
Warning: Permanently added '[127.0.0.1]:2200' (RSA) to the list of known hosts.
nodes.tgz                                                        100%   16MB  16.1MB/s   00:00
Warning: Permanently added '[127.0.0.1]:2201' (RSA) to the list of known hosts.
nodes.tgz                                                        100%   16MB  16.1MB/s   00:00
Done installing ...

Installing fleet units:
Unable to initialize client: failed initializing SSH client: timed out while initiating SSH connection

Waiting for Kubernetes cluster to be ready. This can take a few minutes...
error: couldn't read version from server: Get http://172.17.15.101:8080/api: dial tcp 172.17.15.101:8080: connection refused

I think part of the problem is that I have functions that set FLEETCTL_TUNNEL to other hosts, and they get loaded when I start zsh. So there is probably interference there.

@jchauncey Yes, that might be your issue, but the zsh shell should not interfere with the bash shell used in the app.
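
One way to check for this kind of interference is to list the variables that can redirect the cluster tools to another host. The sketch below is only an illustration: `list_cluster_env` is a hypothetical helper name, and the prefix list is an assumption based on the variables that appear in this thread (FLEETCTL_*, ETCDCTL_*, ETCD_*, KUBERNETES_MASTER, DOCKER_HOST).

```shell
# Hypothetical helper (not part of the App): print environment variables
# that may point fleetctl, etcdctl, kubectl, or docker at another host.
list_cluster_env() {
  env | grep -E '^(FLEETCTL_|ETCDCTL_|ETCD_|KUBERNETES_MASTER=|DOCKER_HOST=)' | sort
}
```

Running `list_cluster_env` in both the login shell and the App's OS Shell, and comparing the two outputs, should show whether a value such as FLEETCTL_TUNNEL is being inherited.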

Last login: Tue Nov 10 11:13:35 on ttys003
╭─jonathanchauncey at ENG000637 in ~ using ‹2.2.2›
╰─○ /Applications/CoreOS\ k8s\ Cluster.app/Contents/Resources/vagrant_up.command; exit;
==> k8smaster-01: Checking for updates to 'coreos-stable'
    k8smaster-01: Latest installed version: 647.2.0
    k8smaster-01: Version constraints:
    k8smaster-01: Provider: virtualbox
==> k8smaster-01: Box 'coreos-stable' (v647.2.0) is running the latest version.
Bringing machine 'k8smaster-01' up with 'virtualbox' provider...
==> k8smaster-01: Importing base box 'coreos-stable'...
==> k8smaster-01: Matching MAC address for NAT networking...
==> k8smaster-01: Checking if box 'coreos-stable' is up to date...
==> k8smaster-01: Setting the name of the VM: control_k8smaster-01_1447179043537_41262
==> k8smaster-01: Clearing any previously set network interfaces...
==> k8smaster-01: Preparing network interfaces based on configuration...
    k8smaster-01: Adapter 1: nat
    k8smaster-01: Adapter 2: hostonly
==> k8smaster-01: Forwarding ports...
    k8smaster-01: 22 => 2222 (adapter 1)
==> k8smaster-01: Running 'pre-boot' VM customizations...
==> k8smaster-01: Booting VM...
==> k8smaster-01: Waiting for machine to boot. This may take a few minutes...
    k8smaster-01: SSH address: 127.0.0.1:2222
    k8smaster-01: SSH username: core
    k8smaster-01: SSH auth method: private key
    k8smaster-01: Warning: Connection timeout. Retrying...
==> k8smaster-01: Machine booted and ready!
==> k8smaster-01: Setting hostname...
==> k8smaster-01: Configuring and enabling network interfaces...
==> k8smaster-01: Running provisioner: file...
==> k8smaster-01: Running provisioner: shell...
    k8smaster-01: Running: inline script
==> k8snode-01: Checking for updates to 'coreos-stable'
    k8snode-01: Latest installed version: 647.2.0
    k8snode-01: Version constraints:
    k8snode-01: Provider: virtualbox
==> k8snode-01: Box 'coreos-stable' (v647.2.0) is running the latest version.
==> k8snode-02: Checking for updates to 'coreos-stable'
    k8snode-02: Latest installed version: 647.2.0
    k8snode-02: Version constraints:
    k8snode-02: Provider: virtualbox
==> k8snode-02: Box 'coreos-stable' (v647.2.0) is running the latest version.
Bringing machine 'k8snode-01' up with 'virtualbox' provider...
Bringing machine 'k8snode-02' up with 'virtualbox' provider...
==> k8snode-01: Importing base box 'coreos-stable'...
==> k8snode-01: Matching MAC address for NAT networking...
==> k8snode-01: Checking if box 'coreos-stable' is up to date...
==> k8snode-01: Setting the name of the VM: workers_k8snode-01_1447179068143_43632
==> k8snode-01: Fixed port collision for 22 => 2222. Now on port 2200.
==> k8snode-01: Clearing any previously set network interfaces...
==> k8snode-01: Preparing network interfaces based on configuration...
    k8snode-01: Adapter 1: nat
    k8snode-01: Adapter 2: hostonly
==> k8snode-01: Forwarding ports...
    k8snode-01: 22 => 2200 (adapter 1)
==> k8snode-01: Running 'pre-boot' VM customizations...
==> k8snode-01: Booting VM...
==> k8snode-01: Waiting for machine to boot. This may take a few minutes...
    k8snode-01: SSH address: 127.0.0.1:2200
    k8snode-01: SSH username: core
    k8snode-01: SSH auth method: private key
==> k8snode-01: Machine booted and ready!
==> k8snode-01: Setting hostname...
==> k8snode-01: Configuring and enabling network interfaces...
==> k8snode-01: Running provisioner: file...
==> k8snode-01: Running provisioner: shell...
    k8snode-01: Running: inline script
==> k8snode-02: Importing base box 'coreos-stable'...
==> k8snode-02: Matching MAC address for NAT networking...
==> k8snode-02: Checking if box 'coreos-stable' is up to date...
==> k8snode-02: Setting the name of the VM: workers_k8snode-02_1447179080010_39744
==> k8snode-02: Fixed port collision for 22 => 2222. Now on port 2201.
==> k8snode-02: Clearing any previously set network interfaces...
==> k8snode-02: Preparing network interfaces based on configuration...
    k8snode-02: Adapter 1: nat
    k8snode-02: Adapter 2: hostonly
==> k8snode-02: Forwarding ports...
    k8snode-02: 22 => 2201 (adapter 1)
==> k8snode-02: Running 'pre-boot' VM customizations...
==> k8snode-02: Booting VM...
==> k8snode-02: Waiting for machine to boot. This may take a few minutes...
    k8snode-02: SSH address: 127.0.0.1:2201
    k8snode-02: SSH username: core
    k8snode-02: SSH auth method: private key
    k8snode-02: Warning: Connection timeout. Retrying...
==> k8snode-02: Machine booted and ready!
==> k8snode-02: Setting hostname...
==> k8snode-02: Configuring and enabling network interfaces...
==> k8snode-02: Running provisioner: file...
==> k8snode-02: Running provisioner: shell...
    k8snode-02: Running: inline script

Installing k8s files to master and nodes:
Warning: Permanently added '[127.0.0.1]:2222' (RSA) to the list of known hosts.
master.tgz                                                       100%   23MB  23.0MB/s   00:00
Warning: Permanently added '[127.0.0.1]:2200' (RSA) to the list of known hosts.
nodes.tgz                                                        100%   16MB  16.1MB/s   00:00
Warning: Permanently added '[127.0.0.1]:2201' (RSA) to the list of known hosts.
nodes.tgz                                                        100%   16MB  16.1MB/s   00:00
Done installing ...

Installing fleet units:
2015/11/10 13:11:52 INFO client.go:291: Failed getting response from http://172.17.15.101:2379/: dial tcp 172.17.15.101:2379: connection refused
2015/11/10 13:11:52 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 100ms

Nope, that didn't fix it.

@jchauncey Can you post your env output here?

SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.H05LKRZxX7/Listeners
Apple_PubSub_Socket_Render=/private/tmp/com.apple.launchd.zl847TvvLw/Render
COLORFGBG=7;0
ITERM_PROFILE=Default
XPC_FLAGS=0x0
LANG=en_US.UTF-8
PWD=/Users/jonathanchauncey
SHELL=/bin/zsh
SECURITYSESSIONID=186a5
TERM_PROGRAM=iTerm.app
PATH=/Users/jonathanchauncey/.rbenv/shims:/Users/jonathanchauncey/.rbenv/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin::/Users/jonathanchauncey/go/bin:/usr/local/go/bin:/Users/jonathanchauncey/rigger/rigger
COMMAND_MODE=unix2003
TERM=xterm-256color
HOME=/Users/jonathanchauncey
TMPDIR=/var/folders/8b/12c6mgpx4n18mgzwv3216b380000gn/T/
USER=jonathanchauncey
XPC_SERVICE_NAME=0
LOGNAME=jonathanchauncey
__CF_USER_TEXT_ENCODING=0x1F5:0x0:0x0
ITERM_SESSION_ID=w0t1p0
SHLVL=1
OLDPWD=/Users/jonathanchauncey
ZSH=/Users/jonathanchauncey/.oh-my-zsh
PAGER=less
LESS=-R
LC_CTYPE=en_US.UTF-8
LSCOLORS=Gxfxcxdxbxegedabagacad
GOPATH=/Users/jonathanchauncey/go
ETCD_HOST=52.88.158.229
ETCD_PORT=4001
DOCKER_HOST=tcp://52.88.158.229:2375
GO15VENDOREXPERIMENT=1
_=/usr/bin/env

@jchauncey Interesting. The App's preset bash shell should be able to override your settings; maybe something is not working between bash and zsh there.

@jchauncey Could you run the App's install and, when you start getting those errors, open OS Shell via the App's menu and post the env output here again? Thanks.

╰─○ /Applications/CoreOS\ k8s\ Cluster.app/Contents/Resources/os_shell.command; exit;
etcd cluster:
Error:  501: All the given peers are not reachable (Tried to connect to each peer twice and failed) [0]

fleetctl list-machines:
2015/11/12 10:28:06 INFO client.go:291: Failed getting response from http://172.17.15.101:2379/: dial tcp 172.17.15.101:2379: connection refused
2015/11/12 10:28:06 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 100ms
2015/11/12 10:28:06 INFO client.go:291: Failed getting response from http://172.17.15.101:2379/: dial tcp 172.17.15.101:2379: connection refused
2015/11/12 10:28:06 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 200ms
2015/11/12 10:28:06 INFO client.go:291: Failed getting response from http://172.17.15.101:2379/: dial tcp 172.17.15.101:2379: connection refused
2015/11/12 10:28:06 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 400ms
2015/11/12 10:28:06 INFO client.go:291: Failed getting response from http://172.17.15.101:2379/: dial tcp 172.17.15.101:2379: connection refused
2015/11/12 10:28:06 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 800ms
^C
fleetctl list-units:
2015/11/12 10:28:07 INFO client.go:291: Failed getting response from http://172.17.15.101:2379/: dial tcp 172.17.15.101:2379: connection refused
2015/11/12 10:28:07 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 100ms
2015/11/12 10:28:07 INFO client.go:291: Failed getting response from http://172.17.15.101:2379/: dial tcp 172.17.15.101:2379: connection refused
2015/11/12 10:28:07 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 200ms
2015/11/12 10:28:07 INFO client.go:291: Failed getting response from http://172.17.15.101:2379/: dial tcp 172.17.15.101:2379: connection refused
2015/11/12 10:28:07 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 400ms
^C
kubectl get nodes:
error: couldn't read version from server: Get http://172.17.15.101:8080/api: dial tcp 172.17.15.101:8080: connection refused


bash-3.2$
bash-3.2$ env
TERM_PROGRAM=iTerm.app
SHELL=/bin/zsh
TERM=xterm-256color
TMPDIR=/var/folders/8b/12c6mgpx4n18mgzwv3216b380000gn/T/
Apple_PubSub_Socket_Render=/private/tmp/com.apple.launchd.zl847TvvLw/Render
DOCKER_HOST=tcp://52.88.158.229:2375
FLEETCTL_ENDPOINT=http://172.17.15.101:2379
GO15VENDOREXPERIMENT=1
ZSH=/Users/jonathanchauncey/.oh-my-zsh
ETCDCTL_PEERS=http://172.17.15.101:2379
USER=jonathanchauncey
FLEETCTL_STRICT_HOST_KEY_CHECKING=false
COMMAND_MODE=unix2003
SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.H05LKRZxX7/Listeners
__CF_USER_TEXT_ENCODING=0x1F5:0x0:0x0
PAGER=less
LSCOLORS=Gxfxcxdxbxegedabagacad
PATH=/Users/jonathanchauncey/coreos-k8s-cluster/bin:/Users/jonathanchauncey/.rbenv/shims:/Users/jonathanchauncey/.rbenv/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin::/Users/jonathanchauncey/go/bin:/usr/local/go/bin:/Users/jonathanchauncey/rigger/rigger
PWD=/Users/jonathanchauncey
LANG=en_US.UTF-8
ITERM_PROFILE=Default
XPC_FLAGS=0x0
FLEETCTL_DRIVER=etcd
XPC_SERVICE_NAME=0
COLORFGBG=7;0
HOME=/Users/jonathanchauncey
SHLVL=3
KUBERNETES_MASTER=http://172.17.15.101:8080
ITERM_SESSION_ID=w0t5p0
LOGNAME=jonathanchauncey
LESS=-R
LC_CTYPE=en_US.UTF-8
GOPATH=/Users/jonathanchauncey/go
ETCD_HOST=52.88.158.229
ETCD_PORT=4001
SECURITYSESSIONID=186a5
_=/usr/bin/env
bash-3.2$

@jchauncey Your env looks fine. Have you tried running fleetctl list-machines there?

error: couldn't read version from server: Get http://172.17.15.101:8080/api: dial tcp 172.17.15.101:8080: connection refused

^C
╭─jonathanchauncey at ENG000637 in ~ using ‹2.2.2›
╰─○ flu
Error retrieving list of units from repository: Get http://domain-sock/fleet/v1/state?alt=json: dial unix /var/run/fleet.sock: no such file or directory
╭─jonathanchauncey at ENG000637 in ~ using ‹2.2.2›
╰─○ flm
Error retrieving list of active machines: Get http://domain-sock/fleet/v1/machines?alt=json: dial unix /var/run/fleet.sock: no such file or directory

flm and flu are just functions for fleetctl list-machines and fleetctl list-units.

Very weird. I tried to replicate your env, except for zsh, and it worked fine for me. So when you open OS Shell, can you run:

export ETCD_HOST=172.17.15.101
export ETCD_PORT=2379

as the rest of the fleet/k8s master API settings look fine, and then run fleetctl list-machines again.
Thanks.
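
Since every failure above is a "connection refused" on a specific host and port, it can help to test the TCP port directly before involving fleetctl or kubectl. This is a minimal sketch, not something the App provides: `endpoint_hostport` and `probe_endpoint` are hypothetical helper names, the URL is assumed to include an explicit port, and `nc` is assumed to be installed on the host.

```shell
# Hypothetical helper: split an endpoint URL into "host port".
# Assumes the URL carries an explicit port, e.g. http://172.17.15.101:2379
endpoint_hostport() {
  ep_url=${1#*://}        # drop the scheme
  ep_url=${ep_url%%/*}    # drop any path
  printf '%s %s\n' "${ep_url%%:*}" "${ep_url##*:}"
}

# Hypothetical helper: succeed only if the TCP port accepts a connection.
# Uses nc in scan mode (-z) with a 2 second timeout (-w 2).
probe_endpoint() {
  set -- $(endpoint_hostport "$1")
  nc -z -w 2 "$1" "$2"
}
```

For example, `probe_endpoint "$FLEETCTL_ENDPOINT" && echo reachable` would report whether anything is listening on the etcd port at all. If it fails, the problem is likely inside the VM rather than in the client environment.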

Last login: Fri Nov 13 15:13:47 2015 from 10.0.2.2
CoreOS stable (647.2.0)
Failed Units: 1
  user-cloudinit@var-lib-coreos\x2dvagrant-vagrantfile\x2duser\x2ddata.service
core@k8smaster-01 ~ $ fleetctl list-machines
2015/11/13 15:30:06 INFO client.go:291: Failed getting response from http://127.0.0.1:4001/: dial tcp 127.0.0.1:4001: connection refused
2015/11/13 15:30:06 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 100ms
2015/11/13 15:30:06 INFO client.go:291: Failed getting response from http://127.0.0.1:4001/: dial tcp 127.0.0.1:4001: connection refused
2015/11/13 15:30:06 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 200ms
2015/11/13 15:30:06 INFO client.go:291: Failed getting response from http://127.0.0.1:4001/: dial tcp 127.0.0.1:4001: connection refused
2015/11/13 15:30:06 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 400ms
2015/11/13 15:30:07 INFO client.go:291: Failed getting response from http://127.0.0.1:4001/: dial tcp 127.0.0.1:4001: connection refused
2015/11/13 15:30:07 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 800ms
^Ccore@k8smaster-01 ~ $ fleetctl list-units
2015/11/13 15:30:12 INFO client.go:291: Failed getting response from http://127.0.0.1:4001/: dial tcp 127.0.0.1:4001: connection refused
2015/11/13 15:30:12 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 100ms
2015/11/13 15:30:13 INFO client.go:291: Failed getting response from http://127.0.0.1:4001/: dial tcp 127.0.0.1:4001: connection refused
2015/11/13 15:30:13 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 200ms
2015/11/13 15:30:13 INFO client.go:291: Failed getting response from http://127.0.0.1:4001/: dial tcp 127.0.0.1:4001: connection refused
2015/11/13 15:30:13 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 400ms
2015/11/13 15:30:13 INFO client.go:291: Failed getting response from http://127.0.0.1:4001/: dial tcp 127.0.0.1:4001: connection refused
2015/11/13 15:30:13 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 800ms
^Ccore@k8smaster-01 ~ $ env
MANPATH=/usr/local/share/man:/usr/share/man
TERM=xterm-256color
SHELL=/bin/bash
SSH_CLIENT=10.0.2.2 53327 22
SSH_TTY=/dev/pts/0
USER=core
SSH_AUTH_SOCK=/tmp/ssh-c3jyZdIdIx/agent.952
PAGER=/usr/bin/less
MAIL=/var/mail/core
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin
PWD=/home/core
EDITOR=/usr/bin/vim
LESSCHARSET=utf-8
SHLVL=1
HOME=/home/core
LESS=-R -M --shift 5
LOGNAME=core
SSH_CONNECTION=10.0.2.2 53327 10.0.2.15 22
LESSOPEN=|lesspipe %s
INFOPATH=/usr/share/info
CONFIG_PROTECT=/usr/share/gnupg/qualified.txt
_=/usr/bin/env
core@k8smaster-01 ~ $

Running the update command to see if it fixes this. I'm not getting much time to debug the issue, but I would like to get this working.

Oh man, you are using a very old CoreOS release there. That could be the issue; try the most recent one.

OK, so I switched to the beta channel instead of stable, and that seems to have fixed it.

OK, cool. Good to know that the issue was a too-old CoreOS release.
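
For anyone hitting the same symptoms, a quick way to confirm which release a VM is actually running is to read the VERSION field from /etc/os-release inside the VM. The snippet below is a small sketch: `os_release_version` is a hypothetical helper name, and it assumes the file contains a plain `VERSION=` line.

```shell
# Hypothetical helper: print the VERSION field of an os-release file.
# Defaults to /etc/os-release; surrounding double quotes are stripped.
os_release_version() {
  sed -n 's/^VERSION=//p' "${1:-/etc/os-release}" | tr -d '"'
}
```

Run inside the master VM, `os_release_version` should print the same release number that the login banner shows (647.2.0 in the session above).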