cannot upload docker container to registry
DavidSie opened this issue · 61 comments
When I build an app with a buildpack it works, but when I build a container I cannot upload it to the registry.
kubectl version
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.4", GitCommit:"dd6b458ef8dbf24aff55795baa68f83383c9b3a9", GitTreeState:"clean", BuildDate:"2016-08-01T16:45:16Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.4+coreos.0", GitCommit:"be9bf3e842a90537e48361aded2872e389e902e7", GitTreeState:"clean", BuildDate:"2016-08-02T00:54:53Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
deis version
v2.4.0
git push deis master
Counting objects: 589, done.
Compressing objects: 100% (416/416), done.
Writing objects: 100% (589/589), 2.47 MiB, done.
Total 589 (delta 46), reused 581 (delta 42)
Starting build... but first, coffee!
Step 1 : FROM ruby:2.0.0-p576
---> a137b6df82e8
Step 2 : COPY . /app
---> Using cache
---> a7107ea0f79a
Step 3 : WORKDIR /app
---> Using cache
---> ba2d0c3222ec
Step 4 : EXPOSE 3000
---> Using cache
---> 18f7fb188ed3
Step 5 : CMD while true; do echo hello world; sleep 1; done
---> Using cache
---> 4e22b0487484
Successfully built 4e22b0487484
Pushing to registry
{"errorDetail":{"message":"Put http://localhost:5555/v1/repositories/spree/: dial tcp 127.0.0.1:5555: getsockopt: connection refused"},"error":"Put http://localhost:5555/v1/repositories/spree/: dial tcp 127.0.0.remote: getsockopt: connection refused"}
I know that there are environment variables that point to this address:
Environment Variables:
DEIS_REGISTRY_SERVICE_HOST: localhost
DEIS_REGISTRY_SERVICE_PORT: 5555
but I don't understand why, since none of the pods or services is listening on 5555.
services
kubectl get services --namespace=deis
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
deis-builder 10.3.0.233 <none> 2222/TCP 1d
deis-controller 10.3.0.23 <none> 80/TCP 1d
deis-database 10.3.0.253 <none> 5432/TCP 1d
deis-logger 10.3.0.221 <none> 80/TCP 1d
deis-logger-redis 10.3.0.148 <none> 6379/TCP 1d
deis-minio 10.3.0.232 <none> 9000/TCP 1d
deis-monitor-grafana 10.3.0.113 <none> 80/TCP 1d
deis-monitor-influxapi 10.3.0.234 <none> 80/TCP 1d
deis-monitor-influxui 10.3.0.141 <none> 80/TCP 1d
deis-nsqd 10.3.0.82 <none> 4151/TCP,4150/TCP 1d
deis-registry 10.3.0.188 <none> 80/TCP 1d
deis-router 10.3.0.133 <pending> 80/TCP,443/TCP,2222/TCP,9090/TCP 1d
deis-workflow-manager 10.3.0.34 <none> 80/TCP 1d
pods
kubectl describe pods deis-registry-3758253254-3gtjo --namespace=deis
Name: deis-registry-3758253254-3gtjo
Namespace: deis
Node: 10.63.11.75/10.63.11.75
Start Time: Mon, 22 Aug 2016 10:36:12 +0000
Labels: app=deis-registry
pod-template-hash=3758253254
Status: Running
IP: 10.2.12.12
Controllers: ReplicaSet/deis-registry-3758253254
Containers:
deis-registry:
Container ID: docker://78d6d569eefac3766e4b921f21b7847d36866a266ae76424d7d6e572bb2f5979
Image: quay.io/deis/registry:v2.2.0
Image ID: docker://sha256:0eb83b180d1aa993fcdd715e4b919b4867051d4f35a813a56eec04ae0705d3d1
Port: 5000/TCP
State: Running
Started: Mon, 22 Aug 2016 10:43:05 +0000
Ready: True
Restart Count: 0
Liveness: http-get http://:5000/v2/ delay=1s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:5000/v2/ delay=1s timeout=1s period=10s #success=1 #failure=3
Environment Variables:
REGISTRY_STORAGE_DELETE_ENABLED: true
REGISTRY_LOG_LEVEL: info
REGISTRY_STORAGE: minio
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
registry-storage:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
registry-creds:
Type: Secret (a volume populated by a Secret)
SecretName: objectstorage-keyfile
deis-registry-token-inyyj:
Type: Secret (a volume populated by a Secret)
SecretName: deis-registry-token-inyyj
QoS Tier: BestEffort
No events.
kubectl describe pods deis-registry-proxy-cpu68 --namespace=deis
Name: deis-registry-proxy-cpu68
Namespace: deis
Node: 10.63.11.76/10.63.11.76
Start Time: Mon, 22 Aug 2016 10:36:31 +0000
Labels: app=deis-registry-proxy
heritage=deis
Status: Running
IP: 10.2.63.4
Controllers: DaemonSet/deis-registry-proxy
Containers:
deis-registry-proxy:
Container ID: docker://dc29ab400a06ae5dc1407c7f1fb0880d4257720170eded6a7f8cde5431fa9570
Image: quay.io/deis/registry-proxy:v1.0.0
Image ID: docker://sha256:fde297ec95aa244e5be48f438de39a13dae16a1593b3792d8c10cd1d7011f8d1
Port: 80/TCP
Limits:
cpu: 100m
memory: 50Mi
Requests:
cpu: 100m
memory: 50Mi
State: Running
Started: Mon, 22 Aug 2016 10:38:32 +0000
Ready: True
Restart Count: 0
Environment Variables:
REGISTRY_HOST: $(DEIS_REGISTRY_SERVICE_HOST)
REGISTRY_PORT: $(DEIS_REGISTRY_SERVICE_PORT)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-tk993:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-tk993
QoS Tier: Guaranteed
No events.
From the pod list it looks like the registry-proxy component is missing, which is what proxies requests to the registry. Can you confirm with kubectl --namespace=deis get daemonsets?
There are registry proxies. I attached one above, but there are 3 of the same (I'm using 1 master + 2 minions).
kubectl --namespace=deis get daemonsets
NAME DESIRED CURRENT NODE-SELECTOR AGE
deis-logger-fluentd 3 3 <none> 1d
deis-monitor-telegraf 3 3 <none> 1d
deis-registry-proxy 3 3 <none> 1d
Okay, so if you do indeed have registry proxies then you're probably hitting the same issue as #62, since your app relies on the ruby image, which is relatively large. I would take a look into that issue and see if you find similar behaviour.
According to Docker Hub (https://hub.docker.com/r/library/ruby/tags/) it's only 313 MB; I would say that's average.
Are you sure that this address makes sense: localhost:5555? The deis-registry service is 10.3.0.188 <none> 80/TCP, and the deis-registry-3758253254-3gtjo pod is listening on port 5000.
Yes, that address is correct. The request goes through the registry-proxy, which (as the name suggests) proxies the request to the real registry. It's a workaround for the --insecure-registry flag. See https://github.com/deis/registry-proxy#about
Coming back to the original problem, I'd inspect both your registry and minio to ensure that there are no problems with either backend. From reports, images built via Dockerfile that are slightly larger than normal (>100MB) seem to be causing these issues.
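A minimal set of checks, assuming pod names similar to the ones shown above (substitute your own):
kubectl --namespace=deis get pods | grep -E 'registry|minio'    # find the actual pod names
kubectl --namespace=deis logs deis-registry-3758253254-3gtjo    # registry-side errors during the push
kubectl --namespace=deis logs deis-minio-<pod-suffix>           # minio-side errors while storing layers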
This is not a container size issue (alpine is 2MB: https://hub.docker.com/r/library/alpine/tags/):
git push deis master
Counting objects: 48, done.
Compressing objects: 100% (47/47), done.
Writing objects: 100% (48/48), 6.35 KiB, done.
Total 48 (delta 14), reused 0 (delta 0)
Starting build... but first, coffee!
...
Step 1 : FROM alpine
---> 4e38e38c8ce0
Step 2 : ENV GOPATH /go
---> Using cache
---> bd4d962b7a6e
Step 3 : ENV GOROOT /usr/local/go
---> Using cache
---> 346b304d9d9d
Step 4 : ENV PATH $PATH:/usr/local/go/bin:/go/bin
---> Using cache
---> bfd14db2b7e7
Step 5 : EXPOSE 80
---> Using cache
---> a019f2dadbcc
Step 6 : ENTRYPOINT while true; do echo hello world; sleep 1; done
---> Using cache
---> d500b7d348cb
Successfully built d500b7d348cb
Pushing to registry
{"errorDetail":{"message":"Put http://localhost:5555/v1/repositories/gaslit-gladness/: dial tcp 127.0.0.1:5555: getsockopt: connection refused"},"error":"Put http://localhost:5555/v1/repositories/gaslit-gladnessremote: tcp 127.0.0.1:5555: getsockopt: connection refused"}
remote: 2016/08/25 07:18:46 Error running git receive hook [Build pod exited with code 1, stopping build.]
To ssh://git@deis-builder.10.63.11.83.nip.io:2222/gaslit-gladness.git
! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'ssh://git@deis-builder.10.63.11.83.nip.io:2222/gaslit-gladness.git'
Which container should listen on port 5555?
(This is a different cluster but from the same script)
Which container should listen on port 5555?
The registry-proxy listens on port 5555.
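A rough way to confirm the hostPort mapping is to inspect the daemonset; on this setup the proxy container listens on 80 and should be bound to hostPort 5555 (the grep is just for readability):
kubectl --namespace=deis get daemonset deis-registry-proxy -o yaml | grep -i -B2 -A2 hostPort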
Can you please provide the following information so we can try to reproduce this?
- kubectl version
- how you provisioned your kubernetes cluster
I recall that there are internal networking issues when using CoreOS with calico: deis/workflow#442
From inside the container:
root@deis-registry-proxy-jzf3h:/# telnet localhost 5555
Trying ::1...
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused
root@deis-registry-proxy-jzf3h:/# netstat -lntpu
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 1/nginx: master pro
kubectl version:
kubectl version
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.4", GitCommit:"dd6b458ef8dbf24aff55795baa68f83383c9b3a9", GitTreeState:"clean", BuildDate:"2016-08-01T16:45:16Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.4+coreos.0", GitCommit:"be9bf3e842a90537e48361aded2872e389e902e7", GitTreeState:"clean", BuildDate:"2016-08-02T00:54:53Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
To provision the kubernetes cluster I used this tutorial: https://coreos.com/kubernetes/docs/latest/getting-started.html
The fact that you cannot connect to localhost:5555 from within the container is to be expected. We actually mount the host's docker socket, so any command we perform assumes the host's network. Therefore, localhost:5555 on the host belongs to the registry-proxy.
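So the more useful check is on the node itself rather than inside the proxy container, roughly (assuming SSH access to a node and that curl is available there):
# run on a kubernetes node, not inside a pod
docker ps | grep registry-proxy        # is the proxy container running on this node?
curl -v http://127.0.0.1:5555/v2/      # a v2 registry behind the proxy should answer here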
When you provisioned kubernetes, where did you deploy your cluster? AWS, GKE, Vagrant?
I did it on OpenStack.
We had the exact same problem, with:
- coreos-kubernetes (from github repo #1876aac with kubernetes 1.3.4)
- deis 2.4.0
- vagrant 1.8.5
To create our Kubernetes cluster we followed the tutorial here: https://coreos.com/kubernetes/docs/latest/kubernetes-on-vagrant.html
After quite a bit of struggling (turning off calico, changing hostPort from 5555 to 80, etc. - nothing changed) we resolved it by using the plain version of Kubernetes, from the main Deis tutorial here: https://deis.com/docs/workflow/quickstart/provider/vagrant/boot/
with the notable change of downgrading Vagrant to 1.8.3, since 1.8.5 has this bug: hashicorp/vagrant#5186 (it's marked as closed but there's a regression in 1.8.5).
So, for us, the problem was in the CoreOS package. We haven't tried the very last commit though.
EDIT: we also tried the last commit from the CoreOS repository (commit #bdfe006) with Deis 2.4.1; nothing changed.
@think01 So you think that the kubelet-wrapper provided with CoreOS may be the cause of this problem, right?
@DavidSie well, I cannot say the problem is in that component, but we solved it by avoiding the coreos-kubernetes package and going with plain Kubernetes on Vagrant (which creates some Fedora boxes). Why do you mention kubelet-wrapper?
Because I saw that CoreOS ships with the script /usr/lib/coreos/kubelet-wrapper, but from what I can see it only starts hyperkube on rkt.
ping @DavidSie, were you able to identify the root cause of your issue here?
I am experiencing what I think is a similar issue. My image is 385.9M (so it's >100M, as mentioned by @bacongobbler). Regarding "inspecting" the backend - I cannot figure out how to get helpful logging out of the minio pod. I've tried the --debug switch in various permutations, then found minio/minio#820, which seems to indicate that it's no longer valid because it's not needed. I've tried setting MINIO_TRACE=1 per some code fragments I found. However, kubectl --namespace=deis logs deis-minio-123xyz only ever shows what I assume is the minio startup output - there's no debug log, no trace log, nothing to indicate the behavior of minio during operation.
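The only other check I could come up with was to port-forward to the minio pod and see whether it responds at all - a rough sketch, reusing the placeholder pod name from above (an unauthenticated request should come back as an AccessDenied error, which at least proves minio is alive):
kubectl --namespace=deis port-forward deis-minio-123xyz 9000:9000 &
curl -i http://localhost:9000/    # expect an HTTP 403 AccessDenied XML response from minio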
The first time:
deis pull
2016-09-21 08:28:43
rbellamy@eanna i ~/Development/Terradatum/aergo/aergo-server feature/docker % deis pull 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT -a aergo-server
Creating build... Error: Unknown Error (400): {"detail":"dial tcp 10.11.28.91:9000: i/o timeout"}
zsh: exit 1 deis pull 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT -a aergo-server
controller logs
INFO [aergo-server]: build aergo-server-11b3c2a created
INFO [aergo-server]: rbellamy deployed 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT
INFO Pulling Docker image 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT
INFO Tagging Docker image 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT as localhost:5555/aergo-server:v2
INFO Pushing Docker image localhost:5555/aergo-server:v2
INFO Pushing Docker image localhost:5555/aergo-server:v2
INFO Pushing Docker image localhost:5555/aergo-server:v2
INFO [aergo-server]: dial tcp 10.11.28.91:9000: i/o timeout
ERROR:root:dial tcp 10.11.28.91:9000: i/o timeout
Traceback (most recent call last):
File "/app/api/models/release.py", line 88, in new
release.publish()
File "/app/api/models/release.py", line 135, in publish
publish_release(source_image, self.image, deis_registry, self.get_registry_auth())
File "/app/registry/dockerclient.py", line 199, in publish_release
return DockerClient().publish_release(source, target, deis_registry, creds)
File "/app/registry/dockerclient.py", line 117, in publish_release
self.push("{}/{}".format(self.registry, name), tag)
File "/usr/local/lib/python3.5/dist-packages/backoff.py", line 286, in retry
ret = target(*args, **kwargs)
File "/app/registry/dockerclient.py", line 135, in push
log_output(stream, 'push', repo, tag)
File "/app/registry/dockerclient.py", line 178, in log_output
stream_error(chunk, operation, repo, tag)
File "/app/registry/dockerclient.py", line 195, in stream_error
raise RegistryException(message)
registry.dockerclient.RegistryException: dial tcp 10.11.28.91:9000: i/o timeout
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/app/api/models/build.py", line 62, in create
source_version=self.version
File "/app/api/models/release.py", line 95, in new
raise DeisException(str(e)) from e
api.exceptions.DeisException: dial tcp 10.11.28.91:9000: i/o timeout
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/rest_framework/views.py", line 471, in dispatch
response = handler(request, *args, **kwargs)
File "/app/api/views.py", line 181, in create
return super(AppResourceViewSet, self).create(request, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/rest_framework/mixins.py", line 21, in create
self.perform_create(serializer)
File "/app/api/viewsets.py", line 21, in perform_create
self.post_save(obj)
File "/app/api/views.py", line 258, in post_save
self.release = build.create(self.request.user)
File "/app/api/models/build.py", line 71, in create
raise DeisException(str(e)) from e
api.exceptions.DeisException: dial tcp 10.11.28.91:9000: i/o timeout
10.10.2.8 "POST /v2/apps/aergo-server/builds/ HTTP/1.1" 400 51 "Deis Client v2.5.1"
Then immediately, I try again:
deis pull
2016-09-21 08:42:27
rbellamy@eanna i ~/Development/Terradatum/aergo/aergo-server feature/docker % deis pull 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT -a aergo-server
Creating build... Error: Unknown Error (502): <html>
<head><title>502 Bad Gateway</title></head>
<body bgcolor="white">
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx/1.11.2</center>
</body>
</html>
zsh: exit 1 deis pull 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT -a aergo-server
controller logs
INFO [aergo-server]: build aergo-server-c09bb9b created
INFO [aergo-server]: rbellamy deployed 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT
INFO Pulling Docker image 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT
INFO Tagging Docker image 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT as localhost:5555/aergo-server:v4
INFO Pushing Docker image localhost:5555/aergo-server:v4
INFO Pushing Docker image localhost:5555/aergo-server:v4
10.10.2.8 "GET /v2/apps/aergo-server/logs HTTP/1.1" 200 1284 "Deis Client v2.5.1"
INFO Pushing Docker image localhost:5555/aergo-server:v4
[2016-09-21 16:05:50 +0000] [24] [CRITICAL] WORKER TIMEOUT (pid:37)
[2016-09-21 16:05:50 +0000] [37] [WARNING] worker aborted
File "/usr/local/bin/gunicorn", line 11, in <module>
sys.exit(run())
File "/usr/local/lib/python3.5/dist-packages/gunicorn/app/wsgiapp.py", line 74, in run
WSGIApplication("%(prog)s [OPTIONS] [APP_MODULE]").run()
File "/usr/local/lib/python3.5/dist-packages/gunicorn/app/base.py", line 192, in run
super(Application, self).run()
File "/usr/local/lib/python3.5/dist-packages/gunicorn/app/base.py", line 72, in run
Arbiter(self).run()
File "/usr/local/lib/python3.5/dist-packages/gunicorn/arbiter.py", line 189, in run
self.manage_workers()
File "/usr/local/lib/python3.5/dist-packages/gunicorn/arbiter.py", line 524, in manage_workers
self.spawn_workers()
File "/usr/local/lib/python3.5/dist-packages/gunicorn/arbiter.py", line 590, in spawn_workers
self.spawn_worker()
File "/usr/local/lib/python3.5/dist-packages/gunicorn/arbiter.py", line 557, in spawn_worker
worker.init_process()
File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/base.py", line 132, in init_process
self.run()
File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/sync.py", line 124, in run
self.run_for_one(timeout)
File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/sync.py", line 68, in run_for_one
self.accept(listener)
File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/sync.py", line 30, in accept
self.handle(listener, client, addr)
File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/sync.py", line 135, in handle
self.handle_request(listener, req, client, addr)
File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/sync.py", line 176, in handle_request
respiter = self.wsgi(environ, resp.start_response)
File "/usr/local/lib/python3.5/dist-packages/django/core/handlers/wsgi.py", line 170, in __call__
response = self.get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/core/handlers/base.py", line 124, in get_response
response = self._middleware_chain(request)
File "/usr/local/lib/python3.5/dist-packages/django/core/handlers/exception.py", line 39, in inner
response = get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/utils/deprecation.py", line 133, in __call__
response = self.get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/core/handlers/exception.py", line 39, in inner
response = get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/utils/deprecation.py", line 133, in __call__
response = self.get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/core/handlers/exception.py", line 39, in inner
response = get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/utils/deprecation.py", line 133, in __call__
response = self.get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/core/handlers/exception.py", line 39, in inner
response = get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/utils/deprecation.py", line 133, in __call__
response = self.get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/core/handlers/exception.py", line 39, in inner
response = get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/utils/deprecation.py", line 133, in __call__
response = self.get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/core/handlers/exception.py", line 39, in inner
response = get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/utils/deprecation.py", line 133, in __call__
response = self.get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/core/handlers/exception.py", line 39, in inner
response = get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/utils/deprecation.py", line 133, in __call__
response = self.get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/core/handlers/exception.py", line 39, in inner
response = get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/utils/deprecation.py", line 133, in __call__
response = self.get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/core/handlers/exception.py", line 39, in inner
response = get_response(request)
File "/app/api/middleware.py", line 22, in __call__
response = self.get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/core/handlers/exception.py", line 39, in inner
response = get_response(request)
File "/usr/local/lib/python3.5/dist-packages/django/core/handlers/base.py", line 185, in _get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/usr/local/lib/python3.5/dist-packages/django/views/decorators/csrf.py", line 58, in wrapped_view
return view_func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/rest_framework/viewsets.py", line 87, in view
return self.dispatch(request, *args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/rest_framework/views.py", line 471, in dispatch
response = handler(request, *args, **kwargs)
File "/app/api/views.py", line 181, in create
return super(AppResourceViewSet, self).create(request, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/rest_framework/mixins.py", line 21, in create
self.perform_create(serializer)
File "/app/api/viewsets.py", line 21, in perform_create
self.post_save(obj)
File "/app/api/views.py", line 258, in post_save
self.release = build.create(self.request.user)
File "/app/api/models/build.py", line 62, in create
source_version=self.version
File "/app/api/models/release.py", line 88, in new
release.publish()
File "/app/api/models/release.py", line 135, in publish
publish_release(source_image, self.image, deis_registry, self.get_registry_auth())
File "/app/registry/dockerclient.py", line 199, in publish_release
return DockerClient().publish_release(source, target, deis_registry, creds)
File "/app/registry/dockerclient.py", line 117, in publish_release
self.push("{}/{}".format(self.registry, name), tag)
File "/usr/local/lib/python3.5/dist-packages/backoff.py", line 286, in retry
ret = target(*args, **kwargs)
File "/app/registry/dockerclient.py", line 135, in push
log_output(stream, 'push', repo, tag)
File "/app/registry/dockerclient.py", line 175, in log_output
for chunk in stream:
File "/usr/local/lib/python3.5/dist-packages/docker/client.py", line 245, in _stream_helper
data = reader.read(1)
File "/usr/local/lib/python3.5/dist-packages/requests/packages/urllib3/response.py", line 314, in read
data = self._fp.read(amt)
File "/usr/lib/python3.5/http/client.py", line 448, in read
n = self.readinto(b)
File "/usr/lib/python3.5/http/client.py", line 478, in readinto
return self._readinto_chunked(b)
File "/usr/lib/python3.5/http/client.py", line 573, in _readinto_chunked
chunk_left = self._get_chunk_left()
File "/usr/lib/python3.5/http/client.py", line 541, in _get_chunk_left
chunk_left = self._read_next_chunk_size()
File "/usr/lib/python3.5/http/client.py", line 501, in _read_next_chunk_size
line = self.fp.readline(_MAXLINE + 1)
File "/usr/lib/python3.5/socket.py", line 575, in readinto
return self._sock.recv_into(b)
File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/base.py", line 191, in handle_abort
self.cfg.worker_abort(self)
File "/app/deis/gunicorn/config.py", line 36, in worker_abort
traceback.print_stack()
@rbellamy can you post registry logs in a gist? That will likely give us more information why the registry is failing to communicate with minio.
@bacongobbler will do.
Also, this may be related to minio/minio#2743.
Here's my setup, using Alpha channel of CoreOS and libvirt:
export KUBERNETES_PROVIDER=libvirt-coreos && export NUM_NODES=4
./cluster/kube-up.sh
# wait for etcd to settle
helmc install workflow-v2.5.0
# wait for kubernetes cluster to all be ready
deis pull 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT -a aergo-server
Worked with @harshavardhana from the minio crew to try to troubleshoot this.
For whatever reason, during our teleconsole session, I was able to successfully push the image into the deis-registry-proxy - but then saw the same dial i/o timeout in a different context. This time it was while pulling the image from the proxy, during the app:deploy phase.
NOTE: you can ignore the 404 below - v4 of the aergo-server doesn't exist, since I've restarted the minio pod several times during troubleshooting. The v5 release is definitely stored in minio, as can be seen in the mc ls output at the bottom of this post.
INFO [aergo-server]: build aergo-server-49c7405 created
INFO [aergo-server]: rbellamy deployed 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT
INFO Pulling Docker image 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT
INFO Tagging Docker image 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT as localhost:5555/aergo-server:v5
INFO Pushing Docker image localhost:5555/aergo-server:v5
INFO Pulling Docker image localhost:5555/aergo-server:v5
INFO [aergo-server]: adding 5s on to the original 120s timeout to account for the initial delay specified in the liveness / readiness probe
INFO [aergo-server]: This deployments overall timeout is 125s - batch timout is 125s and there are 1 batches to deploy with a total of 1 pods
INFO [aergo-server]: waited 10s and 1 pods are in service
INFO [aergo-server]: waited 20s and 1 pods are in service
INFO [aergo-server]: waited 30s and 1 pods are in service
INFO [aergo-server]: waited 40s and 1 pods are in service
ERROR [aergo-server]: There was a problem deploying v5. Rolling back process types to release v4.
INFO Pulling Docker image localhost:5555/aergo-server:v4
INFO Pulling Docker image localhost:5555/aergo-server:v4
INFO Pulling Docker image localhost:5555/aergo-server:v4
INFO Pulling Docker image localhost:5555/aergo-server:v4
INFO Pulling Docker image localhost:5555/aergo-server:v4
INFO Pulling Docker image localhost:5555/aergo-server:v4
INFO Pulling Docker image localhost:5555/aergo-server:v4
INFO Pulling Docker image localhost:5555/aergo-server:v4
INFO Pulling Docker image localhost:5555/aergo-server:v4
ERROR [aergo-server]: (app::deploy): image aergo-server:v4 not found
ERROR:root:(app::deploy): image aergo-server:v4 not found
Traceback (most recent call last):
File "/app/scheduler/__init__.py", line 168, in deploy
deployment = self.deployment.get(namespace, name).json()
File "/app/scheduler/resources/deployment.py", line 29, in get
raise KubeHTTPException(response, message, *args)
scheduler.exceptions.KubeHTTPException: ('failed to get Deployment "aergo-server-cmd" in Namespace "aergo-server": 404 Not Found', 'aergo-server-cmd', 'aergo-server')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/api/models/app.py", line 578, in deploy
async_run(tasks)
File "/app/api/utils.py", line 169, in async_run
raise error
File "/usr/lib/python3.5/asyncio/tasks.py", line 241, in _step
result = coro.throw(exc)
File "/app/api/utils.py", line 182, in async_task
yield from loop.run_in_executor(None, params)
File "/usr/lib/python3.5/asyncio/futures.py", line 361, in __iter__
yield self # This tells Task to wait for completion.
File "/usr/lib/python3.5/asyncio/tasks.py", line 296, in _wakeup
future.result()
File "/usr/lib/python3.5/asyncio/futures.py", line 274, in result
raise self._exception
File "/usr/lib/python3.5/concurrent/futures/thread.py", line 55, in run
result = self.fn(*self.args, **self.kwargs)
File "/app/scheduler/__init__.py", line 175, in deploy
namespace, name, image, entrypoint, command, **kwargs
File "/app/scheduler/resources/deployment.py", line 123, in create
self.wait_until_ready(namespace, name, **kwargs)
File "/app/scheduler/resources/deployment.py", line 338, in wait_until_ready
additional_timeout = self.pod._handle_pending_pods(namespace, labels)
File "/app/scheduler/resources/pod.py", line 552, in _handle_pending_pods
self._handle_pod_errors(pod, reason, message)
File "/app/scheduler/resources/pod.py", line 491, in _handle_pod_errors
raise KubeException(message)
scheduler.exceptions.KubeException: error pulling image configuration: Get http://10.11.28.91:9000/registry/docker/registry/v2/blobs/sha256/59/5905a7c362fbff9626d517a6ba0d8930fba34a321ba4c7bb718144d80cfaf29b/data?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=8TZRY2JRWMPT6UMXR6I5%2F20160921%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20160921T194800Z&X-Amz-Expires=1200&X-Amz-SignedHeaders=host&X-Amz-Signature=314c92bb84dbd4dd41f9bc572e625201a32ce300394d34e8516a57382fd2ec52: dial tcp 10.11.28.91:9000: i/o timeout
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/api/models/release.py", line 168, in get_port
port = docker_get_port(self.image, deis_registry, creds)
File "/app/registry/dockerclient.py", line 203, in get_port
return DockerClient().get_port(target, deis_registry, creds)
File "/app/registry/dockerclient.py", line 79, in get_port
info = self.inspect_image(target)
File "/usr/local/lib/python3.5/dist-packages/backoff.py", line 286, in retry
ret = target(*args, **kwargs)
File "/app/registry/dockerclient.py", line 156, in inspect_image
self.pull(repo, tag=tag)
File "/usr/local/lib/python3.5/dist-packages/backoff.py", line 286, in retry
ret = target(*args, **kwargs)
File "/app/registry/dockerclient.py", line 128, in pull
log_output(stream, 'pull', repo, tag)
File "/app/registry/dockerclient.py", line 178, in log_output
stream_error(chunk, operation, repo, tag)
File "/app/registry/dockerclient.py", line 195, in stream_error
raise RegistryException(message)
registry.dockerclient.RegistryException: image aergo-server:v4 not found
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/app/api/models/app.py", line 585, in deploy
self.deploy(release.previous(), force_deploy=True, rollback_on_failure=False)
File "/app/api/models/app.py", line 526, in deploy
port = release.get_port()
File "/app/api/models/release.py", line 176, in get_port
raise DeisException(str(e)) from e
api.exceptions.DeisException: image aergo-server:v4 not found
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/app/api/models/build.py", line 64, in create
self.app.deploy(new_release)
File "/app/api/models/app.py", line 595, in deploy
raise ServiceUnavailable(err) from e
api.exceptions.ServiceUnavailable: (app::deploy): image aergo-server:v4 not found
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/rest_framework/views.py", line 471, in dispatch
response = handler(request, *args, **kwargs)
File "/app/api/views.py", line 181, in create
return super(AppResourceViewSet, self).create(request, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/rest_framework/mixins.py", line 21, in create
self.perform_create(serializer)
File "/app/api/viewsets.py", line 21, in perform_create
self.post_save(obj)
File "/app/api/views.py", line 258, in post_save
self.release = build.create(self.request.user)
File "/app/api/models/build.py", line 71, in create
raise DeisException(str(e)) from e
api.exceptions.DeisException: (app::deploy): image aergo-server:v4 not found
10.10.2.8 "POST /v2/apps/aergo-server/builds/ HTTP/1.1" 400 59 "Deis Client v2.5.1"
And as you can see, the minio store definitely contains the image, and the proxy can communicate with the minio backend:
root@deis-registry-proxy-ccf4u:~# mc ls myminio/registry -r
[2016-09-21 19:47:36 UTC] 1.5KiB docker/registry/v2/blobs/sha256/2f/2fc6d0a3ec447743456f6fe782622ede8095b662bb39cb10c50b2a795829e51f/data
[2016-09-21 19:46:45 UTC] 112B docker/registry/v2/blobs/sha256/53/5345ff73e9fcf7b6c7d2d7eca2b0338ab274560ff988b8f63e60f73dfe0297ec/data
[2016-09-21 19:47:36 UTC] 5.0KiB docker/registry/v2/blobs/sha256/59/5905a7c362fbff9626d517a6ba0d8930fba34a321ba4c7bb718144d80cfaf29b/data
[2016-09-21 19:46:45 UTC] 232B docker/registry/v2/blobs/sha256/a6/a696cba1f6e865421664a7bf9bf585bcfaa924d56b7d2a112a799e00a7433791/data
[2016-09-21 19:47:14 UTC] 94MiB docker/registry/v2/blobs/sha256/b4/b419440b08d223eabe64f26d5f8556ee8d3f4c0bcafb8dd64ec525cc4eea7f6e/data
[2016-09-21 19:47:19 UTC] 94MiB docker/registry/v2/blobs/sha256/c0/c0963e676944ab20c36e857c33d76a6ba2166aaa6a0d3961d6cf20fae965efd0/data
[2016-09-21 19:47:14 UTC] 47MiB docker/registry/v2/blobs/sha256/d0/d0f0d61cd0d229546b1e33b0c92036ad3f35b42dd2c9a945aeaf67f84684ce26/data
[2016-09-21 19:46:59 UTC] 2.2MiB docker/registry/v2/blobs/sha256/e1/e110a4a1794126ef308a49f2d65785af2f25538f06700721aad8283b81fdfa58/data
[2016-09-21 19:46:45 UTC] 71B docker/registry/v2/repositories/aergo-server/_layers/sha256/5345ff73e9fcf7b6c7d2d7eca2b0338ab274560ff988b8f63e60f73dfe0297ec/link
[2016-09-21 19:47:36 UTC] 71B docker/registry/v2/repositories/aergo-server/_layers/sha256/5905a7c362fbff9626d517a6ba0d8930fba34a321ba4c7bb718144d80cfaf29b/link
[2016-09-21 19:46:45 UTC] 71B docker/registry/v2/repositories/aergo-server/_layers/sha256/a696cba1f6e865421664a7bf9bf585bcfaa924d56b7d2a112a799e00a7433791/link
[2016-09-21 19:47:18 UTC] 71B docker/registry/v2/repositories/aergo-server/_layers/sha256/b419440b08d223eabe64f26d5f8556ee8d3f4c0bcafb8dd64ec525cc4eea7f6e/link
[2016-09-21 19:47:19 UTC] 71B docker/registry/v2/repositories/aergo-server/_layers/sha256/c0963e676944ab20c36e857c33d76a6ba2166aaa6a0d3961d6cf20fae965efd0/link
[2016-09-21 19:47:18 UTC] 71B docker/registry/v2/repositories/aergo-server/_layers/sha256/d0f0d61cd0d229546b1e33b0c92036ad3f35b42dd2c9a945aeaf67f84684ce26/link
[2016-09-21 19:46:59 UTC] 71B docker/registry/v2/repositories/aergo-server/_layers/sha256/e110a4a1794126ef308a49f2d65785af2f25538f06700721aad8283b81fdfa58/link
[2016-09-21 19:47:36 UTC] 71B docker/registry/v2/repositories/aergo-server/_manifests/revisions/sha256/2fc6d0a3ec447743456f6fe782622ede8095b662bb39cb10c50b2a795829e51f/link
[2016-09-21 19:47:36 UTC] 71B docker/registry/v2/repositories/aergo-server/_manifests/tags/v5/current/link
[2016-09-21 19:47:36 UTC] 71B docker/registry/v2/repositories/aergo-server/_manifests/tags/v5/index/sha256/2fc6d0a3ec447743456f6fe782622ede8095b662bb39cb10c50b2a795829e51f/link
@bacongobbler - if you have a setup locally we can work on this and see what is causing the problem. I don't have a kubernetes setup locally. The i/o timeout seems to be related to a network problem between the registry and the minio server. We need to see whether the server itself is not responding properly; I couldn't see anything wrong with mc, though.
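A crude connectivity test from the registry pod towards minio might narrow it down - a sketch, using the minio IP from the traceback above and a placeholder registry pod name; it assumes the registry image ships busybox wget:
kubectl --namespace=deis exec <registry-pod> -- wget -T 5 -qO- http://10.11.28.91:9000/
# a quick error response means the network path is fine; a hang reproduces the i/o timeout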
So, from my registry log gist: https://gist.github.com/rbellamy/c0db447ed47c364ae396b5d0c9852a02#file-deis-issue-64-registry-proxy-logs-L1242
@harshavardhana unfortunately we do not have any clusters reproducing this issue locally nor can we reproduce it ourselves, other than for the calico networking issue.
@rbellamy if you can supply information about how you set up your cluster, including your KUBERNETES_PROVIDER envvar when using kube-up.sh and what version of Workflow you're on, we can try to reproduce there. As far as e2e is concerned we aren't seeing this issue in master or in recent releases: http://ci.deis.io
@bacongobbler I included that information in a comment in this issue: #64 (comment)
Thank you! From what others have voiced earlier, this sounds related to a CoreOS issue, as seen earlier in #64 (comment). I'd recommend trying a different provider first to see if that resolves your issue.
I'm not sure how diagnostic this is, given I'm testing within a single libvirt host - however it should be noted that the host is running 2 x 12 AMD Opteron CPUs on a Supermicro MB with 128G RAM and all SSDs, and each VM is provisioned with 4G RAM and 2 CPUs, so I find it hard to believe that the issue at hand is related to an overloaded VM host or guest.
From what @bacongobbler has said, deis hasn't seen this in their e2e test runner on k8s. I'd be interested to know what the test matrix looks like WRT other providers/hosts.
Maybe this is a CoreOS-related problem? Given coreos/bugs#1554 it doesn't seem outside the realm of possibility.
Kubernetes on CoreOS (using the libvirt-coreos provider and the ./kube-up.sh script):
- master with 4 nodes (5 total VMs) does not work (see errors above)
- master with 3 nodes (4 total VMs) does not work (see errors in this comment)
- master with 2 nodes (3 total VMs) works
master with 3 nodes
INFO [aergo-server]: build aergo-server-6972f5f created
INFO [aergo-server]: rbellamy deployed 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT
INFO Pulling Docker image 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT
INFO Tagging Docker image 192.168.57.10:5000/aergo-server:1.0.0-SNAPSHOT as localhost:5555/aergo-server:v2
INFO Pushing Docker image localhost:5555/aergo-server:v2
INFO Pushing Docker image localhost:5555/aergo-server:v2
INFO Pushing Docker image localhost:5555/aergo-server:v2
INFO [aergo-server]: Put http://localhost:5555/v1/repositories/aergo-server/: read tcp 127.0.0.1:49384->127.0.0.1:5555: read: connection reset by peer
ERROR:root:Put http://localhost:5555/v1/repositories/aergo-server/: read tcp 127.0.0.1:49384->127.0.0.1:5555: read: connection reset by peer
Traceback (most recent call last):
File "/app/api/models/release.py", line 88, in new
release.publish()
File "/app/api/models/release.py", line 135, in publish
publish_release(source_image, self.image, deis_registry, self.get_registry_auth())
File "/app/registry/dockerclient.py", line 199, in publish_release
return DockerClient().publish_release(source, target, deis_registry, creds)
File "/app/registry/dockerclient.py", line 117, in publish_release
self.push("{}/{}".format(self.registry, name), tag)
File "/usr/local/lib/python3.5/dist-packages/backoff.py", line 286, in retry
ret = target(*args, **kwargs)
File "/app/registry/dockerclient.py", line 135, in push
log_output(stream, 'push', repo, tag)
File "/app/registry/dockerclient.py", line 178, in log_output
stream_error(chunk, operation, repo, tag)
File "/app/registry/dockerclient.py", line 195, in stream_error
raise RegistryException(message)
registry.dockerclient.RegistryException: Put http://localhost:5555/v1/repositories/aergo-server/: read tcp 127.0.0.1:49384->127.0.0.1:5555: read: connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/app/api/models/build.py", line 62, in create
source_version=self.version
File "/app/api/models/release.py", line 95, in new
raise DeisException(str(e)) from e
api.exceptions.DeisException: Put http://localhost:5555/v1/repositories/aergo-server/: read tcp 127.0.0.1:49384->127.0.0.1:5555: read: connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/rest_framework/views.py", line 471, in dispatch
response = handler(request, *args, **kwargs)
File "/app/api/views.py", line 181, in create
return super(AppResourceViewSet, self).create(request, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/rest_framework/mixins.py", line 21, in create
self.perform_create(serializer)
File "/app/api/viewsets.py", line 21, in perform_create
self.post_save(obj)
File "/app/api/views.py", line 258, in post_save
self.release = build.create(self.request.user)
File "/app/api/models/build.py", line 71, in create
raise DeisException(str(e)) from e
api.exceptions.DeisException: Put http://localhost:5555/v1/repositories/aergo-server/: read tcp 127.0.0.1:49384->127.0.0.1:5555: read: connection reset by peer
10.10.1.5 "POST /v2/apps/aergo-server/builds/ HTTP/1.1" 400 142 "Deis Client v2.5.1"
Maybe this is a CoreOS-related problem? Given coreos/bugs#1554 it doesn't seem outside the realm of possibility.
Yes, I do believe this is a CoreOS-related problem, as I mentioned in my previous comment. If you can try provisioning a cluster with a different provider, that would help narrow down the issue.
@bacongobbler I've used corectl and Kube-Solo with success.
@DavidSie after reading the logs a little more closely, it looks like your docker daemon is trying to push to a v1 registry endpoint.
Put http://localhost:5555/v1/repositories/spree/: dial tcp 127.0.0.1:5555: getsockopt: connection refused
Notice the v1 in there. Since buildpack deploys work fine for you, this is directly related to dockerbuilder; I wonder if it's due to the docker python library auto-detecting the client version: https://github.com/deis/dockerbuilder/blob/28c31d45a17a97473e83c451b0d2e743678620c0/rootfs/deploy.py#L106
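As a quick sanity check on which API the proxy actually exposes, something like this from a node should distinguish v1 from v2 (assuming curl on the node; a v2-only registry answers /v2/ and 404s the v1 ping):
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:5555/v2/        # expect 200 (or 401) from a v2 registry
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:5555/v1/_ping   # expect 404 when only v2 is served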
@rbellamy can you please open a separate issue? Your issue doesn't look to be the same, as the original error from your report is about minio:
error pulling image configuration: Get http://10.11.28.91:9000/registry/docker/registry/v2/blobs/sha256/59/5905a7c362fbff9626d517a6ba0d8930fba34a321ba4c7bb718144d80cfaf29b/data?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=8TZRY2JRWMPT6UMXR6I5%2F20160921%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20160921T194800Z&X-Amz-Expires=1200&X-Amz-SignedHeaders=host&X-Amz-Signature=314c92bb84dbd4dd41f9bc572e625201a32ce300394d34e8516a57382fd2ec52: dial tcp 10.11.28.91:9000: i/o timeout
Is this still the network issue we were talking about previously? @bacongobbler - let me know how I can help here.
Yes. @rbellamy believes he has nailed it down as a symptom of coreos/bugs#1554. Thank you for the offer, though!
@bacongobbler Do you know how I can fix this issue? Simply update Deis (I'm on 2.3.0 now)?
I'm not sure how this could be fixed; however, using 2.5.0 would never hurt.
I ran into this exact problem when setting up using the CoreOS tool as well. It's too bad that the CoreOS aws-cli has this problem, because the CoreOS tool works really well with CloudFormation, which makes teardown a snap after trying out Deis. kube-up does not use CloudFormation and leaves resources scattered all over your AWS account after you're done with it.
@dblackdblack even after using ./cluster/kube-down.sh? I've always found that script tears down all the AWS resources it created.
So after debugging with both @jdumars and @felixbuenemann, both clusters seem to be showing the same symptom. The problem? Requesting a hostPort on some providers - like Rancher and CoreOS - does not work. @kmala pointed me towards kubernetes/kubernetes#23920 so it looks like we found our smoking gun.
And for anyone who wants to take a crack at trying a patch, they can run through the following instructions to patch workflow-v2.7.0, removing the registry-proxy and making the controller and builder connect directly with the registry. This requires the old --insecure-registry flag to be enabled so the docker daemon can talk to the registry. Here are the commands and the patch, run on a fresh cluster that shows this symptom:
git clone https://github.com/deis/charts
cd charts
curl https://gist.githubusercontent.com/bacongobbler/0b5f2c4fe6f067ddb775d53d635cc74d/raw/992a95edb8430ebcddba526fb1c48d9d0fcc1166/remove-registry-proxy.patch | git apply -
kubectl delete namespace deis
# also delete any app namespaces so you have a fresh cluster
rm -rf ~/.helmc/workspace/charts/workflow-v2.7.0
cp -R workflow-v2.7.0 ~/.helmc/workspace/charts/
helmc generate workflow-v2.7.0
helmc install workflow-v2.7.0
Note that this will purge your cluster entirely of Workflow.
There is currently no workaround for this as far as I'm aware, but if users want to bring this issue to light they can try to contribute patches upstream to kubernetes! :)
In case anyone wants to patch workflow-dev, you can use this gist with @bacongobbler's instructions above.
@zinuzoid the instructions above use that exact patch :)
EDIT: I missed the one line change you made in your patch and the fact it's for workflow-dev. Nice catch!
@bacongobbler I also needed one extra line in workflow-dev/tpl/storage.sh to make it work :)
I'm going to close this issue as there is nothing we can do here to work around this issue in Workflow other than with the patch I provided. This is an upstream issue and patches should be applied upstream. Until then please feel free to run with the patch provided here for production deployments that rely on CNI networking. Thanks!
When applying the patch I got this "corrupt patch at line 6" message:
mbr-31107:charts jwalters$ curl https://gist.githubusercontent.com/bacongobbler/0b5f2c4fe6f067ddb775d53d635cc74d/raw/32a86cc4ddfa0a7cb173b1184ac3e288dedb5a84/remove-registry-proxy.patch | git apply -
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 3080 100 3080 0 0 3557 0 --:--:-- --:--:-- --:--:-- 3556
fatal: corrupt patch at line 6
@jwalters-gpsw try again. I just fixed the patch.
curl https://gist.githubusercontent.com/bacongobbler/0b5f2c4fe6f067ddb775d53d635cc74d/raw/992a95edb8430ebcddba526fb1c48d9d0fcc1166/remove-registry-proxy.patch | git apply -
v2.8.0 patch:
curl https://gist.githubusercontent.com/bacongobbler/0b5f2c4fe6f067ddb775d53d635cc74d/raw/248a052dd0575419d5890abaedec3a7940f3ada6/remove-registry-proxy-v2.8.0.patch | git apply -
Thanks for the updated patch. I'm running coreos on AWS. Is there a way for me to restart the docker daemons with the insecure registry option? Or would I need to redeploy the cluster?
It's easier to re-deploy the cluster if you're just getting set up. Otherwise you'll have to manually SSH into each node, modify the daemon startup flags and reboot docker on every node.
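On CoreOS that typically means a systemd drop-in along these lines on every node - a sketch only: the CIDR must match your service network, and the DOCKER_OPTS variable only takes effect if your docker unit references it:
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo tee /etc/systemd/system/docker.service.d/50-insecure-registry.conf <<'EOF'
[Service]
Environment='DOCKER_OPTS=--insecure-registry=10.3.0.0/16'
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker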
Thanks. I will give that a try. I'm also thinking about doing a Deis upgrade to the same version per the upgrade instructions, but setting the registry to an off-cluster registry.
I manually updated the worker nodes' docker config, applied your changes, and it's working fine now.
Sorry for raising this old thread, but could you please explain how to apply this patch to 2.9, which is deployed via helm and not helm classic?
You can fetch the chart locally via helm fetch deis/workflow --version=v2.9.1 --untar, modify the chart with the patch (which you'll have to manually apply since it's not in git), then install it :)
Thank you
Patched the latest helm workflow charts with @bacongobbler's suggested fixes (#64 (comment)): https://github.com/anubhavmishra/workflow.
Also make sure you are using the insecure registry option for Docker, as suggested here: https://deis.com/docs/workflow/en/v2.2.0/installing-workflow/system-requirements/#docker-insecure-registry
For v2.15.0, the recipe will be:
helm fetch deis/workflow --version=v2.15.0 --untar
cd workflow
curl https://gist.githubusercontent.com/IlyaSemenov/a8f467934cb5f1f0963469cd3eb32ace/raw/b3e8fcb5dd9094b50014177f5db72210b2949883/0001-Remove-proxy.patch|patch -p1
helm upgrade deis .
Don't forget to enable the insecure registry in /lib/systemd/system/docker.service on your Docker host(s):
ExecStart=/usr/bin/dockerd -H fd:// --insecure-registry=10.43.0.0/16
Removing the registry proxy should no longer be needed with current versions of the deis helm charts; you can set the following in your deis-workflow values.yml if you are using CNI:
global:
  host_port: 5555
  use_cni: true
  registry_proxy_bind_addr: "127.0.0.1:5555"
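and then apply it with something like the following (assuming your release is named deis and the chart comes from the deis repo):
helm upgrade deis deis/workflow -f values.yml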
It's not working on Kubernetes 1.5.4 provisioned with Rancher 1.6.2 (latest).
I think this is the related issue: rancher/rancher#5857.
I'm new to Deis and I'm encountering all kinds of problems in my journey to deploy Deis on AWS.
The latest one is when I try to deploy a Docker image to Deis. For example, for a pgadmin4 Docker image, when running deis pull ephillipe/pgadmin4 I'm getting this error:
Creating build... Error: Unknown Error (400): {"detail":"Put http://127.0.0.1:5555/v1/repositories/pgadmin4/: dial tcp 127.0.0.1:5555: getsockopt: connection refused"}
I checked the running daemonsets with kubectl --namespace=deis get daemonsets and I'm getting:
deis-logger-fluentd 2 2 2 2 2 <none> 6d
deis-monitor-telegraf 2 2 2 2 2 <none> 6d
deis-registry-proxy 0 0 0 0 0 <none> 6d
So clearly the problem is that deis-registry-proxy is not running.
Can anyone help me with this issue?
How can I start deis-registry-proxy, or, if that's not the solution, how can I deploy a Docker image?
@IulianParaian I would try the deis slack for troubleshooting. It might be that your registry proxies are crashing because the internal registry is unreachable.
@felixbuenemann I did try the Slack first but didn't get any responses. I also couldn't find good documentation or a simple example of how to deploy an app from a Docker image/Dockerfile. I'm not referring to the official Deis documentation, because there are just 3 lines of text with one command line that should work, but obviously it does not.
So maybe some more detailed tutorials with some possible troubleshooting would help.
PS: I raised another issue on the Workflow repo regarding an installation using off-cluster storage, but no response there either. And for that I also followed the official steps.
I also couldn't find good documentation or a simple example of how to deploy an app from a Docker image/Dockerfile.
I understand your frustration, though if the documentation is lacking, there are example applications provided for nearly any configuration you're looking for in the github org, and we do link to those example applications in the documentation. For example: https://github.com/deis/example-dockerfile-http
Have you taken a look at the troubleshooting documentation? That should help give you a general guideline on how you can self-troubleshoot why your cluster is not working the way it should. If all else fails you can troubleshoot directly using kubectl following kubernetes' documentation.
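In this particular case I'd start with the registry-proxy daemonset itself, roughly like this (the label matches the pods shown earlier in this thread):
kubectl --namespace=deis get daemonset deis-registry-proxy
kubectl --namespace=deis describe daemonset deis-registry-proxy    # check events for why 0 pods are scheduled
kubectl --namespace=deis get pods -l app=deis-registry-proxy -o wide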
Hi @bacongobbler, thank you for the answer.
I did troubleshoot my kubernetes cluster and noticed that the deis-registry-proxy component was not running.
This example https://github.com/deis/example-dockerfile-http is one that I tried.
As I am writing this, I went to check the deis pods again and, surprisingly, I now have 2 deis-registry-proxy instances running. That is strange; I haven't changed anything since I posted the issue.
Thanks again.