vmware-archive/admiral

Error adding host to cluster

Closed this issue · 10 comments

I am having issues connecting a Photon OS 2.0 host to an Admiral cluster using the Docker API. I get the error "Error connecting to http://10.202.3.238:2375:"

I can also access the API at the http://10.202.3.238:2375/info URL from a browser. I would appreciate any help in adding hosts to the cluster.

Hello, @tallboi123 and thanks for trying out Admiral. A few things that will help us see a better picture of the problem:

  • How are you running Admiral? Is it the vmware/admiral:latest image from Dockerhub, are you using a build of the master branch or something else (e.g. VIC product)?
  • Is the type of the cluster set to Docker?
  • Is this the first host that you are trying to connect to this cluster, i.e. are you creating a new cluster?
  • Are you able to add another Docker host (e.g. coreos, Docker machine, etc.)?

Also a few things that can help us track the cause of the problem:

  • Can you check the developer console of the browser and provide the error response from there? You should be looking for a POST to /resources/clusters (or a subpath of this if this was not the first host for the cluster)
  • Can you also provide the relevant part of the Admiral logs?

Hi Shadjiiski,
I am running Admiral as a Docker container on a Photon OS 2.0 host. I ran "docker run -d -p 8282:8282 --name admiral vmware/admiral".
Yes, the cluster type is set to Docker.
Yes, it's a new cluster.
I have not tried to add any other hosts.
logs extract --------------------------------------------------
at com.vmware.xenon.common.ServiceHost.lambda$queueOrScheduleRequestInternal$44(ServiceHost.java:4292)
at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.IllegalStateException: Configuration property harbor.tab.url not provided
... 10 more
]
[944][W][2018-02-21T09:37:06.287Z][7233][8282/auth/session][lambda$handleGet$1][Failed to retrieve session for current user: java.util.concurrent.CompletionException: java.lang.IllegalArgumentException: Provide either criteria or principalId to search for.
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)
at java.util.concurrent.CompletableFuture.uniApplyStage(CompletableFuture.java:614)
at java.util.concurrent.CompletableFuture.thenApply(CompletableFuture.java:1983)
at com.vmware.xenon.common.DeferredResult.thenApply(DeferredResult.java:164)
at com.vmware.xenon.common.ServiceRequestSender.sendWithDeferredResult(ServiceRequestSender.java:34)
at com.vmware.admiral.auth.util.PrincipalUtil.getPrincipal(PrincipalUtil.java:125)
at com.vmware.admiral.auth.util.SecurityContextUtil.getSecurityContext(SecurityContextUtil.java:52)
at com.vmware.admiral.auth.util.SecurityContextUtil.getSecurityContext(SecurityContextUtil.java:41)
at com.vmware.admiral.auth.idm.SessionService.handleGet(SessionService.java:45)
at com.vmware.xenon.common.StatelessService.handleRequest(StatelessService.java:126)
at com.vmware.xenon.common.StatelessService.handleRequest(StatelessService.java:103)
at com.vmware.xenon.common.ServiceHost.lambda$queueOrScheduleRequestInternal$44(ServiceHost.java:4292)
at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.IllegalArgumentException: Provide either criteria or principalId to search for.
at com.vmware.admiral.auth.idm.PrincipalService.handleGet(PrincipalService.java:118)
... 8 more
]
[945][I][2018-02-21T09:38:22.784Z][7241][8282][lambda$triggerForResourcePool$0][Started enumeration task for /resources/pools/d56fd2119728ac75565b5b02d0278]
[946][I][2018-02-21T09:38:22.790Z][7244][8282][lambda$triggerForResourcePool$0][Started capacity update task for /resources/pools/d56fd2119728ac75565b5b02d0278]
[947][I][2018-02-21T09:38:22.822Z][30][8282/resources/hosts][fetchSslTrustAliasProperty][Using non secure channel, skipping SSL validation for http://10.202.3.238:4243/v1.24]
[948][I][2018-02-21T09:38:22.822Z][30][EndpointCertificateUtil][validateSslTrust][Using non secure channel, skipping SSL validation for http://10.202.3.238:4243/v1.24]
[949][W][2018-02-21T09:38:22.827Z][18][8282/resources/hosts][lambda$sendAdapterRequest$25][Error sending adapter request with type Host.Container.Ping : Error connecting to http://10.202.3.238:4243: ]
[950][W][2018-02-21T09:38:22.829Z][7245][8282][startQueryTask][No result limit set on the query: {"taskInfo":{"isDirect":true},"querySpec":{"query":{"occurance":"MUST_OCCUR","booleanClauses":[{"occurance":"MUST_OCCUR","term":{"propertyName":"documentKind","matchValue":"com:vmware:photon:controller:model:resources:ComputeService:ComputeState","matchType":"TERM"}},{"occurance":"MUST_OCCUR","term":{"propertyName":"resourcePoolLink","matchValue":"/resources/pools/d56fd2119728ac75565b5b02d0278","matchType":"TERM"}}]},"options":["COUNT"]},"indexLink":"/core/document-index","nodeSelectorLink":"/core/node-selectors/default","documentVersion":0,"documentUpdateTimeMicros":0,"documentExpirationTimeMicros":0}. Defaulting to 10000]
[951][I][2018-02-21T09:38:22.834Z][7243][8282/resources/pools/d56fd2119728ac75565b5b02d0278][handleDelete][Deleting ResourcePool, Path: /resources/pools/d56fd2119728ac75565b5b02d0278, Operation ID: 1467396, Referrer: http://172.17.0.2:8282/resources/clusters]

Again Thanks for the help

Hello, @tallboi123 thanks for the details. I checked the error messages you posted. The only suspicious thing appears to be this:

[949][W][2018-02-21T09:38:22.827Z][18][8282/resources/hosts][lambda$sendAdapterRequest$25][Error sending adapter request with type Host.Container.Ping : Error connecting to http://10.202.3.238:4243: ]

Sadly, the error here is missing as well. I am currently pushing a change through the pipeline that will provide some additional information. After I push it, it will take some time for the change to reach Dockerhub as well. For now, I see that Admiral is trying to ping 10.202.3.238:4243. From your first post I got the impression that Docker is available on port 2375. Is there a chance that you entered the wrong port in Admiral?
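To compare the endpoint Admiral actually pinged with the one you meant to configure, the host and port can be pulled out of the warning line. This is only a minimal Python sketch that assumes the log format shown above; `endpoint_from_log` and `EXPECTED_PORT` are hypothetical names used for illustration and are not part of Admiral:

```python
import re
from urllib.parse import urlsplit


def endpoint_from_log(line):
    """Return (host, port) from an 'Error connecting to <url>' log line, or None."""
    match = re.search(r"Error connecting to (\S+)", line)
    if match is None:
        return None
    # The logged URL ends with a stray ':' (the error text after it is empty).
    url = urlsplit(match.group(1).rstrip(":"))
    return url.hostname, url.port


# The warning line from the logs posted in this issue.
LOG_LINE = (
    "[949][W][2018-02-21T09:38:22.827Z][18][8282/resources/hosts]"
    "[lambda$sendAdapterRequest$25][Error sending adapter request with type "
    "Host.Container.Ping : Error connecting to http://10.202.3.238:4243: ]"
)

# Assumption: the port on which the Docker Remote API was exposed.
EXPECTED_PORT = 2375

host, port = endpoint_from_log(LOG_LINE)
if port != EXPECTED_PORT:
    print(f"Admiral pinged {host}:{port}, but Docker is expected on port {EXPECTED_PORT}")
```

If the two ports differ, the address entered in the Admiral form is the first thing to check.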

I have also tried to reproduce the issue, with no success. I have done the following steps:

  • Installed OVA with virtual hardware v11 (as listed in the Photon OS 2.0 GA downloads) on vCenter 6 (ESX 6 connected)
  • Enabled access to the Docker Remote API on port 2375 and on /var/run/docker.sock by following these instructions
  • allowed connections to port 2375 in iptables: iptables -A INPUT -p tcp --dport 2375 -j ACCEPT
  • started the docker service for the first time (if it was already running, a restart is needed): systemctl start docker
  • ran Admiral in a container: docker run -d -itp 8282:8282 --name admiral vmware/admiral
  • verified that in the browser I can access http://<photon-ip>:2375/info and that the expected response is returned
  • Created a new cluster in Admiral. The cluster is of type Docker and the URL is http://<photon-ip>:2375. No credentials.
  • The cluster is created successfully. I can even provision containers in the same project.

Is this issue reproducible for you on a clean environment? If so, please provide the steps that you are following in more detail. Also, if possible, provide the full logs of the Admiral container. You can store them in a file with this command: docker logs admiral > admiral.log

Hi, I could not reproduce it on a clean environment. Thanks so much for the help.

I have hit the same issue by using wrong credentials for the Docker host. In my case, this can be guessed by looking through the logs (attached minimal logs: logs.txt). Reopening to track the change that is going to provide some additional details in the error message. This is still in the pipeline due to some recent changes that caused delays.

With 34e808c2dbe6510fabacd07228c4bc552e197876 more details should be available in the displayed error message. Closing this again as the user issue could not be reproduced and the exception in my case was expected and valid. Please, feel free to post comments and/or reopen the issue if there is something new on the topic.

I'm using the latest Docker Image of Admiral and I've still hit this problem on Manjaro x64 and Docker 18.09.

I've enabled the tcp:// directive in the /etc/docker/daemon.json file and from browser or via curl I can send requests and receive responses.

I paste here what the logs say:

[270][I][2019-01-07T10:32:35.914Z][46][EndpointCertificateUtil][validateSslTrust][Using non secure channel, skipping SSL validation for http://0.0.0.0:2375/v1.24]
[271][I][2019-01-07T10:32:35.917Z][17][RemoteApiDockerAdapterCommandExecutorImpl][hostPing][Ping host: http://0.0.0.0:2375/v1.24/_ping]
**[272][W][2019-01-07T10:32:35.934Z][78][8282/adapters/host-docker-service][fail][Fail: java.lang.Exception: Connection refused: /0.0.0.0:2375; Reason: {"message":"Connection refused: /0.0.0.0:2375"**,"stackTrace":["sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)","sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)","io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:352)","io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340)","io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:632)","io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)","io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)","io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)","io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)","java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)","java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)","java.lang.Thread.run(Thread.java:748)"],"statusCode":500,"details":["SHOULD_RETRY"],"documentKind":"com:vmware:xenon:common:ServiceErrorResponse","errorCode":0}
	at com.vmware.admiral.adapter.docker.service.AbstractDockerAdapterService.fail(AbstractDockerAdapterService.java:374)
	at com.vmware.admiral.adapter.docker.service.DockerHostAdapterService.lambda$directPing$18(DockerHostAdapterService.java:529)
	at com.vmware.admiral.adapter.docker.service.RemoteApiDockerAdapterCommandExecutorImpl.lambda$hostPing$3(RemoteApiDockerAdapterCommandExecutorImpl.java:473)
	at com.vmware.xenon.common.Operation.completeOrFail(Operation.java:1331)
	at com.vmware.xenon.common.Operation.fail(Operation.java:1309)
	at com.vmware.xenon.common.Operation.fail(Operation.java:1237)
	at com.vmware.xenon.common.http.netty.NettyHttpServiceClient.fail(NettyHttpServiceClient.java:725)
	at com.vmware.xenon.common.http.netty.NettyHttpServiceClient.lambda$connectChannel$0(NettyHttpServiceClient.java:421)
	at com.vmware.xenon.common.Operation.lambda$nestCompletion$1(Operation.java:1362)
	at com.vmware.xenon.common.Operation.completeOrFail(Operation.java:1331)
	at com.vmware.xenon.common.Operation.fail(Operation.java:1309)
	at com.vmware.xenon.common.http.netty.NettyChannelPool.fail(NettyChannelPool.java:579)
	at com.vmware.xenon.common.http.netty.NettyChannelPool.lambda$connectOrReuse$1(NettyChannelPool.java:356)
	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
	at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
	at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
	at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:327)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:343)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:632)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /0.0.0.0:2375
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:352)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340)
	... 8 more
Caused by: java.net.ConnectException: Connection refused
	... 12 more
]
[273][W][2019-01-07T10:32:35.934Z][78][8282/resources/hosts][lambda$sendAdapterRequest$25][Error sending adapter request with type Host.Container.Ping : Error connecting to http://0.0.0.0:2375: ]
[274][W][2019-01-07T10:32:35.936Z][17][8282][startQueryTask][No result limit set on the query: {"taskInfo":{"isDirect":true},"querySpec":{"query":{"occurance":"MUST_OCCUR","booleanClauses":[{"occurance":"MUST_OCCUR","term":{"propertyName":"documentKind","matchValue":"com:vmware:photon:controller:model:resources:ComputeService:ComputeState","matchType":"TERM"}},{"occurance":"MUST_OCCUR","term":{"propertyName":"resourcePoolLink","matchValue":"/resources/pools/d4883d50f2d4f07557edbbf4b29a8","matchType":"TERM"}}]},"options":["COUNT"]},"indexLink":"/core/document-index","nodeSelectorLink":"/core/node-selectors/default","documentVersion":0,"documentUpdateTimeMicros":0,"documentExpirationTimeMicros":0}. Defaulting to 10000]
[275][I][2019-01-07T10:32:35.951Z][17][8282/resources/pools/d4883d50f2d4f07557edbbf4b29a8][handleDelete][Deleting ResourcePool, Path: /resources/pools/d4883d50f2d4f07557edbbf4b29a8, Operation ID: 6375, Referrer: http://172.17.0.2:8282/resources/clusters]

I believe the error lies in how the address is parsed in the code. I've highlighted the potential error in the log above: it tries to connect to the address "/0.0.0.0:2375" when sending the ping. This should be the error, as I get 'OK' in the browser.

Hi Michele,

This is happening when you try to add a Docker host, right? What is strange is the IP address in the log message you pasted.

Can you confirm that you entered a valid host address, or send a screenshot showing what it looks like?

hi @lazarin , I've written http://0.0.0.0:2375 in the form, when adding a new host, but I get the connection error in the UI and in the logs it comes like that.
The http://0.0.0.0:2375/v1.24/_ping address is valid and it returns 200 - OK.

If I write only 0.0.0.0:2375 in the field, it triggers a Connection Refused error because I didn't set up a TLS-secured endpoint and the address defaults to https://

OK, thank you, now I got it.

You won't be able to use 0.0.0.0 or 127.0.0.1, because from inside the Admiral container these addresses refer to Admiral itself. To add that Docker host, you have to use an IP address that is reachable from within the Admiral container.
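As an illustration of that rule, here is a small Python sketch that flags addresses which would point back at the Admiral container. `is_container_local` is a hypothetical helper, not part of Admiral; it only looks at the literal address and does not resolve DNS names.

```python
import ipaddress
from urllib.parse import urlsplit


def is_container_local(address):
    """Return True for loopback or unspecified hosts (127.0.0.1, 0.0.0.0, localhost)."""
    # An address typed without a scheme needs a leading '//' so that
    # urlsplit treats it as host:port rather than as a path.
    if "://" not in address:
        address = "//" + address
    host = urlsplit(address).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name; this sketch does not resolve it
    return ip.is_loopback or ip.is_unspecified
```

With this helper, `is_container_local("http://0.0.0.0:2375")` is True, while an address such as `http://10.202.3.238:2375` is not flagged. A host that passes this check can still be unreachable for other reasons (firewall, wrong port), so it is a first filter rather than a full connectivity test.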