docker/for-mac

Docker desktop 3.3.1 cannot connect to exposed ports

Closed this issue · 21 comments

  • [x] I have tried with the latest version of Docker Desktop
  • [x] I have tried disabling enabled experimental features
  • I have uploaded Diagnostics
  • Diagnostics ID:

Expected behavior

Docker containers that have ports exposed should be accessible via the loopback host.

Actual behavior

Containers are no longer reachable. I can hop on a container and access all containers on the docker network, but no access via localhost.

Information

  • macOS Version: 11.2.2
  • Intel chip or Apple chip: Intel
  • Docker Desktop Version: 3.3.1

Steps to reproduce the behavior

  1. Docker stacks that worked without issues till 3.2.2 became inaccessible with 3.3.1
  2. Reverting back to 3.2.2 made them all available.
  3. Does 3.3 series implement a k8s style access policy? In other words do I need to run some kind of proxy service to make the containers accessible via localhost?
djs55 commented

@sptrakesh thanks for your report. Could you provide a small repro example which I could use to investigate?

Here is a simple stack file that stopped working when I upgraded to 3.3.1. I could not access the service on port 2000.

version: '3.7'

services:
  mongo:
    image: mongo
    ports:
      - "27017:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: test
      MONGO_INITDB_ROOT_PASSWORD: test
    volumes:
      - $DATA_DIR/mongo:/data/db

  mongo-service:
    image: sptrakesh/mongo-service
    stop_signal: SIGTERM
    ports:
      - "2000:2000"
    environment:
      - "MONGO_URI=mongodb://test:test@mongo/admin?authSource=admin&compressors=snappy&w=1&maxPoolSize=1000&maxIdleTimeMS=30000"
      - VERSION_HISTORY_DATABASE=versionHistory
      - VERSION_HISTORY_COLLECTION=entities
      - METRICS_COLLECTION=metrics
      - LOG_LEVEL=debug
      - THREADS=8
    volumes:
      - $DATA_DIR/mongo-service:/opt/spt/logs

I can also reproduce the issue on some machines with:

docker swarm init
docker service create --publish 10000:8080 jmalloc/echo-server
curl -v http://localhost:10000

On success the output of curl is:

*   Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 10000 (#0)
> GET / HTTP/1.1
> Host: localhost:10000
> User-Agent: curl/7.64.1
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: text/plain
< Date: Thu, 22 Apr 2021 05:12:32 GMT
< Content-Length: 107
< 
Request served by 67008e9afa38

HTTP/1.1 GET /

Host: localhost:10000
User-Agent: curl/7.64.1
Accept: */*

* Connection #0 to host localhost left intact
* Closing connection 0

Whereas some machines produce:

*   Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 10000 (#0)
> GET / HTTP/1.1
> Host: localhost:10000
> User-Agent: curl/7.64.1
> Accept: */*
> 
* Empty reply from server
* Connection #0 to host localhost left intact
curl: (52) Empty reply from server
* Closing connection 0

One of the machines was fixed by leaving and re-initializing the swarm, but this has not been universally successful.
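
To tell the two curl outcomes above apart in a script, one option is to look at curl's exit status instead of reading the verbose output. The sketch below is only an illustration: `describe_curl_exit` and `probe_port` are hypothetical helper names, and the hint text for each code is an interpretation, not something Docker or curl prints.

```shell
#!/bin/sh
# Map a curl exit status to a hint about where a published port is failing.
# The hint wording is this sketch's own interpretation of each code.
describe_curl_exit() {
    case "$1" in
        0)  echo "reachable: the service answered" ;;
        7)  echo "connection refused: nothing is listening on the host port" ;;
        28) echo "timed out: no answer within the time limit" ;;
        52) echo "empty reply: the connection was accepted but no response came back" ;;
        *)  echo "curl failed with exit code $1" ;;
    esac
}

# Probe a published port on localhost and print the matching hint.
probe_port() {
    status=0
    curl --silent --show-error --max-time 5 --output /dev/null \
        "http://localhost:$1" || status=$?
    describe_curl_exit "$status"
}
```

For the swarm example above this would be called as `probe_port 10000`; the failing machines in this thread correspond to the exit code 52 case.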

I had tried re-initialising the swarm, but with no effect.

djs55 commented

Thanks for the extra info. If it's an intermittent problem or only affects some machines, then it might be caused by a bug in the iptables rules in 3.3.1. Could you try the latest developer build:

Note that neither of these builds has been notarized. If it still doesn't work, could you reproduce the problem, upload a diagnostics report and quote the ID here? Thanks in advance!

This is happening to me as well. And apparently it's not intermittent. Since this morning it's always happening.

@djs55 I tried with the latest developer build as suggested but the issue is still there

Here's the diagnostic ID for my report: EC234789-274A-4136-927F-AD7DF63EC863/20210423084007

I have the same issue.

Using docker stack deploy --compose-file=docker-compose.yml db
with the standard Docker Hub image mysql:5.6.51 and a compose file exposing port 3306:3306, I cannot connect from the host machine. If I start it manually with docker-compose up, it works. This worked fine on previous versions of Docker Desktop for Mac; it only failed once I upgraded to 3.3.1.

Anecdotally, it seems that several of my colleagues report NOT being prompted for their desktop's password during the upgrade. Some were prompted after some combination of restarting Docker Desktop and/or their laptops, after which ingress started working again. Others have had success with a complete reinstallation.

My team had the same issue, this is how we fixed it without downgrading:

  1. Under settings->resources-> network change to default (192.168.65.0/28) -> restart
  2. Run docker swarm leave --force
  3. Restart docker again (!)
  4. Run docker swarm init

Same issue here, just before a meeting!! :S

My team had the same issue, this is how we fixed it without downgrading:

1. Under settings->resources-> network change to default (192.168.65.0/28) -> restart

2. Run `docker swarm leave --force`

3. Restart docker again (!)

4. Run `docker swarm init`

This worked for us!!

Additionally, I restored the network CIDR back to the original (/24) and followed the same process, so we ended up with the original configuration, and it worked fine! :)

We are facing the same issue too, but the solution below didn't work for me.

My team had the same issue, this is how we fixed it without downgrading:

  1. Under settings->resources-> network change to default (192.168.65.0/28) -> restart
  2. Run docker swarm leave --force
  3. Restart docker again (!)
  4. Run docker swarm init

Everyone on this thread is using docker stack deploy. Can you confirm that the bug is not seen using docker-compose up with the same compose file?

Maybe a bit off topic, but I'm also interested in why people would choose docker stack deploy over docker-compose up on a single machine.

@stephen-turner I can confirm, docker stack deploy failed where docker-compose up worked with the exact same compose file.

We use docker stack because it supports multiple containers of the same instance. This mirrors production more closely and lets us test that our multi-container features work properly, i.e. connecting to the swarm and sharing data between containers.

May or may not be relevant, but I can confirm the same issue when using docker-compose instead of docker stack deploy.

Additionally (at least for me) it does appear to be an intermittent issue, although it fails the vast majority of the time (success rate something like 10%)

The steps at #5610 (comment) worked for me. Thanks @amirvaza!

macOS 11.3.1, Intel chip, Docker Desktop Version: 3.3.3 (64133), Engine 20.10.6, Compose 1.29.1

djs55 commented

I think most of the problems seen here are caused by the internal network changing and breaking swarm. As an experiment I created a swarm service (docker service create --publish 8080:80 nginx), waited for it to start working, then I changed the network range in the Preferences UI, waited for the application to restart and discovered that the ports never become reachable. Furthermore I see the old IP address in the dockerd.log:

% less ~/Library/Containers/com.docker.docker/Data/log/vm/dockerd.log
...
... level=warning msg="grpc: addrConn.createTransport failed to connect to {192.168.65.3:2377  <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial tcp 192.168.65.3:2377: i/o timeout\". Reconnecting..." module=grpc
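
To check for the same symptom on another machine, the log can be scanned for the addresses that dockerd keeps failing to dial. This is a minimal sketch that assumes the warning has the same shape as the line quoted above; `stale_manager_addrs` is a hypothetical helper name.

```shell
#!/bin/sh
# Read dockerd log lines on stdin and print each distinct address:port
# that appears in a "createTransport failed to connect" warning.
stale_manager_addrs() {
    grep 'createTransport failed to connect' \
        | sed -n 's/.*failed to connect to {\([0-9.]*:[0-9]*\).*/\1/p' \
        | sort -u
}
```

It could be run as `stale_manager_addrs < ~/Library/Containers/com.docker.docker/Data/log/vm/dockerd.log`. An address outside the currently configured network range would be consistent with the swarm still pointing at the old network.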

There have been some tweaks to the network configuration recently to accommodate the new virtualization.framework experimental feature. I suspect these have triggered this problem over upgrade.

The Swarm docs mention that managers must have a static IP. This means that changing the network range will always require a

docker swarm leave --force 
# Now restart Docker Desktop because of a related network bug
docker swarm init

as in @amirvaza's comment above.
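
For anyone who wants to script the sequence, here is a rough sketch. It rests on several assumptions that do not come from this thread: that quitting and reopening the macOS app via `osascript` and `open -a Docker` counts as a restart, that a fixed pause is long enough for the app to exit, and that `DRY_RUN` (a knob invented for this sketch) is useful for previewing the commands. Changing the network range in the settings UI is not covered.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1 (hypothetical knob).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Assumption: quitting and reopening the app is an acceptable restart.
restart_docker_desktop() {
    run osascript -e 'quit app "Docker"'
    run sleep 10   # crude pause so the app can exit; tune as needed
    run open -a Docker
    # Outside dry-run mode, wait until the daemon answers again.
    if [ "${DRY_RUN:-0}" != "1" ]; then
        until docker info >/dev/null 2>&1; do sleep 2; done
    fi
}

# Leave the swarm, restart Docker Desktop, then create a fresh swarm.
reinit_swarm() {
    run docker swarm leave --force
    restart_docker_desktop
    run docker swarm init
}
```

Running `DRY_RUN=1; reinit_swarm` prints the commands without executing them. Services and stacks have to be redeployed afterwards, since leaving the swarm discards them.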

I'll investigate how we could make this process simpler.

In the meantime, could everyone who is having an issue follow the advice in @amirvaza's comment above, reinitialize the swarm, re-deploy your service and check whether it works? If it does not, it must be a different issue. In that case could you file a separate ticket (to avoid confusion), upload diagnostics, quote the ID and link it to/from this one?

@sebtoombs do I understand correctly that you have a problem with docker-compose, but that docker stack deploy seems to work? The code paths are different so I think the bug is probably different too. Could you file a separate ticket describing your docker-compose scenario, upload some diagnostics and quote the ID?

Thanks!

@djs55 I can't confirm this with certainty, but here is a thought that is consistent with your comment: I suspect that the network changing when we connect to or disconnect from our VPN might cause such issues. I haven't been able to reproduce it yet.

@djs55 The instructions to change to the default network fixed the issue for me. I just updated to the latest 3.3.3, and am able to use my stack like I used to with 3.2 releases.
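
Since the fix depends on the swarm's address matching the configured network range, a small check like the one below might help decide whether a re-init is needed. It is a sketch only: `ip_to_int` and `ip_in_cidr` are hypothetical helpers, they handle IPv4 only, and the ranges in the usage note are just the ones mentioned in this thread.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed if address $1 lies inside CIDR range $2 (e.g. 192.168.65.0/24).
ip_in_cidr() (
    addr=$(ip_to_int "$1")
    base=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( base & mask )) ]
)
```

For example, `ip_in_cidr 192.168.65.3 192.168.65.0/28 && echo "inside range"` succeeds. The address to test could be taken from the dockerd.log warning, or from `docker info --format '{{.Swarm.NodeAddr}}'`, assuming that field is populated on your Engine version.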
