openwallet-foundation/owl-agent-test-harness

Agent(s) fail to start within 30 seconds

JamesKEbert opened this issue · 1 comments

Unfortunately, when trying to execute the runsets for aries-vcx or acapy, the agents fail to start (while testing #833). The following is what I am attempting on a linux machine:
./manage runset ariesvcx-ariesvcx -b

This fails with:
Running runset: ariesvcx-ariesvcx, using agents: -d aries-vcx
Scope of tests: -t @RFC0036,@RFC0037,@RFC0160,@RFC0023,@revocation -t ~@T002-HIPE0011 -t ~@T014-HIPE001 -t ~@wip -t ~@RFC0183 -t ~@DIDExchangeConnection -t ~@RFC0211 -t ~@RFC0453 -t ~@RFC0025 -t ~@RFC0434 -t ~@QualifiedDIDs

Agents to build: aries-vcx
Backchannel Folder: aries-backchannels/aries-vcx
Building aries-vcx-agent-backchannel ...
[+] Building 0.8s (5/5) FINISHED                                                                                                docker:desktop-linux
 => [internal] load build definition from Dockerfile.aries-vcx                                                                                  0.0s
 => => transferring dockerfile: 129B                                                                                                            0.0s
 => [internal] load metadata for ghcr.io/hyperledger/aries-vcx/aath-backchannel:0.64.0-gmaath-064-update-7597                                   0.4s
 => [internal] load .dockerignore                                                                                                               0.1s
 => => transferring context: 269B                                                                                                               0.0s
 => CACHED [1/1] FROM ghcr.io/hyperledger/aries-vcx/aath-backchannel:0.64.0-gmaath-064-update-7597@sha256:2556044b4c09aae2d6a4ecd0034e23e2e970  0.0s
 => exporting to image                                                                                                                          0.1s
 => => exporting layers                                                                                                                         0.0s
 => => writing image sha256:6ea7b74bb8d582b090cf2de5f44ff71ffabc625eb83e62bb9632a7e1ca9de1ba                                                    0.0s
 => => naming to docker.io/library/aries-vcx-agent-backchannel                                                                                  0.0s

Build multi-platform images faster with Docker Build Cloud: https://docs.docker.com/go/docker-build-cloud
Building aries-test-harness ...
[+] Building 5.7s (11/11) FINISHED                                                                                              docker:desktop-linux
 => [internal] load build definition from Dockerfile.harness                                                                                    0.0s
 => => transferring dockerfile: 349B                                                                                                            0.0s
 => [internal] load metadata for docker.io/library/python:3.7-buster                                                                            5.5s
 => [internal] load .dockerignore                                                                                                               0.0s
 => => transferring context: 148B                                                                                                               0.0s
 => [1/6] FROM docker.io/library/python:3.7-buster@sha256:2539f956bcccbac5e4a48ebdafbbbfbd26b4ab56e65b96076ae9cd1188b119b3                      0.0s
 => [internal] load build context                                                                                                               0.0s
 => => transferring context: 6.42kB                                                                                                             0.0s
 => CACHED [2/6] COPY ./requirements.txt /aries-test-harness/                                                                                   0.0s
 => CACHED [3/6] WORKDIR /aries-test-harness                                                                                                    0.0s
 => CACHED [4/6] RUN pip install -r requirements.txt                                                                                            0.0s
 => CACHED [5/6] RUN     echo "==> Install stuff not in the requirements..."   &&     pip install --no-cache-dir         aiohttp                0.0s
 => CACHED [6/6] COPY . /aries-test-harness                                                                                                     0.0s
 => exporting to image                                                                                                                          0.0s
 => => exporting layers                                                                                                                         0.0s
 => => writing image sha256:0870116332cb9c53943f71b017ceff1c350bd875da96d97b93421a8287a2fdb1                                                    0.0s
 => => naming to docker.io/library/aries-test-harness                                                                                           0.0s

Build multi-platform images faster with Docker Build Cloud: https://docs.docker.com/go/docker-build-cloud
Agents to be used:
  Acme - aries-vcx
  Bob - aries-vcx
  Faber - aries-vcx
  Mallory - aries-vcx

starting local von-network...
[+] Running 14/14
 ✔ Network von_von                Created                                                                                                       0.1s 
 ✔ Volume "von_node2-data"        Created                                                                                                       0.0s 
 ✔ Volume "von_node3-data"        Created                                                                                                       0.0s 
 ✔ Volume "von_node4-data"        Created                                                                                                       0.0s 
 ✔ Volume "von_client-data"       Created                                                                                                       0.0s 
 ✔ Volume "von_webserver-cli"     Created                                                                                                       0.0s 
 ✔ Volume "von_webserver-ledger"  Created                                                                                                       0.0s 
 ✔ Volume "von_nodes-data"        Created                                                                                                       0.0s 
 ✔ Volume "von_node1-data"        Created                                                                                                       0.0s 
 ✔ Container von-node3-1          Started                                                                                                       0.4s 
 ✔ Container von-node4-1          Started                                                                                                       0.4s 
 ✔ Container von-webserver-1      Started                                                                                                       0.4s 
 ✔ Container von-node1-1          Started                                                                                                       0.4s 
 ✔ Container von-node2-1          Started                                                                                                       0.4s 
Want to see the scrolling container logs? Run "./manage logs"
waiting for ledger to start/respond............
starting local uniresolver...
Using: docker --log-level error compose
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3091  100  3091    0     0   967k      0 --:--:-- --:--:-- --:--:-- 1006k
[+] Running 2/2
 ✔ Container uni-resolver-web.local  Started                                                                                                    0.1s 
 ✔ Container driver-did-sov.local    Started                                                                                                    0.2s 
starting local indy-tails-server...
Using: docker --log-level error compose
[+] Running 3/3
 ✔ Network docker_tails-server            Created                                                                                               0.1s 
 ✔ Container docker-ngrok-tails-server-1  Started                                                                                               0.3s 
 ✔ Container docker-tails-server-1        Started                                                                                               0.3s 
waiting for tails server to start.
waiting for uniresolver to start.....
Starting Acme Agent using aries-vcx-agent-backchannel ...
Starting Bob Agent using aries-vcx-agent-backchannel ...
Starting Faber Agent using aries-vcx-agent-backchannel ...
Starting Mallory Agent using aries-vcx-agent-backchannel ...

waiting for Acme agent to start..............................
The agent failed to start within 30 seconds.


waiting for Bob agent to start...............................
The agent failed to start within 30 seconds.


waiting for Faber agent to start..............................
The agent failed to start within 30 seconds.


waiting for Mallory agent to start..............................
The agent failed to start within 30 seconds.


<test results here>
<agent and services shutdown here>

The logs when inspecting acme_agent for Aries VCX:

thread 'main' panicked at aries/agents/aath-backchannel/src/setup.rs:38:14:
Failed to send message: reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(192.168.65.9)), port: Some(9000), path: "/register", query: None, fragment: None }, source: hyper_util::client::legacy::Error(Connect, ConnectError("tcp connect error", Os { code: 110, kind: TimedOut, message: "Operation timed out" })) }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
The logs when inspecting acme_agent for ACA-Py:
Starting aca-py agent ...
Error loading genesis transactions:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 1025, in _wrap_create_connection
    return await self._loop.create_connection(*args, **kwargs)
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 1065, in create_connection
    raise exceptions[0]
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 1050, in create_connection
    sock = await self._connect_sock(
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 961, in _connect_sock
    await self.sock_connect(sock, address)
  File "/usr/local/lib/python3.9/asyncio/selector_events.py", line 500, in sock_connect
    return await fut
  File "/usr/local/lib/python3.9/asyncio/selector_events.py", line 535, in _sock_connect_cb
    raise OSError(err, f'Connect call failed {address}')
TimeoutError: [Errno 110] Connect call failed ('192.168.65.9', 9000)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/aries-backchannels/python/agent_backchannel.py", line 73, in default_genesis_txns
    async with session.get(f"{ledger_url}/genesis") as resp:
  File "/usr/local/lib/python3.9/site-packages/aiohttp/client.py", line 1197, in __aenter__
    self._resp = await self._coro
  File "/usr/local/lib/python3.9/site-packages/aiohttp/client.py", line 581, in _request
    conn = await self._connector.connect(
  File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 544, in connect
    proto = await self._create_connection(req, traces, timeout)
  File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 944, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
  File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 1257, in _create_direct_connection
    raise last_exc
  File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 1226, in _create_direct_connection
    transp, proto = await self._wrap_create_connection(
  File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 1033, in _wrap_create_connection
    raise client_error(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host 192.168.65.9:9000 ssl:default [Connect call failed ('192.168.65.9', 9000)]
Error retrieving ledger genesis transactions

This to me smells of docker networking issues (as the agents fail to start when reaching out to the local von network from the looks of it). Any ideas?

It was a docker networking issue. By adding DOCKERHOST=host.docker.internal before the command everything worked as expected. Thanks for the suggestion @gmulhearn!