appsignal/appsignal-python

OpenTelemetry client may fail after all tests finish

orsinium opened this issue · 1 comment

After all tests had passed, I got a traceback from the OpenTelemetry client. It looks like a race between the client and pytest shutting down the server. I can't reproduce it reliably, though: it has happened to me only once.

Exception while exporting Span batch.
Traceback (most recent call last):
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/connection.py", line 200, in _new_conn
    sock = connection.create_connection(
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
    raise err
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/connectionpool.py", line 790, in urlopen
    response = self._make_request(
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/connectionpool.py", line 496, in _make_request
    conn.request(
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/connection.py", line 388, in request
    self.endheaders()
  File "/usr/lib/python3.10/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1037, in _send_output
    self.send(msg)
  File "/usr/lib/python3.10/http/client.py", line 975, in send
    self.connect()
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/connection.py", line 236, in connect
    self.sock = self._new_conn()
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/connection.py", line 215, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f9a6b70f7f0>: Failed to establish a new connection: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/connectionpool.py", line 844, in urlopen
    retries = retries.increment(
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/util/retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8099): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9a6b70f7f0>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/opentelemetry/sdk/trace/export/__init__.py", line 368, in _export_batch
    self.span_exporter.export(self.spans_list[:idx])  # type: ignore
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/opentelemetry/exporter/otlp/proto/http/trace_exporter/__init__.py", line 153, in export
    resp = self._export(serialized_data)
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/opentelemetry/exporter/otlp/proto/http/trace_exporter/__init__.py", line 124, in _export
    return self._session.post(
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/requests/sessions.py", line 637, in post
    return self.request("POST", url, data=data, json=json, **kwargs)
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8099): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9a6b70f7f0>: Failed to establish a new connection: [Errno 111] Connection refused'))

unflxw commented

Good catch! I haven't seen this one before.

We do force the AppSignal agent to shut down in a not-too-polite way after each test. This error looks to me like the OpenTelemetry exporter is still trying to send data to it, and failing.

There's probably something we can call on the OpenTelemetry SDK, after each test, to ask it to shut itself down. If we did that before shutting down the agent, that should avoid this.
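A minimal sketch of that ordering, assuming the test suite can get hold of the active tracer provider and of whatever callable stops the agent. `teardown_in_order` and `stop_agent` are hypothetical names used only for illustration, not part of this repository. The OpenTelemetry SDK's `TracerProvider` does expose a `shutdown()` method, which shuts down its span processors; the no-op provider returned when the SDK was never configured does not, hence the `getattr` check.

```python
def teardown_in_order(tracer_provider, stop_agent):
    """Shut down the tracer provider first, then stop the agent.

    `tracer_provider` is expected to be the object returned by
    `opentelemetry.trace.get_tracer_provider()`. `stop_agent` is a
    hypothetical callable standing in for however the tests stop the agent.
    """
    try:
        # The SDK TracerProvider has shutdown(); a proxy/no-op provider may not.
        shutdown = getattr(tracer_provider, "shutdown", None)
        if callable(shutdown):
            # Stops the span processors so nothing exports after this point.
            shutdown()
    finally:
        # Stop the agent only once the exporter side has been told to stop,
        # and do so even if the provider shutdown raised.
        stop_agent()
```

In a pytest suite this would probably live in the teardown half of an autouse fixture, called with `opentelemetry.trace.get_tracer_provider()` as the first argument. Whether that is enough to close the race seen here is untested, since the failure has only been observed once.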

(We could also stub out the calls that start the OpenTelemetry SDK entirely, or at least not register the OpenTelemetry exporter when running the tests. The unit tests don't actually care about anything that the OpenTelemetry SDK does.)