OpenTelemetry client may fail after all tests finish
orsinium opened this issue · 1 comments
After all tests had passed, I got a traceback from the OpenTelemetry client. It looks like a race between the client and pytest shutting down the server. I can't reproduce it reliably, though; it has happened to me only once.
Exception while exporting Span batch.
Traceback (most recent call last):
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/connection.py", line 200, in _new_conn
sock = connection.create_connection(
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
raise err
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/connectionpool.py", line 790, in urlopen
response = self._make_request(
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/connectionpool.py", line 496, in _make_request
conn.request(
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/connection.py", line 388, in request
self.endheaders()
File "/usr/lib/python3.10/http/client.py", line 1277, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.10/http/client.py", line 1037, in _send_output
self.send(msg)
File "/usr/lib/python3.10/http/client.py", line 975, in send
self.connect()
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/connection.py", line 236, in connect
self.sock = self._new_conn()
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/connection.py", line 215, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f9a6b70f7f0>: Failed to establish a new connection: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/requests/adapters.py", line 486, in send
resp = conn.urlopen(
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/connectionpool.py", line 844, in urlopen
retries = retries.increment(
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/urllib3/util/retry.py", line 515, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8099): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9a6b70f7f0>: Failed to establish a new connection: [Errno 111] Connection refused'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/opentelemetry/sdk/trace/export/__init__.py", line 368, in _export_batch
self.span_exporter.export(self.spans_list[:idx]) # type: ignore
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/opentelemetry/exporter/otlp/proto/http/trace_exporter/__init__.py", line 153, in export
resp = self._export(serialized_data)
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/opentelemetry/exporter/otlp/proto/http/trace_exporter/__init__.py", line 124, in _export
return self._session.post(
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/requests/sessions.py", line 637, in post
return self.request("POST", url, data=data, json=json, **kwargs)
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/home/gram/.local/share/hatch/env/virtual/appsignal-beta/OY3JtW1S/appsignal-beta/lib/python3.10/site-packages/requests/adapters.py", line 519, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8099): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9a6b70f7f0>: Failed to establish a new connection: [Errno 111] Connection refused'))
Good catch! I haven't seen this one before.
We do force the AppSignal agent to shut down in a not-too-polite way after each test. This error looks to me like the OpenTelemetry exporter is still trying to send data to it, and failing.
There's probably something we can call on the OpenTelemetry SDK, after each test, to ask it to shut itself down. If we did that before shutting down the agent, that should avoid this.
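A minimal sketch of that ordering, not the project's actual teardown code. It assumes the provider in use is the SDK's `TracerProvider`, whose `shutdown()` method shuts down its span processors (for a batch processor, that flushes pending spans and stops the export thread). The helper name `shutdown_tracing_then_agent` and the `stop_agent` callable are hypothetical stand-ins for however the test suite stops the agent.

```python
from typing import Callable


def shutdown_tracing_then_agent(provider: object, stop_agent: Callable[[], None]) -> None:
    """Shut down the tracer provider first, then stop the agent.

    `provider` is expected to be whatever `opentelemetry.trace.get_tracer_provider()`
    returns. `stop_agent` is a hypothetical callable that stops the agent.
    """
    # Only the SDK provider is assumed to have shutdown(); if no SDK provider
    # was ever registered, the returned object may not, so look it up defensively.
    shutdown = getattr(provider, "shutdown", None)
    try:
        if callable(shutdown):
            # Ask the SDK to flush and stop exporting while the agent is still up.
            shutdown()
    finally:
        # Stop the agent last, even if the SDK shutdown raised.
        stop_agent()
```

In a test teardown this could be called as `shutdown_tracing_then_agent(trace.get_tracer_provider(), stop_agent)`, with `trace` imported from `opentelemetry`. Whether this fully closes the race would still need to be confirmed, since the failure was only seen once.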
(We could also stub out the calls that start the OpenTelemetry SDK entirely, or at least not register the OpenTelemetry exporter when running the tests. The unit tests don't actually care about anything that the OpenTelemetry SDK does.)
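The stubbing option could look roughly like the sketch below, which uses `unittest.mock.patch.object` from the standard library. The `Client` class and its `_start_opentelemetry` method are hypothetical stand-ins, not the real client's API; the point is only that the method that would register the exporter is replaced for the duration of the test.

```python
from unittest import mock


class Client:
    """Hypothetical stand-in for the client under test."""

    def start(self) -> str:
        self._start_opentelemetry()
        return "started"

    def _start_opentelemetry(self) -> None:
        # In the real client this is where the exporter would be registered.
        raise RuntimeError("would register the real exporter")


def start_without_opentelemetry(client: Client) -> tuple:
    """Start the client with the OpenTelemetry setup replaced by a stub."""
    with mock.patch.object(client, "_start_opentelemetry") as stub:
        result = client.start()
    # The stub records the call, so a test can still assert that setup was attempted.
    return result, stub.call_count
```

With the setup stubbed out, no exporter exists to keep sending spans after the agent is gone, at the cost of the tests no longer exercising the real SDK startup path.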