nucypher/nucypher

Unhandled error (Unreachable)

Closed this issue · 3 comments

Describe the Bug
The TACo node throws an unhandled error. This happens about 3-4 times every 5 minutes.

To Reproduce
Run TACo node and wait.

Traceback or Screenshots (Optional)

Unhandled error while learning from (Ursula)xxxxxxxxxxxxxxxxxxx (0xaaaaaaaaaaaaaaaaa) (hex=....................):Node EXEMPT_FROM_VERIFICATION <IP ADDRESS>:9151 is unreachable: No response from <IP ADDRESS>:9151.
Unhandled error during node learning: Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/twisted/python/context.py", line 117, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/usr/local/lib/python3.8/site-packages/twisted/python/context.py", line 82, in callWithContext
    return func(*args, **kw)
  File "/usr/local/lib/python3.8/site-packages/twisted/internet/defer.py", line 900, in callback
    self._startRunCallbacks(result)
  File "/usr/local/lib/python3.8/site-packages/twisted/internet/defer.py", line 1007, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/usr/local/lib/python3.8/site-packages/twisted/internet/defer.py", line 1101, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
  File "/usr/local/lib/python3.8/site-packages/nucypher/network/nodes.py", line 615, in _discover_or_abort
    result = self.learn_from_teacher_node(eager=False, canceller=self._discovery_canceller)
  File "/usr/local/lib/python3.8/site-packages/nucypher/network/nodes.py", line 810, in learn_from_teacher_node
    response = self.network_middleware.get_nodes_via_rest(node=current_teacher,
  File "/usr/local/lib/python3.8/site-packages/nucypher/network/middleware.py", line 342, in get_nodes_via_rest
    response = self.client.post(node_or_sprout=node,
  File "/usr/local/lib/python3.8/site-packages/nucypher/network/middleware.py", line 175, in method_wrapper
    host, port, http_client = self.verify_and_parse_node_or_host_and_port(node_or_sprout, host, port)
  File "/usr/local/lib/python3.8/site-packages/nucypher/network/middleware.py", line 102, in verify_and_parse_node_or_host_and_port
    node.verify_node(
  File "/usr/local/lib/python3.8/site-packages/nucypher/network/nodes.py", line 1175, in verify_node
    response_data = network_middleware_client.node_information(host=self.rest_interface.host,
  File "/usr/local/lib/python3.8/site-packages/nucypher/network/middleware.py", line 155, in node_information
    response = self.get(
  File "/usr/local/lib/python3.8/site-packages/nucypher/network/middleware.py", line 179, in method_wrapper
    response = self._execute_method(node_or_sprout,
  File "/usr/local/lib/python3.8/site-packages/nucypher/network/middleware.py", line 240, in _execute_method
    raise RestMiddleware.Unreachable(
nucypher.network.middleware.Unreachable: Node EXEMPT_FROM_VERIFICATION <IP ADDRESS>:9151 is unreachable: No response from <IP ADDRESS>:9151

System (please complete the following information):

  • Platform: Kubernetes
  • TACo Version: v7.0.4

Related Issues

There is a very old issue regarding the same problem: #2922

Hi there 👋🏻 thanks for opening the issue. This appears to be an attempt to contact a downed node while discovering peers. Unfortunately, sometimes a peer is unreachable.

For clarification, does this cause the node to crash, or does it continue to run?

In any case -- the amount/format of node discovery logs is needlessly alarming and far too verbose. If your node continues running, I'd say this is a duplicate of #1712 and you can simply carry on running the node normally.
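
As an illustration of the quieter handling discussed here, a minimal sketch follows. It is not nucypher's actual code: `Unreachable`, `fetch_peers`, and `learn_once` are hypothetical stand-ins, and it uses the standard library `logging` module rather than the node's real logger. The idea is only that an expected, transient failure can be reported in one line, while genuinely unexpected errors keep their full traceback.

```python
import logging

logger = logging.getLogger("discovery-sketch")


class Unreachable(Exception):
    """Hypothetical stand-in for a 'peer did not respond' error."""


def fetch_peers(address: str) -> list:
    """Hypothetical network call; it always fails here to simulate a downed peer."""
    raise Unreachable(f"No response from {address}")


def learn_once(address: str, fetch=fetch_peers) -> list:
    """Try to learn peers from one teacher; return an empty list on failure."""
    try:
        return fetch(address)
    except Unreachable as e:
        # Expected and transient: log one short line, no traceback.
        logger.warning("Teacher %s is unreachable, skipping: %s", address, e)
        return []
    except Exception:
        # Unexpected: keep the full traceback for debugging.
        logger.exception("Unexpected error while learning from %s", address)
        return []
```

With this shape, calling `learn_once("192.0.2.1:9151")` would emit a single warning line and return an empty list instead of printing a stack trace.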

> does this cause the node to crash, or does it continue to run?

No, it doesn't crash. The node keeps working.

However, this issue has existed for a very long time. It is highly visible and occurs very often.
It is difficult to read the logs with so many stack traces.
Please add it to your work plan.
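
In the meantime, the logs can be made readable on the operator's side. The following is a minimal sketch of a filter, not part of nucypher: `collapse_tracebacks` is a hypothetical helper that assumes tracebacks look like the one above, with indented frame lines and Twisted's `--- <exception caught here> ---` marker. It keeps only the final exception line of each traceback.

```python
from typing import Iterable, Iterator


def collapse_tracebacks(lines: Iterable[str]) -> Iterator[str]:
    """Yield log lines, replacing each Python traceback with its final exception line."""
    in_traceback = False
    for line in lines:
        text = line.rstrip("\n")
        if "Traceback (most recent call last):" in text:
            # Start of a traceback: drop the header line itself.
            in_traceback = True
            continue
        if in_traceback:
            # Frame lines are indented; Twisted adds a '---' marker line.
            if text.startswith(" ") or text.startswith("---"):
                continue
            # First non-indented line is the exception summary: keep it.
            in_traceback = False
        yield text
```

It could be applied to a saved log or a stream, for example by iterating over `sys.stdin` and printing each yielded line.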

Yes, it is indeed very noisy! #1712 is planned for the next minor release #3361 (https://github.com/orgs/nucypher/projects/46). Closing this issue and marking it as a duplicate of #1712.