Ostorlab/oxo

Tracker agent crash on MQ connection error

3asm opened this issue · 3 comments

3asm commented

Describe the bug
The Tracker agent's emit call raises an exception when the MQ connection times out, which crashes the tracker.


Exception in thread Thread-2:
Traceback (most recent call last):
  File "/app/agent/tracker_agent.py", line 61, in start
    self.timeout_queues_checking(self.postscane_done_timeout_sec)
  File "/app/agent/tracker_agent.py", line 90, in timeout_queues_checking
    raise TimeoutError()
TimeoutError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/aiormq/base.py", line 25, in __inner
    return await self.task
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/app/agent/tracker_agent.py", line 64, in start
    self.emit('v3.report.event.post_scan.timeout', {})
  File "/usr/local/lib/python3.8/site-packages/ostorlab/agent/agent.py", line 223, in emit
    self.emit_raw(selector, message.raw)
  File "/usr/local/lib/python3.8/site-packages/ostorlab/agent/agent.py", line 246, in emit_raw
    self.mq_send_message(selector, raw)
  File "/usr/local/lib/python3.8/site-packages/ostorlab/agent/mixins/agent_mq_mixin.py", line 136, in mq_send_message
    self._loop.run_until_complete(self.async_mq_send_message(key, message, message_priority))
  File "/usr/local/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.8/asyncio/tasks.py", line 695, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/usr/local/lib/python3.8/site-packages/aiormq/base.py", line 27, in __inner
    raise self.exception from e
  File "/usr/local/lib/python3.8/site-packages/aiormq/base.py", line 168, in wrap
    return await self.create_task(func(self, *args, **kwargs))
  File "/usr/local/lib/python3.8/site-packages/aiormq/base.py", line 27, in __inner
    raise self.exception from e
  File "/usr/local/lib/python3.8/site-packages/ostorlab/agent/mixins/agent_mq_mixin.py", line 123, in async_mq_send_message
    exchange = await self._get_exchange(channel)
  File "/usr/local/lib/python3.8/site-packages/ostorlab/agent/mixins/agent_mq_mixin.py", line 55, in _get_exchange
    return await channel.declare_exchange(self._topic,
  File "/usr/local/lib/python3.8/site-packages/aio_pika/robust_channel.py", line 125, in declare_exchange
    exchange = await super().declare_exchange(
  File "/usr/local/lib/python3.8/site-packages/aio_pika/channel.py", line 246, in declare_exchange
    await exchange.declare(timeout=timeout)
  File "/usr/local/lib/python3.8/site-packages/aio_pika/exchange.py", line 81, in declare
    return await asyncio.wait_for(
  File "/usr/local/lib/python3.8/asyncio/tasks.py", line 455, in wait_for
    return await fut
  File "/usr/local/lib/python3.8/site-packages/aiormq/channel.py", line 591, in exchange_declare
    return await self.rpc(
  File "/usr/local/lib/python3.8/site-packages/aiormq/base.py", line 168, in wrap
    return await self.create_task(func(self, *args, **kwargs))
  File "/usr/local/lib/python3.8/site-packages/aiormq/base.py", line 27, in __inner
    raise self.exception from e
  File "/usr/local/lib/python3.8/asyncio/tasks.py", line 695, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/usr/local/lib/python3.8/site-packages/aiormq/base.py", line 27, in __inner
    raise self.exception from e
  File "/usr/local/lib/python3.8/asyncio/tasks.py", line 695, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/usr/local/lib/python3.8/site-packages/aiormq/base.py", line 27, in __inner
    raise self.exception from e
  File "/usr/local/lib/python3.8/site-packages/aiormq/connection.py", line 423, in __close_writer
    await writer.wait_closed()
  File "/usr/local/lib/python3.8/asyncio/streams.py", line 359, in wait_closed
    await self._protocol._get_close_waiter(self)
  File "/usr/local/lib/python3.8/site-packages/aiormq/connection.py", line 375, in __reader
    weight, channel, frame = await self.__receive_frame()
  File "/usr/local/lib/python3.8/site-packages/aiormq/connection.py", line 327, in __receive_frame
    frame_header = await self.reader.readexactly(1)
  File "/usr/local/lib/python3.8/asyncio/streams.py", line 723, in readexactly
    await self._wait_for_data('readexactly')
  File "/usr/local/lib/python3.8/asyncio/streams.py", line 517, in _wait_for_data
    await self._waiter
  File "/usr/local/lib/python3.8/asyncio/selector_events.py", line 910, in write
    n = self._sock.send(data)
ConnectionResetError: [Errno 104] Connection reset by peer

No matter what I try, I hit the same error, even with only 3 agents (everything except openvas).

When I ran the following command on my laptop:
ostorlab scan run --install --agent agent/ostorlab/nmap --agent agent/ostorlab/openvas --agent agent/ostorlab/tsunami --agent agent/ostorlab/nuclei ip 8.8.8.8

The agent tracker is crashing with the following logs:

    Exception in thread Thread-2 (start):
    Traceback (most recent call last):
    File "/usr/local/lib/python3.10/site-packages/aiormq/connection.py", line 228, in connect
        self.reader, self.writer = await asyncio.open_connection(
    File "/usr/local/lib/python3.10/asyncio/streams.py", line 48, in open_connection
        transport, _ = await loop.create_connection(
    File "/usr/local/lib/python3.10/asyncio/base_events.py", line 1076, in create_connection
        raise exceptions[0]
    File "/usr/local/lib/python3.10/asyncio/base_events.py", line 1060, in create_connection
        sock = await self._connect_sock(
    File "/usr/local/lib/python3.10/asyncio/base_events.py", line 969, in _connect_sock
        await self.sock_connect(sock, address)
    File "/usr/local/lib/python3.10/asyncio/selector_events.py", line 501, in sock_connect
        return await fut
    File "/usr/local/lib/python3.10/asyncio/selector_events.py", line 541, in _sock_connect_cb
        raise OSError(err, f'Connect call failed {address}')
    ConnectionRefusedError: [Errno 111] Connect call failed ('10.0.3.2', 5672)
agent_ostorlab_tracker_v015_3.1.7endacvfdgnv@docker-desktop    |
    The above exception was the direct cause of the following exception:
agent_ostorlab_tracker_v015_3.1.7endacvfdgnv@docker-desktop    |
    Traceback (most recent call last):
    File "/usr/local/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
        self.run()
    File "/usr/local/lib/python3.10/threading.py", line 953, in run
        self._target(*self._args, **self._kwargs)
    File "/app/agent/tracker_agent.py", line 58, in start
        self.emit("v3.report.event.scan.done", {})
    File "/usr/local/lib/python3.10/site-packages/ostorlab/agent/mixins/agent_open_telemetry_mixin.py", line 274, in emit
        super().emit(selector, data)
    File "/usr/local/lib/python3.10/site-packages/ostorlab/agent/agent.py", line 292, in emit
        self.emit_raw(selector, message.raw, message_id=message_id)
    File "/usr/local/lib/python3.10/site-packages/ostorlab/agent/agent.py", line 334, in emit_raw
        self.mq_send_message(selector, control_message)
    File "/usr/local/lib/python3.10/site-packages/ostorlab/agent/mixins/agent_mq_mixin.py", line 165, in mq_send_message
        self._loop.run_until_complete(
    File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
        return future.result()
    File "/usr/local/lib/python3.10/site-packages/ostorlab/agent/mixins/agent_mq_mixin.py", line 147, in async_mq_send_message
        async with self._channel_pool.acquire() as channel:
    File "/usr/local/lib/python3.10/site-packages/aio_pika/pool.py", line 142, in __aenter__
        self.item = await self.pool._get()
    File "/usr/local/lib/python3.10/site-packages/aio_pika/pool.py", line 104, in _get
        return await self._create_item()
    File "/usr/local/lib/python3.10/site-packages/aio_pika/pool.py", line 92, in _create_item
        item = await self.__constructor(*self.__constructor_args)
    File "/usr/local/lib/python3.10/site-packages/ostorlab/agent/mixins/agent_mq_mixin.py", line 57, in _get_channel
        async with self._connection_pool.acquire() as connection:
    File "/usr/local/lib/python3.10/site-packages/aio_pika/pool.py", line 142, in __aenter__
        self.item = await self.pool._get()
    File "/usr/local/lib/python3.10/site-packages/aio_pika/pool.py", line 104, in _get
        return await self._create_item()
    File "/usr/local/lib/python3.10/site-packages/aio_pika/pool.py", line 92, in _create_item
        item = await self.__constructor(*self.__constructor_args)
    File "/usr/local/lib/python3.10/site-packages/ostorlab/agent/mixins/agent_mq_mixin.py", line 54, in _get_connection
        return await aio_pika.connect_robust(url=self._url, loop=self._loop)
    File "/usr/local/lib/python3.10/site-packages/aio_pika/robust_connection.py", line 271, in connect_robust
        return await connect(
    File "/usr/local/lib/python3.10/site-packages/aio_pika/connection.py", line 333, in connect
        await connection.connect(
    File "/usr/local/lib/python3.10/site-packages/aio_pika/robust_connection.py", line 127, in connect
        result = await super().connect(
    File "/usr/local/lib/python3.10/site-packages/aio_pika/connection.py", line 120, in connect
        self.connection = await asyncio.wait_for(
    File "/usr/local/lib/python3.10/asyncio/tasks.py", line 408, in wait_for
        return await fut
    File "/usr/local/lib/python3.10/site-packages/aio_pika/connection.py", line 105, in _make_connection
        connection = await aiormq.connect(self.url, **kwargs)
    File "/usr/local/lib/python3.10/site-packages/aiormq/connection.py", line 542, in connect
        await connection.connect(client_properties or {})
    File "/usr/local/lib/python3.10/site-packages/aiormq/base.py", line 168, in wrap
        return await self.create_task(func(self, *args, **kwargs))
    File "/usr/local/lib/python3.10/site-packages/aiormq/base.py", line 25, in __inner
        return await self.task
    File "/usr/local/lib/python3.10/site-packages/aiormq/connection.py", line 232, in connect
        raise ConnectionError(*e.args) from e
    ConnectionError: [Errno 111] Connect call failed ('10.0.3.2', 5672)
    Traceback (most recent call last):
    File "/usr/local/lib/python3.10/site-packages/aiormq/connection.py", line 228, in connect
        self.reader, self.writer = await asyncio.open_connection(
    File "/usr/local/lib/python3.10/asyncio/streams.py", line 48, in open_connection
        transport, _ = await loop.create_connection(
    File "/usr/local/lib/python3.10/asyncio/base_events.py", line 1076, in create_connection
        raise exceptions[0]
    File "/usr/local/lib/python3.10/asyncio/base_events.py", line 1060, in create_connection
        sock = await self._connect_sock(
    File "/usr/local/lib/python3.10/asyncio/base_events.py", line 969, in _connect_sock
        await self.sock_connect(sock, address)
    File "/usr/local/lib/python3.10/asyncio/selector_events.py", line 501, in sock_connect
        return await fut
    File "/usr/local/lib/python3.10/asyncio/selector_events.py", line 541, in _sock_connect_cb
        raise OSError(err, f'Connect call failed {address}')
    ConnectionRefusedError: [Errno 111] Connect call failed ('10.0.3.2', 5672)
agent_ostorlab_tracker_v015_3.1.7endacvfdgnv@docker-desktop    |
    The above exception was the direct cause of the following exception:
agent_ostorlab_tracker_v015_3.1.7endacvfdgnv@docker-desktop    |
    Traceback (most recent call last):
    File "/app/agent/tracker_agent.py", line 107, in <module>
        TrackerAgent.main()
    File "/usr/local/lib/python3.10/site-packages/ostorlab/agent/agent.py", line 414, in main
        instance.run()
    File "/usr/local/lib/python3.10/site-packages/ostorlab/agent/agent.py", line 171, in run
        self._loop.run_until_complete(self.mq_run())
    File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
        return future.result()
    File "/usr/local/lib/python3.10/site-packages/ostorlab/agent/mixins/agent_mq_mixin.py", line 84, in mq_run
        connection = await self._get_connection()
    File "/usr/local/lib/python3.10/site-packages/ostorlab/agent/mixins/agent_mq_mixin.py", line 54, in _get_connection
        return await aio_pika.connect_robust(url=self._url, loop=self._loop)
    File "/usr/local/lib/python3.10/site-packages/aio_pika/robust_connection.py", line 271, in connect_robust
        return await connect(
    File "/usr/local/lib/python3.10/site-packages/aio_pika/connection.py", line 333, in connect
        await connection.connect(
    File "/usr/local/lib/python3.10/site-packages/aio_pika/robust_connection.py", line 127, in connect
        result = await super().connect(
    File "/usr/local/lib/python3.10/site-packages/aio_pika/connection.py", line 120, in connect
        self.connection = await asyncio.wait_for(
    File "/usr/local/lib/python3.10/asyncio/tasks.py", line 408, in wait_for
        return await fut
    File "/usr/local/lib/python3.10/site-packages/aio_pika/connection.py", line 105, in _make_connection
        connection = await aiormq.connect(self.url, **kwargs)
    File "/usr/local/lib/python3.10/site-packages/aiormq/connection.py", line 542, in connect
        await connection.connect(client_properties or {})
    File "/usr/local/lib/python3.10/site-packages/aiormq/base.py", line 168, in wrap
        return await self.create_task(func(self, *args, **kwargs))
    File "/usr/local/lib/python3.10/site-packages/aiormq/base.py", line 25, in __inner
        return await self.task
    File "/usr/local/lib/python3.10/site-packages/aiormq/connection.py", line 232, in connect
        raise ConnectionError(*e.args) from e
    ConnectionError: [Errno 111] Connect call failed ('10.0.3.2', 5672)
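
The `ConnectionRefusedError` above means nothing was accepting TCP connections at ('10.0.3.2', 5672) when the agent tried to connect. To check whether the broker is reachable from inside the agent's network, a small probe along these lines may help. This is only a diagnostic sketch: `broker_reachable` is a hypothetical helper, not part of ostorlab, and it only tests that the port accepts TCP connections, not that the AMQP handshake succeeds.

```python
import socket


def broker_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and unreachable hosts.
        return False
```

For example, `broker_reachable('10.0.3.2', 5672)` returning `False` right after the scan starts would suggest the MQ service is not up yet (or has gone away) rather than a problem in the tracker itself.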

The common factor in all these scenarios is the 'async_mq_send_message' method.

When the tracker agent restarts, the connection is restored.

Since this issue seems to occur primarily on machines with limited resources, it could be attributed to two main factors:

  1. CPU-related problems:

    • High CPU usage can lead to delays in processing incoming requests and, in extreme cases, cause the server to become unresponsive. This can impact the server's ability to handle new connections and existing ones efficiently.

  2. Memory-related problems:

    • When a server runs low on available memory, it may struggle to allocate resources for incoming connections, resulting in slower response times and potential service failures.
    • Sufficient memory is essential for buffering data during network communication. Insufficient memory can lead to connection problems when trying to buffer incoming or outgoing data.
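
Since a restart of the tracker restores the connection, the failure looks transient, so one possible mitigation is to retry the send instead of letting the exception kill the thread. Below is a minimal sketch, assuming the failure surfaces as the built-in `ConnectionError` or one of its subclasses (as the `ConnectionResetError` and `ConnectionRefusedError` in the tracebacks suggest). `call_with_retry` and its parameters are hypothetical names, not part of the ostorlab API, and whether a retry through the same pooled connection actually recovers has not been verified here.

```python
import time


def call_with_retry(send, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() and retry on ConnectionError with exponential backoff.

    Returns whatever send() returns. Re-raises the last ConnectionError
    if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return send()
        except ConnectionError:
            # ConnectionResetError and ConnectionRefusedError are subclasses
            # of the built-in ConnectionError, so both are caught here.
            if attempt == attempts - 1:
                raise
            # Wait base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))
```

In the tracker this could wrap the failing call, for example `call_with_retry(lambda: self.emit('v3.report.event.post_scan.timeout', {}))`. The backoff gives a resource-starved host some time to bring the MQ service back before the next attempt.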