RhinoSecurityLabs/dsnap

[BUG] Stuck on download when network failure occurs

Opened this issue · 1 comment

Describe the bug
When a network error occurs (read timeout), dsnap does not recover: the download stalls with the last few blocks still outstanding and never finishes. My theory is that some locking or bookkeeping mechanism still considers the failed thread to be running and downloading those last blocks, when in fact the thread has already exited with an exception.
Note that the number of network errors in the log (10+) is much greater than the number of blocks still missing.
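
To illustrate the theory, here is a minimal sketch of how a worker that dies on an exception can leave the main thread waiting forever. It is not dsnap's code: the queue-based layout and the names `worker`, `flaky_fetch` and `run_demo` are assumptions made for the example, and the traceback below does not confirm how dsnap actually tracks outstanding blocks.

```python
import queue
import threading


def worker(q, fetch):
    # Hypothetical worker: pull block indexes off the queue and fetch them.
    while True:
        try:
            block = q.get_nowait()
        except queue.Empty:
            return
        fetch(block)    # if this raises, the thread dies right here...
        q.task_done()   # ...and the block is never marked as done


def flaky_fetch(block):
    # Stand-in for a block download that hits a read timeout.
    if block == 2:
        raise TimeoutError("simulated read timeout")


def run_demo():
    q = queue.Queue()
    for block in range(4):
        q.put(block)

    t = threading.Thread(target=worker, args=(q, flaky_fetch))
    t.start()
    t.join()  # returns quickly: the worker has crashed on block 2

    # q.join() waits for every queued block to be marked done, so it
    # would block forever. Run it in a daemon thread to observe that safely.
    joiner = threading.Thread(target=q.join, daemon=True)
    joiner.start()
    joiner.join(timeout=0.5)
    return t.is_alive(), joiner.is_alive()


if __name__ == "__main__":
    print(run_demo())  # (False, True): worker is dead, join() still waiting
```

If dsnap waits on something similar, the symptom would match: the worker's exception is printed, but the progress counter never reaches the total.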

... 
Exception in thread Thread-3 (<lambda>):
Traceback (most recent call last):
  File "/home/intense/dsnap/venv/lib/python3.10/site-packages/urllib3/response.py", line 444, in _error_catcher
    yield
  File "/home/intense/dsnap/venv/lib/python3.10/site-packages/urllib3/response.py", line 567, in read
    data = self._fp_read(amt) if not fp_closed else b""
  File "/home/intense/dsnap/venv/lib/python3.10/site-packages/urllib3/response.py", line 533, in _fp_read
    return self._fp.read(amt) if amt is not None else self._fp.read()
  File "/usr/lib/python3.10/http/client.py", line 482, in read
    s = self._safe_read(self.length)
  File "/usr/lib/python3.10/http/client.py", line 631, in _safe_read
    data = self.fp.read(amt)
  File "/usr/lib/python3.10/socket.py", line 705, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib/python3.10/ssl.py", line 1303, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/lib/python3.10/ssl.py", line 1159, in read
    return self._sslobj.read(len, buffer)
TimeoutError: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/intense/dsnap/venv/lib/python3.10/site-packages/botocore/response.py", line 99, in read
    chunk = self._raw_stream.read(amt)
  File "/home/intense/dsnap/venv/lib/python3.10/site-packages/urllib3/response.py", line 566, in read
    with self._error_catcher():
  File "/usr/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/intense/dsnap/venv/lib/python3.10/site-packages/urllib3/response.py", line 449, in _error_catcher
    raise ReadTimeoutError(self._pool, None, "Read timed out.")
urllib3.exceptions.ReadTimeoutError: AWSHTTPSConnectionPool(host='ebs.us-east-1.amazonaws.com', port=443): Read timed out.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/intense/dsnap/venv/lib/python3.10/site-packages/dsnap/snapshot.py", line 127, in <lambda>
    t = Thread(target=lambda: self._run(func))
  File "/home/intense/dsnap/venv/lib/python3.10/site-packages/dsnap/snapshot.py", line 148, in _run
    raise e
  File "/home/intense/dsnap/venv/lib/python3.10/site-packages/dsnap/snapshot.py", line 139, in _run
    f(block)
  File "/home/intense/dsnap/venv/lib/python3.10/site-packages/dsnap/snapshot.py", line 178, in download
    b.fetch().write()
  File "/home/intense/dsnap/venv/lib/python3.10/site-packages/dsnap/snapshot.py", line 43, in write
    data = self.BlockData.read()
  File "/home/intense/dsnap/venv/lib/python3.10/site-packages/botocore/response.py", line 102, in read
    raise ReadTimeoutError(endpoint_url=e.url, error=e)
botocore.exceptions.ReadTimeoutError: Read timeout on endpoint URL: "None"
^Cved block 5451 of 5454
Aborted!

Expected behavior
Detect stalled block-download threads, clean them up after a timeout, and re-download the affected blocks.
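
One possible shape for this, as a rough sketch rather than a patch: retry each block a few times inside the worker, and always mark the block as handled so the final wait cannot hang. The names `download_all`, `worker`, `retry_on`, `max_attempts` and `backoff` are hypothetical. In dsnap the exception to retry on would presumably be `botocore.exceptions.ReadTimeoutError`, as seen in the traceback; the sketch uses the built-in `TimeoutError` to stay self-contained.

```python
import queue
import threading
import time


def worker(q, fetch, failed, retry_on=(TimeoutError,), max_attempts=3, backoff=0.0):
    # Hypothetical worker: retries a failing block instead of dying on it.
    while True:
        try:
            block = q.get_nowait()
        except queue.Empty:
            return
        try:
            for attempt in range(1, max_attempts + 1):
                try:
                    fetch(block)
                    break
                except retry_on:
                    if attempt == max_attempts:
                        failed.append(block)  # give up, report to the caller
                    else:
                        time.sleep(backoff * attempt)
        finally:
            # Always mark the block as handled, so q.join() cannot hang
            # even if fetch() raises something unexpected.
            q.task_done()


def download_all(blocks, fetch, threads=4, **worker_kwargs):
    # Returns the sorted list of blocks that still failed after all retries.
    q = queue.Queue()
    for block in blocks:
        q.put(block)
    failed = []
    workers = [
        threading.Thread(target=worker, args=(q, fetch, failed), kwargs=worker_kwargs)
        for _ in range(threads)
    ]
    for w in workers:
        w.start()
    q.join()
    for w in workers:
        w.join()
    return sorted(failed)
```

With this layout a read timeout costs one retry instead of a worker thread, and any block that keeps failing is reported back to the caller rather than silently left outstanding.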

Desktop (please complete the following information):

  • WSL2 environment
  • Linux HOSTNAME 5.15.153.1-microsoft-standard-WSL2 #1 SMP Fri Mar 29 23:14:13 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

The same thing is happening here: it's stuck at that block, with many "Read timeout" errors before it.

    return self._sslobj.read(len, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TimeoutError: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\ProgramData\anaconda3\Lib\site-packages\botocore\response.py", line 99, in read
    chunk = self._raw_stream.read(amt)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\anaconda3\Lib\site-packages\urllib3\response.py", line 566, in read
    with self._error_catcher():
  File "C:\ProgramData\anaconda3\Lib\contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "C:\ProgramData\anaconda3\Lib\site-packages\urllib3\response.py", line 449, in _error_catcher
    raise ReadTimeoutError(self._pool, None, "Read timed out.")
urllib3.exceptions.ReadTimeoutError: AWSHTTPSConnectionPool(host='ebs.us-east-1.amazonaws.com', port=443): Read timed out.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\ProgramData\anaconda3\Lib\threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "C:\ProgramData\anaconda3\Lib\threading.py", line 975, in run
    self._target(*self._args, **self._kwargs)
  File "C:\ProgramData\anaconda3\Lib\site-packages\dsnap\snapshot.py", line 127, in <lambda>
    t = Thread(target=lambda: self._run(func))
                              ^^^^^^^^^^^^^^^
  File "C:\ProgramData\anaconda3\Lib\site-packages\dsnap\snapshot.py", line 148, in _run
    raise e
  File "C:\ProgramData\anaconda3\Lib\site-packages\dsnap\snapshot.py", line 139, in _run
    f(block)
  File "C:\ProgramData\anaconda3\Lib\site-packages\dsnap\snapshot.py", line 178, in download
    b.fetch().write()
  File "C:\ProgramData\anaconda3\Lib\site-packages\dsnap\snapshot.py", line 43, in write
    data = self.BlockData.read()
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\anaconda3\Lib\site-packages\botocore\response.py", line 102, in read
    raise ReadTimeoutError(endpoint_url=e.url, error=e)
botocore.exceptions.ReadTimeoutError: Read timeout on endpoint URL: "None"
Saved block 61402 of 61429