WikiMovimentoBrasil/deadlinkchecker

Script throws an error on pages with 100 or more links

Issue closed · 2 comments

Alwoch commented

The script splits the links into batches of 10 or 15 (the batch size is configurable) when a page has more than 15 links. Notably, with a batch size of 10 it only returns responses for the first two batches, even though it sends the third batch to the server. With a batch size of 15, it only returns the first batch; the second batch fails and the server logs the following error:

[2024-01-15 15:37:39,992] ERROR in app: Exception on /checklinks [POST]
Traceback (most recent call last):
  File "/data/project/deadlinkchecker/www/python/venv/lib/python3.11/site-packages/flask/app.py", line 2529, in wsgi_app
    response = self.full_dispatch_request()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/project/deadlinkchecker/www/python/venv/lib/python3.11/site-packages/flask/app.py", line 1825, in full_dispatch_request
    rv = self.handle_user_exception(e)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/project/deadlinkchecker/www/python/venv/lib/python3.11/site-packages/flask_cors/extension.py", line 176, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
                                                ^^^^^^^^^^^^^^^^^^
  File "/data/project/deadlinkchecker/www/python/venv/lib/python3.11/site-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
         ^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/project/deadlinkchecker/www/python/venv/lib/python3.11/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/project/deadlinkchecker/www/python/venv/lib/python3.11/site-packages/asgiref/sync.py", line 277, in __call__
    return call_result.result()
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/data/project/deadlinkchecker/www/python/venv/lib/python3.11/site-packages/asgiref/sync.py", line 353, in main_wrap
    result = await self.awaitable(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/project/deadlinkchecker/www/python/src/link_checker.py", line 50, in check_link
    results = await asyncio.gather(*tasks)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/project/deadlinkchecker/www/python/src/link_checker.py", line 27, in make_request
    async with session.get(url[1], ssl=False) as response:
  File "/data/project/deadlinkchecker/www/python/venv/lib/python3.11/site-packages/aiohttp/client.py", line 1167, in __aenter__
    self._resp = await self._coro
                 ^^^^^^^^^^^^^^^^
  File "/data/project/deadlinkchecker/www/python/venv/lib/python3.11/site-packages/aiohttp/client.py", line 562, in _request
    conn = await self._connector.connect(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/project/deadlinkchecker/www/python/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 540, in connect
    proto = await self._create_connection(req, traces, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/project/deadlinkchecker/www/python/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 901, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/project/deadlinkchecker/www/python/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 1147, in _create_direct_connection
    assert port is not None
           ^^^^^^^^^^^^^^^^
AssertionError
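
The traceback above ends in aiohttp's `assert port is not None`, raised from inside `asyncio.gather(*tasks)`, so a single bad link turns the whole batch into a 500. One plausible cause (an assumption, not confirmed in this issue) is a link that is not an absolute `http`/`https` URL, such as `ftp://...` or a protocol-relative `//host/path`, for which no default port is known. The sketch below shows two possible mitigations using only the standard library: filtering such URLs up front, and passing `return_exceptions=True` to `asyncio.gather` so one failure does not abort the batch. The names `is_checkable`, `fake_request`, and `check_batch` are hypothetical, and `fake_request` is a stand-in for the real `session.get` call.

```python
import asyncio
from urllib.parse import urlsplit


def is_checkable(url: str) -> bool:
    """Return True only for absolute http(s) URLs that include a host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


async def fake_request(url: str) -> int:
    """Hypothetical stand-in for the real HTTP request.

    It raises AssertionError for URLs without a usable port to mimic
    the failure shown in the traceback; otherwise it reports 200.
    """
    if not is_checkable(url):
        raise AssertionError("port is None")
    return 200


def describe(result):
    """Turn a gather() result into a status code or an error label."""
    if isinstance(result, BaseException):
        return "error: " + type(result).__name__
    return result


async def check_batch(urls):
    """Check every URL in the batch without letting one failure abort it."""
    tasks = [fake_request(u) for u in urls]
    # With return_exceptions=True, exceptions are returned as results
    # in place instead of being raised out of gather().
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return [(u, describe(r)) for u, r in zip(urls, results)]


if __name__ == "__main__":
    batch = ["https://example.org/", "ftp://example.org/file", "//example.org/x"]
    for url, outcome in asyncio.run(check_batch(batch)):
        print(url, outcome)
```

With this shape, the endpoint could report the unsupported links as errors while still returning results for the rest of the batch.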

Alwoch commented

[2024-01-27 21:24:51,868] ERROR in app: Exception on /checklinks [POST]
Traceback (most recent call last):
  File "D:\deadlinkchecker\venv\Lib\site-packages\flask\app.py", line 1455, in wsgi_app
    response = self.full_dispatch_request()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\deadlinkchecker\venv\Lib\site-packages\flask\app.py", line 869, in full_dispatch_request
    rv = self.handle_user_exception(e)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\deadlinkchecker\venv\Lib\site-packages\flask_cors\extension.py", line 176, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
                                                ^^^^^^^^^^^^^^^^^^
  File "D:\deadlinkchecker\venv\Lib\site-packages\flask\app.py", line 867, in full_dispatch_request
    rv = self.dispatch_request()
         ^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\deadlinkchecker\venv\Lib\site-packages\flask\app.py", line 852, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\deadlinkchecker\venv\Lib\site-packages\asgiref\sync.py", line 277, in __call__
    return call_result.result()
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\concurrent\futures\_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\concurrent\futures\_base.py", line 401, in __get_result
    raise self._exception
  File "D:\deadlinkchecker\venv\Lib\site-packages\asgiref\sync.py", line 353, in main_wrap
    result = await self.awaitable(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\deadlinkchecker\src\link_checker.py", line 69, in check_link
    results = await asyncio.gather(*tasks)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\deadlinkchecker\src\link_checker.py", line 42, in make_request
    async with session.get(url[1], ssl=False) as response:
  File "D:\deadlinkchecker\venv\Lib\site-packages\aiohttp\client.py", line 1187, in __aenter__
    self._resp = await self._coro
                 ^^^^^^^^^^^^^^^^
  File "D:\deadlinkchecker\venv\Lib\site-packages\aiohttp\client.py", line 601, in _request
    await resp.start(conn)
  File "D:\deadlinkchecker\venv\Lib\site-packages\aiohttp\client_reqrep.py", line 960, in start
    with self._timer:
  File "D:\deadlinkchecker\venv\Lib\site-packages\aiohttp\helpers.py", line 735, in __exit__
    raise asyncio.TimeoutError from None
TimeoutError
127.0.0.1 - - [27/Jan/2024 21:24:52] "POST /checklinks HTTP/1.1" 500 -
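
In this second traceback a single request raises `asyncio.TimeoutError`, which again propagates through `asyncio.gather(*tasks)` and produces the 500 response. A minimal sketch of one way to contain it, assuming it is acceptable to report a slow link as timed out instead of failing the batch: catch the timeout per request and return a marker value. The names `slow_fetch`, `check_one`, and `check_batch`, the `"timeout"` marker, and the 0.05 s limit are all hypothetical; `slow_fetch` stands in for the real `session.get` call, and `asyncio.wait_for` is used here only to produce the timeout with the standard library.

```python
import asyncio


async def slow_fetch(delay: float) -> int:
    """Hypothetical stand-in for an HTTP request that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return 200


async def check_one(name: str, delay: float, timeout: float = 0.05):
    """Check a single link, converting a timeout into a result value."""
    try:
        status = await asyncio.wait_for(slow_fetch(delay), timeout=timeout)
        return name, status
    except asyncio.TimeoutError:
        # The timeout is handled here, so it never reaches gather()
        # and cannot fail the other links in the batch.
        return name, "timeout"


async def check_batch(links):
    """Check all (name, delay) pairs concurrently and keep their order."""
    return await asyncio.gather(*(check_one(n, d) for n, d in links))


if __name__ == "__main__":
    for name, outcome in asyncio.run(check_batch([("fast", 0.0), ("slow", 1.0)])):
        print(name, outcome)
```

Handling the exception inside the per-link coroutine keeps the batch result list the same length as the input, which makes it easy to map results back to links.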