sonic182/aiosonic

RuntimeError: readuntil() called while another coroutine is already waiting for incoming data

geraldog opened this issue · 3 comments

Describe the bug
My crawler intermittently fails with:
RuntimeError: readuntil() called while another coroutine is already waiting for incoming data

The stack trace does not help locate the bug. The exception is raised at https://github.com/python/cpython/blob/d9efa45d7457b0dfea467bb1c2d22c69056ffc73/Lib/asyncio/streams.py#L525 but that alone explains little.
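For context, the check in streams.py can be triggered in isolation. The following is a minimal sketch, not taken from aiosonic or from my crawler: it assumes a bare asyncio.StreamReader with no data fed into it, and the `reproduce` function name is made up for illustration. It only shows what the message means (two readers waiting on one stream), not how the crawler gets into that state.

```python
import asyncio


async def reproduce() -> str:
    # A bare reader with nothing fed into it, so any read has to wait.
    reader = asyncio.StreamReader()

    # First reader: starts waiting for a separator that never arrives.
    first = asyncio.create_task(reader.readuntil(b"\r\n"))
    await asyncio.sleep(0)  # yield once so the task parks on the reader's waiter

    message = ""
    try:
        # Second reader on the same stream while the first is still waiting.
        await reader.readuntil(b"\r\n")
    except RuntimeError as exc:
        message = str(exc)
    finally:
        # Clean up the first reader so the event loop can shut down quietly.
        first.cancel()
        try:
            await first
        except asyncio.CancelledError:
            pass
    return message


message = asyncio.run(reproduce())
print(message)
```

The second call fails immediately instead of queueing, because a StreamReader keeps only a single waiter at a time.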

After days of coding and tracing with print(), I found that even cancelling the waiter, so that streams.py does not raise the RuntimeError, is pointless. The real cause of the bug is that connection.close() is called twice from different code paths, a concurrency mess.
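One common way to make a double close harmless is an idempotent close guard. The sketch below is only an illustration under stated assumptions: `GuardedConnection` and `StubWriter` are hypothetical names, not aiosonic classes, and this is not necessarily what the draft fix does.

```python
import asyncio


class StubWriter:
    """Hypothetical stand-in for a stream writer; counts real close() calls."""

    def __init__(self) -> None:
        self.close_count = 0

    def close(self) -> None:
        self.close_count += 1


class GuardedConnection:
    """Hypothetical wrapper (not aiosonic's API) whose close() is idempotent."""

    def __init__(self, writer: StubWriter) -> None:
        self._writer = writer
        self._closed = False

    def close(self) -> None:
        # No await between the check and the flag update, so two coroutines
        # on the same event loop cannot both get past the guard.
        if self._closed:
            return
        self._closed = True
        self._writer.close()


async def demo() -> int:
    writer = StubWriter()
    conn = GuardedConnection(writer)

    async def timeout_path() -> None:
        await asyncio.sleep(0)
        conn.close()

    async def error_path() -> None:
        await asyncio.sleep(0)
        conn.close()

    # Two independent code paths both decide to close the same connection.
    await asyncio.gather(timeout_path(), error_path())
    return writer.close_count


close_count = asyncio.run(demo())
print(close_count)
```

A guard like this only covers the double close itself. If one of the two paths goes on to read from the connection afterwards, that still needs to be handled separately.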

To Reproduce
Steps to reproduce the behavior:

  1. Take a crawler written on top of aiosonic
  2. Start the crawler with a reasonably high concurrency, say 300 "clients" in the pool
  3. Remember to catch your exceptions
  4. See error

Expected behavior
No RuntimeError should be raised. readuntil(), read(), or any other stream-reading awaitable that consumes from the StreamReader byte buffer should never be called while a previous call is still waiting.

Screenshots
None

Desktop (please complete the following information):
Not applicable

Smartphone (please complete the following information):
Not applicable

Additional context
Hi @sonic182, and sorry for the delay in filing this issue.
I wanted to have a fix before discussing any of this.
I have a draft of a fix and will file the PR today.
Thanks for everything!

Draft fix is at #474

Fixed in 0.19.0

Hi @sonic182

I'm test-crawling the top 1 million Cloudflare Radar domains.

In the end we alleviated the problem a lot, but after a million domains I still end up with around 20 RuntimeErrors. That is not much, maybe one every 50,000 domains or so, but still worth fixing in #483