Error handling in FC2LiveStream.get_meta
15532th opened this issue · 3 comments
I got an exception in my instance of autofc2 with following stacktrace:
2023-03-18 16:24:24 [fc2 [redacted]] Fetching stream info
Task exception was never retrieved
future: <Task finished name='Task-2733449' coro=<AutoFC2.handle_channel() done, defined at [redacted]/lib/python3.11/site-packages/fc2_live_dl/autofc2.py:90>
exception=KeyError('data')>
Traceback (most recent call last):
File "[redacted]/lib/python3.11/site-packages/fc2_live_dl/autofc2.py", line 93, in handle_channel
await fc2.download(channel_id)
File "[redacted]/lib/python3.11/site-packages/fc2_live_dl/FC2LiveDL.py", line 97, in download
await live.wait_for_online(self.params["wait_poll_interval"])
File "[redacted]/lib/python3.11/site-packages/fc2_live_dl/fc2.py", line 209, in wait_for_online
while not await self.is_online():
^^^^^^^^^^^^^^^^^^^^^^
File "[redacted]/lib/python3.11/site-packages/fc2_live_dl/fc2.py", line 215, in is_online
meta = await self.get_meta(refetch=refetch)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "[redacted]/lib/python3.11/site-packages/fc2_live_dl/fc2.py", line 277, in get_meta
data["data"]["channel_data"]["title"]
~~~~^^^^^^^^
KeyError: 'data'
2023-03-30 06:50:59 [fc2 [redacted]] Fetching stream info
Since there is "Fetching stream info" line after stacktrace, I assume handle_channel
task got finished after unhandled exception and was restarted by AutoFC2._main
, so whole thing would probably continue working fine after that. But as get_meta
makes network request, parses and processes response, it would make sense to handle cases where something of that goes wrong.
The response that caused exception must have been a valid json, since it was successfully decoded, so it might be some kind of 50x error, making it related to #10. I tried to trigger different network-related errors by blocking internet access and pointing api url to different places, and result is generally the same: unhandled exception in get_meta
causing restart of handle_channel
task.
From my understanding, it would make sense to check for non-200 status, catch KeyError (valid json without expected fields), aiohttp.client_exceptions.ContentTypeError (not a valid json), asyncio.TimeoutError (in case of network timeout) or any other exception and raise custom exception with appropriate error text, which could then be caught in FC2LiveDL.download
.
FC2LiveStream.wait_for_online
seems like a good place to handle exponential backoff from #10, but which of possible errors should cause retry and which should be raised immediately? Also, what should be lower and upper limit on delay before retrying? Since there is wait_poll_interval
in wait_for_online
context, it would make sense to start with it, but should upper bound involve hardcoded value of 5-10 minutes, be multiple of wait_poll_interval
, or both?
Hi, thanks for the detailed report. Sorry it took a while for me to respond, somehow this issue got buried.
Yeah I think we can add some retry and exponential backoff at FC2LiveStream.wait_for_online
. For simplicity's sake maybe we can just catch all exceptions apart from KeyboardInterrupt
. The exact number for the backoff doesn't matter too much but I usually go from 1 second, doubling until some hardcoded limit. I think 5 minutes is a good number.
Let me know if you'd like to make a PR for this.
Yes, I would make a PR, unless you would find doing it yourself faster than discussing all the details.
In the context of FC2LiveStream.wait_for_online
, there is not enough information to provide a meaningful error message, and using a generic one while hiding the stacktrace would make debugging abnormalities substantially harder. I suggest wrapping either every step of get_meta
in try-except clause that would print a warning with appropriate details and then re-raise/raise custom exception, which could then be caught in wait_for_online
and other places using get_meta
.
Ended up doing it in wait_for_online
, since it's indeed simpler. Hopefully the way warning about exception is printed will still provide enough information.
Additionally get_meta
now calls raise_for_status
to get non-200 status code error reported as such.
Perhaps #10 can also be closed now?