Unhandled `JSONDecodeError` when Twitter API returns 429
mikelei8291 opened this issue · 9 comments
Describe the bug
I have set up a systemd service to monitor the start of Twitter Spaces, and the command runs at a 20-second interval. I have multiple services monitoring multiple accounts, so it's normal to see some 429 Too Many Requests responses from the Twitter API. However, the response content doesn't seem to be valid JSON (likely HTML), which causes a JSONDecodeError to be raised.
To Reproduce
twspace_dl --input-cookie-file "$twitter_cookies" -suU "https://twitter.com/$username" -o '/tmp/twspace_dl-%(creator_screen_name)s' -m -v
Expected behavior
It's not a big issue, but I think it would be better if this can be handled properly. Maybe retry after a timeout, or a better error message?
Output (Note: username and user ID are replaced by placeholders in the following log output)
2022-06-20 09:02:39,132 [DEBUG] Starting new HTTPS connection (1): cdn.syndication.twimg.com:443
2022-06-20 09:02:39,148 [DEBUG] https://cdn.syndication.twimg.com:443 "GET /widgets/followbutton/info.json?screen_names=<username> HTTP/1.1" 200 178
2022-06-20 09:02:39,149 [DEBUG] Starting new HTTPS connection (1): twitter.com:443
2022-06-20 09:02:39,360 [DEBUG] https://twitter.com:443 "GET /i/api/fleets/v1/avatar_content?user_ids=<user_id>&only_spaces=true HTTP/1.1" 429 0
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/requests/models.py", line 910, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3/dist-packages/simplejson/__init__.py", line 525, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/twspace_dl", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/dist-packages/twspace_dl/__main__.py", line 199, in main
    args.func(args)
  File "/usr/local/lib/python3.9/dist-packages/twspace_dl/__main__.py", line 63, in space
    twspace = Twspace.from_user_avatar(args.user_url, auth_token)
  File "/usr/local/lib/python3.9/dist-packages/twspace_dl/twspace.py", line 231, in from_user_avatar
    avatar_content = requests.get(
  File "/usr/local/lib/python3.9/dist-packages/requests/models.py", line 917, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: [Errno Expecting value] : 0
Desktop (please complete the following information):
- OS: Linux
- Version: Ubuntu 21.10
- Installation method: pip
For the timeout, how much time would be adequate? I'll try to work on the error message.
Thanks for the quick response. I think a 5 to 10 second timeout would be enough, because I have 7 service instances running with a 3-second interval between each of them, and no others were reporting 429 when this error was produced.
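The retry-after-a-timeout idea could look roughly like the sketch below. It is not code from twspace_dl: the helper name `get_with_retry`, its parameters, and the default of three attempts are assumptions made for illustration, and the 5-second delay is taken from the range suggested above. The helper accepts any callable that returns an object with a `status_code` attribute, so it can be tried without network access.

```python
import time
from typing import Any, Callable


def get_with_retry(
    fetch: Callable[[], Any],
    max_attempts: int = 3,          # assumed default, not from the project
    delay: float = 5.0,             # seconds; lower end of the 5-10 s range
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fetch() again while it keeps returning HTTP 429.

    Returns the first response that is not 429, or the last 429 response
    once max_attempts calls have been made.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = fetch()
        # Stop on any non-429 status, or when no attempts are left.
        if response.status_code != 429 or attempt == max_attempts:
            return response
        sleep(delay)  # wait before asking again
```

With the requests library this could be called as `get_with_retry(lambda: requests.get(url, headers=headers))`, where `url` and `headers` stand for whatever the caller already uses.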
However, I think it would also be OK if it just printed a meaningful error message and/or returned a custom exit code, because the command will be run again by systemd after a set interval. For the custom exit code, according to this guide, the program could use anything in the range other than the reserved ones, or could use the C/C++ conventions defined in /usr/include/sysexits.h. Suitable ones include (from /usr/include/sysexits.h):
#define EX_UNAVAILABLE 69 /* service unavailable */
#define EX_TEMPFAIL 75 /* temp failure; user is invited to retry */
Though this may only apply to Linux.
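For reference, Python exposes these values as `os.EX_UNAVAILABLE` and `os.EX_TEMPFAIL`, but only on Unix, which matches the portability concern. A minimal sketch that falls back to the literal numbers elsewhere is shown below; the function `exit_code_for_status` and its mapping from HTTP status to exit code are hypothetical and are not something the project defines.

```python
import os

# Values quoted from sysexits.h above. The os.EX_* constants exist only on
# Unix, so fall back to the literal numbers on other platforms.
EX_UNAVAILABLE = getattr(os, "EX_UNAVAILABLE", 69)
EX_TEMPFAIL = getattr(os, "EX_TEMPFAIL", 75)


def exit_code_for_status(status_code: int) -> int:
    """Hypothetical mapping from an HTTP status to a process exit code."""
    if status_code == 429:
        return EX_TEMPFAIL       # rate limited: retrying later should work
    if status_code >= 500:
        return EX_UNAVAILABLE    # failure on the server side
    if 200 <= status_code < 300:
        return 0                 # success
    return 1                     # generic failure for everything else
```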
I don't think I'm gonna use a custom exit code because all of Python's methods for doing so are pretty hacky (hooking into the exception handling system).
Here's the error handling code
try:
    avatar_content = avatar_content_res.json()
except JSONDecodeError as err:
    logging.debug(avatar_content_res.text)
    raise LookupError(
        (
            "Response is incorrect JSON! "
            "You probably hit Twitter's ratelimit, retry later."
        )
    ) from err
Yeah, it can be hacky, and you may have done something different with the exception handling, so no worries, but I think it's possible to catch these exceptions in __main__.space() and call exit() with the custom exit code. It just makes it easier to determine the status of the program in scripts.
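A rough sketch of that catch-and-exit pattern follows. The exception class `RateLimitError` and the wrapper `run` are invented for illustration and do not exist in twspace_dl; the value 75 is the EX_TEMPFAIL number quoted from sysexits.h earlier in the thread.

```python
import sys
from typing import Any, Callable

EX_TEMPFAIL = 75  # "temp failure; user is invited to retry" in sysexits.h


class RateLimitError(ValueError):
    """Hypothetical exception raised when the API answers with HTTP 429."""


def run(func: Callable[..., Any], *args: Any) -> int:
    """Run func(*args) and translate a rate-limit error into an exit code."""
    try:
        func(*args)
    except RateLimitError as err:
        # Report on stderr so scripts can still capture stdout cleanly.
        print(f"error: {err}", file=sys.stderr)
        return EX_TEMPFAIL
    return 0


# In an entry point this would be used as: sys.exit(run(main))
```

Returning the code from `run` and calling `sys.exit` only at the very top keeps the wrapper easy to test and avoids touching the exception hook.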
For the error handling code, I think it would be better to check avatar_content_res.status_code first to make sure it's requests.codes.ok (200) before calling .json() on it, and to check whether it is requests.codes.too_many (429) to be certain that a rate limit has been reached. Also, I think ValueError fits the context better here, because LookupError is a base class and it's mostly used for key and index errors on mappings and sequences.
That makes sense. Thank you for your input, I'll do that.
if avatar_content_res.ok:
    avatar_content = avatar_content_res.json()
else:
    logging.debug(avatar_content_res.text)
    if avatar_content_res.status_code == requests.codes.too_many:
        raise ValueError(
            (
                "Response status code is 429! "
                "You hit Twitter's ratelimit, retry later."
            )
        )
    if 400 <= avatar_content_res.status_code < 500:
        raise ValueError(
            "Response code is in the 4XX range. Bad request on our side"
        )
    if avatar_content_res.status_code >= 500:
        raise ValueError(
            "Response code is over 500. There was an error on Twitter's side"
        )
    raise ValueError("Can't get proper response")
Looks great! Thanks for the quick response and fix!
Alright, I'll commit it. I don't think I'll make a release just for it, though.
Thank you so much! I'll build from source so no worries.