Handling broken connections
I can't keep a stream open for long before I hit a broken connection / incomplete read. After that, the sentiment keeps printing, but it is exactly the same value every time.
I guess there are two issues here:
- I'm sending requests too often
- The app can't handle this error and doesn't attempt to reconnect
How often can I send requests?
Is there a best practice for handling this by reconnecting?
```
python3 "/Users/main/Desktop/black_magic/sentiment/sentiment_plotter.py"
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tweetfeels/tweetdata.py:146: FutureWarning: pd.TimeGrouper is deprecated and will be removed; Please use pd.Grouper(freq=...)
  df = df.groupby(pd.TimeGrouper(freq=f'{int(binsize/second)}S')).size()
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 581, in _readinto_chunked
    n = self._safe_readinto(mvb)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 628, in _safe_readinto
    raise IncompleteRead(bytes(mvb[0:total_bytes]), len(b))
http.client.IncompleteRead: IncompleteRead(0 bytes read, 512 more expected)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/urllib3/response.py", line 360, in _error_catcher
    yield
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/urllib3/response.py", line 442, in read
    data = self._fp.read(amt)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 447, in read
    n = self.readinto(b)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 481, in readinto
    return self._readinto_chunked(b)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 592, in _readinto_chunked
    raise IncompleteRead(bytes(b[0:total_bytes]))
http.client.IncompleteRead: IncompleteRead(0 bytes read)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 917, in _bootstrap_inner
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tweepy/streaming.py", line 301, in _run
    six.reraise(*exc_info)
  File "/Users/main/Library/Python/3.7/lib/python/site-packages/six.py", line 693, in reraise
    raise value
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tweepy/streaming.py", line 270, in _run
    self._read_loop(resp)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tweepy/streaming.py", line 320, in _read_loop
    line = buf.read_line()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tweepy/streaming.py", line 181, in read_line
    self._buffer += self._stream.read(self._chunk_size)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/urllib3/response.py", line 459, in read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/urllib3/response.py", line 378, in _error_catcher
    raise ProtocolError('Connection broken: %r' % e, e)
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))
^CException ignored in: <module 'threading' from '/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py'>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 1273, in _shutdown
    t.join()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 1032, in join
    self._wait_for_tstate_lock()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 1048, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
```
The Twitter API is rate limited: https://developer.twitter.com/en/docs/basics/rate-limiting.html
There's nothing you can do to get around that. You are right that the library does not attempt to reconnect when the connection breaks. That's not high on my list of priorities at the moment, but I'd welcome a pull request if you feel like implementing it. You should also upgrade tweetfeels to 0.4.1 to take care of the TimeGrouper warning:
`pip install tweetfeels --upgrade`
Thanks for your reply.
Looking further into this, it seems it's not that I'm hitting their API limit. I think the issue is that my machine is failing to consume data as fast as it is produced. Changing to less popular keywords helped.
See this thread: "Streaming.py Crash on Incomplete Read Error when tweets are very high".
I'll see what I can do with regard to implementing reconnection.
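One common way to keep the reader from falling behind is to have the stream callback only enqueue each tweet and do the slow work (sentiment scoring, storage) on a separate thread. Below is a minimal sketch of that idea using only the standard library. `QueueingHandler` and its `on_data` method are hypothetical names chosen to resemble a stream listener; this is not tweetfeels or tweepy code, and whether it is enough depends on how far the consumer lags.

```python
import queue
import threading


class QueueingHandler:
    """Hypothetical listener stand-in: enqueue in the reader, process elsewhere."""

    def __init__(self, process, maxsize=10000):
        self._queue = queue.Queue(maxsize=maxsize)
        self._process = process  # slow per-tweet work, e.g. scoring and storing
        self.dropped = 0         # tweets discarded because the queue was full
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def on_data(self, raw):
        # Called from the reading thread, so it must return quickly.
        try:
            self._queue.put_nowait(raw)
        except queue.Full:
            self.dropped += 1  # shed load rather than stall the reader
        return True

    def _drain(self):
        # Worker loop: handle items in arrival order until the sentinel shows up.
        while True:
            raw = self._queue.get()
            if raw is None:
                return
            self._process(raw)

    def close(self):
        # Ask the worker to stop once everything queued so far is processed.
        self._queue.put(None)
        self._worker.join()
```

If `dropped` keeps growing, the machine really cannot keep up with the chosen keywords, and narrowing the filter (as above) is still the practical fix.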
Thanks for reporting this. I may look into using an alternative storage method for tweets if that would let the library consume tweets at a higher rate. Otherwise, I'm unsure how to fix it. The problem seems to depend largely on the hardware and connection you use, so maybe it shouldn't be treated as an issue with the module.
I built my own sentiment tool just for fun and implemented a way to handle reconnections. It has been running on a VPS for 9 days straight, and in that time it has reconnected 49 times, each within 5 seconds. I looked quickly through your code and didn't see an obvious place to add it; it seems like users need to add it themselves. I'll share my solution for reconnecting the stream here in case you feel like adding it. My app is structured a bit differently than yours, but I think you'll get the idea:
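```python
import logging
import time

import urllib3
from tweepy import Stream

LOG = logging.getLogger(__name__)

# auth, listener and count come from the rest of the app (not shown).
twitterStream = Stream(auth, listener(count))


def start_stream(stream, **kwargs):
    # Loop instead of recursing, so repeated reconnects don't grow the call stack.
    while True:
        try:
            stream.filter(**kwargs)  # blocks until the stream ends or raises
            break
        except urllib3.exceptions.ReadTimeoutError:
            LOG.exception("ReadTimeoutError exception")
        except (urllib3.exceptions.ProtocolError,
                urllib3.exceptions.IncompleteRead):
            # The traceback above surfaces as ProtocolError wrapping IncompleteRead.
            LOG.exception("Cut off because the app consumes data slower than it is produced")
        stream.disconnect()
        time.sleep(5)


# Run blocking: with is_async=True, filter() returns immediately and the error is
# raised in the worker thread, where the try/except above never sees it.
# track takes a list of phrases, not a bare string.
start_stream(twitterStream, track=['bitcoin'])
```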
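The snippet above waits a fixed 5 seconds between attempts. If the connection keeps failing, a growing delay is gentler on the API. Here is a rough, library-agnostic sketch of that idea; `run_with_backoff` and all of its parameters are hypothetical names, and the default delays are illustrative assumptions rather than values taken from Twitter's documentation.

```python
import time


def run_with_backoff(connect, retry_on, base_delay=5.0, max_delay=320.0,
                     stable_after=60.0, max_retries=None,
                     sleep=time.sleep, clock=time.monotonic):
    """Call connect() until it returns, doubling the wait after each failure."""
    delay = base_delay
    retries = 0
    while True:
        started = clock()
        try:
            return connect()
        except retry_on:
            # A connection that stayed up for a while counts as healthy,
            # so start the backoff schedule over.
            if clock() - started >= stable_after:
                delay, retries = base_delay, 0
            if max_retries is not None and retries >= max_retries:
                raise
            retries += 1
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

With the stream from this thread, `connect` could be something like `lambda: twitterStream.filter(track=['bitcoin'])` and `retry_on` a tuple of the urllib3 exceptions seen in the traceback. `sleep` and `clock` are injectable only so the schedule can be tested without waiting.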