ckoepp/TwitterSearch

Search strings with special/punctuation characters cause unexpected exceptions

nkartashov opened this issue · 2 comments

While using your lib, I've run into two problems. A ValueError is thrown when the search string contains certain characters such as '(', ')', '[', ']', '$', '?' or "'" (apostrophe), and TwitterSearch.TwitterSearchException.TwitterSearchException: Error 401: Unauthorized is raised when I use 'test=' or 'test=foo' (basically any time the string contains an '=' character). Code producing the aforementioned exceptions (CONSUMER_KEY, CONSUMER_SECRET, TOKEN_KEY and TOKEN_SECRET are keys specific to my application and are working):

Python 2.7.5, TwitterSearch 0.78.3

import logging
import traceback
import TwitterSearch

def download_tweets(search_string, language):
    """Returns tweets containing <search_string>; <language> should be a code like 'en' or 'ru'."""

    tso = TwitterSearch.TwitterSearchOrder()
    tso.addKeyword(search_string)
    tso.setLanguage(language)
    tso.setIncludeEntities(False)

    # create a TwitterSearch object with our secret tokens
    ts = TwitterSearch.TwitterSearch(
        consumer_key=CONSUMER_KEY,
        consumer_secret=CONSUMER_SECRET,
        access_token=TOKEN_KEY,
        access_token_secret=TOKEN_SECRET
    )
    try:
        return ts.searchTweetsIterable(tso)

    except TwitterSearch.TwitterSearchException as e:
        logging.exception("%s: %s", e.code, e.message)
        logging.exception("Stack trace: %s", traceback.format_exc())
        # bare raise re-raises with the original traceback intact
        raise

download_tweets("test=", "en")
download_tweets("test=foo", "en")
download_tweets("test'", "en")
download_tweets("test$", "en")
download_tweets("test?", "en")
download_tweets("test(", "en")
download_tweets("test)", "en")
download_tweets("test[", "en")
download_tweets("test]", "en")

It looks like something is wrong with the URL encoding: special characters should be encoded before querying the API, otherwise Twitter may respond with errors like these.

I'll have a look at this issue within the next few days. Thanks for reporting it!

As a quick and dirty workaround, URL-encode the strings yourself: call download_tweets("test%28","en") instead of download_tweets("test(","en").
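Rather than encoding each string by hand, the workaround can be wrapped in a small helper. This is only a sketch: encode_keyword is a hypothetical name, not part of TwitterSearch, and the import fallback is there so the snippet runs on both Python 2 and Python 3.

```python
try:
    from urllib.parse import quote_plus  # Python 3
except ImportError:
    from urllib import quote_plus  # Python 2


def encode_keyword(keyword):
    """Percent-encode a keyword so punctuation is safe inside a query string.

    Hypothetical helper for the workaround; not part of TwitterSearch.
    """
    return quote_plus(keyword)


# The characters reported in this issue, before and after encoding.
for raw in ["test=", "test'", "test$", "test?", "test(", "test)", "test[", "test]"]:
    print("%s -> %s" % (raw, encode_keyword(raw)))
```

The encoded value would then be passed on, e.g. download_tweets(encode_keyword("test("), "en"). Once the library encodes keywords itself, a helper like this should be dropped again, otherwise the strings would be encoded twice.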

I just had a quick look at the source: I forgot to apply quote_plus() at line 65 in TwitterSearchOrder.py. Shame on me :p
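To illustrate the kind of fix described here, the following is a minimal sketch of encoding each keyword once, at the point where it is added to the query. SearchOrderSketch and its methods are hypothetical stand-ins written for this example; they are not the actual code of TwitterSearchOrder.py, and the real URL construction may differ.

```python
try:
    from urllib.parse import quote_plus  # Python 3
except ImportError:
    from urllib import quote_plus  # Python 2


class SearchOrderSketch(object):
    """Hypothetical stand-in for a search-order class; not the library's code."""

    def __init__(self):
        self.searchterms = []

    def add_keyword(self, word):
        # Encode on the way in, so callers can pass raw strings such as "test=".
        self.searchterms.append(quote_plus(word))

    def create_search_url(self):
        # Keywords are already encoded; join them with '+' (an encoded space).
        return "?q=" + "+".join(self.searchterms)


order = SearchOrderSketch()
order.add_keyword("test=")
order.add_keyword("foo(bar)")
print(order.create_search_url())
```

Encoding inside the class keeps the public interface unchanged: callers keep passing plain search strings and never see the percent-encoded form.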

I'll update it and write a test case so that foolish stuff like this doesn't happen again...