Scraping a Specific Language
6wom9 opened this issue · 3 comments
Hello,
Thanks for sharing this. It is very helpful for understanding the process of scraping and analyzing tweets.
I want to scrape tweets in a specific language. How should I modify the code to do this? I tried a few things but could not get it to work.
Thanks for your help in advance.
Hi,
Snscrape supports a language filter in the search query, which you can use to scrape tweets in a specific language.
You can check out this link: https://stackoverflow.com/questions/72691870/how-to-scrape-twitter-tweets-of-a-specific-language-in-python
I hope this helps.
Thanks for your reply.
`import snscrape.modules.twitter as sntwitter
# Setting variables to be used below
maxTweets = 50
language = ['en']
# Creating list to append tweet data to
tweets_list = []
# Using TwitterSearchScraper to scrape data and append tweets to list
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('fantoken').get_items()):
    if i > maxTweets:
        break
    # Keep only tweets whose language is in the list above
    if tweet.lang in language:
        tweets_list.append([tweet.date, tweet.id, tweet.content, tweet.lang, tweet.likeCount, tweet.replyCount, tweet.retweetCount, tweet.hashtags])`
I've modified the code as shown above. However, something goes wrong: the number of crawled tweets does not reach maxTweets. I think I put the language check in the wrong place.
I hope you can give me some advice.
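A likely cause, sketched here with stand-in data rather than live tweets: the index from enumerate counts every tweet the scraper yields, including the ones the language check discards, so the loop stops once maxTweets tweets have been scanned, not once maxTweets have been kept. One way around this is to filter first and cap the count afterwards. The helper name `collect_by_language` and the fake tweet objects are illustrative assumptions, not part of snscrape.

```python
from itertools import islice
from types import SimpleNamespace


def collect_by_language(items, languages, max_tweets):
    """Return up to max_tweets items whose .lang attribute is in languages.

    Hypothetical helper: `items` can be any iterable of objects with a
    .lang attribute, such as the tweets yielded by a scraper.
    """
    # Filter lazily first, so only matching items count toward the limit.
    matching = (item for item in items if item.lang in languages)
    return list(islice(matching, max_tweets))


# Stand-in data: every third fake tweet is English, the rest are Turkish.
fake_tweets = [
    SimpleNamespace(id=n, lang="en" if n % 3 == 0 else "tr")
    for n in range(30)
]

kept = collect_by_language(fake_tweets, {"en"}, 5)
print([tweet.id for tweet in kept])  # [0, 3, 6, 9, 12]
```

With the original loop shape, the same data and a limit of 5 would scan only the first few items and keep fewer English ones, which matches the shortfall described above.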
Okay, you can specify the language as part of the search query you pass to TwitterSearchScraper(), so that only tweets in that language are returned in the first place.
The link here should be helpful:
https://github.com/JustAnotherArchivist/snscrape/issues/164
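As a rough sketch of that suggestion, the language can be appended to the search term using Twitter's `lang:` search operator. This assumes the scraper passes the query string through to Twitter search unchanged. The helper names `build_query` and `scrape` are hypothetical, and `scrape` is defined but not called here because it needs snscrape installed and network access.

```python
from itertools import islice


def build_query(term, lang=None):
    """Append a lang: search operator to the term (hypothetical helper)."""
    return f"{term} lang:{lang}" if lang else term


def scrape(term, lang, max_tweets):
    """Collect up to max_tweets tweets for the term in the given language."""
    # Imported here so build_query can be tried without snscrape installed.
    import snscrape.modules.twitter as sntwitter

    scraper = sntwitter.TwitterSearchScraper(build_query(term, lang))
    # Assumption: the search itself filters by language, so every item
    # yielded counts toward the limit.
    return list(islice(scraper.get_items(), max_tweets))


print(build_query("fantoken", "en"))  # fantoken lang:en
```

Filtering in the query means the limit applies to matching tweets only, so the separate `tweet.lang` check inside the loop should no longer be needed.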