Scraping a Specific Language

Question

6wom9 opened this issue 2 years ago · 3 comments

Hello,

Thanks for sharing this. It is very helpful for understanding the process of scraping and analyzing tweets.

I want to scrape tweets in a specific language. How should I modify the code to do this? I tried a few things, but I could not get it to work.

Thanks in advance for your help.

Answer 1 · 2022-11-06T19:29:35.000Z

Hi,

snscrape lets you include a language filter in the search query, which you can use to scrape tweets in a specific language.

You can check out this link: https://stackoverflow.com/questions/72691870/how-to-scrape-twitter-tweets-of-a-specific-language-in-python

I hope this helps.
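To make the suggestion concrete, here is a minimal sketch of what such a query could look like. It assumes the language filter is Twitter's `lang:` search operator appended to the keyword; `build_query` is a hypothetical helper name used only for illustration and is not part of snscrape.

```python
def build_query(keyword: str, lang: str) -> str:
    """Append Twitter's `lang:` search operator to a keyword query.

    Hypothetical helper for illustration; not part of snscrape.
    """
    return f"{keyword} lang:{lang}"


query = build_query("fantoken", "en")
print(query)  # fantoken lang:en

# The resulting string would then be passed to the scraper, for example:
# sntwitter.TwitterSearchScraper(query).get_items()
```

With this approach the filtering happens on the search side, so the scraper should only yield tweets that Twitter has classified as being in the requested language.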

Answer 2 · 2022-11-07T18:17:38.000Z

Thanks for your reply.

```python
import snscrape.modules.twitter as sntwitter

# Setting variables to be used below
maxTweets = 50
language = ['en']

# Creating list to append tweet data to
tweets_list = []

# Using TwitterSearchScraper to scrape data and append tweets to list
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('fantoken').get_items()):
    if i > maxTweets:
        break
    if tweet.lang in language:
        tweets_list.append([tweet.date, tweet.id, tweet.content, tweet.lang, tweet.likeCount, tweet.replyCount, tweet.retweetCount, tweet.hashtags])
```

I've modified the code as shown above. However, something goes wrong: the number of scraped tweets does not reach `maxTweets`. I think I put the language check in the wrong place.

I hope you can give me some advice.

Answer 3 · 2022-11-08T12:13:01.000Z

Okay, you can specify the language as part of the input passed to TwitterSearchScraper().
The link here should be helpful:
https://github.com/JustAnotherArchivist/snscrape/issues/164
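
For readers following along, the shortfall in the Answer 2 loop likely comes from `i`: `enumerate` counts every tweet the scraper yields, including the ones the language check discards, so the loop stops after roughly `maxTweets` tweets seen rather than `maxTweets` tweets kept. A minimal sketch of counting only the kept tweets is below. `collect_by_language` is a hypothetical helper, and the sample data is made-up stand-in objects with a `.lang` attribute so the snippet runs without network access.

```python
from types import SimpleNamespace


def collect_by_language(tweets, languages, max_tweets):
    """Keep tweets whose .lang is in `languages`, stopping once max_tweets are kept.

    Hypothetical helper for illustration; works on any iterable of objects
    that expose a `.lang` attribute.
    """
    kept = []
    for tweet in tweets:
        # Stop based on how many tweets were kept, not how many were seen.
        if len(kept) >= max_tweets:
            break
        if tweet.lang in languages:
            kept.append(tweet)
    return kept


# Stand-in data (not real tweets): even ids are "en", odd ids are "tr".
sample = [SimpleNamespace(id=n, lang="en" if n % 2 == 0 else "tr") for n in range(10)]

english = collect_by_language(sample, {"en"}, max_tweets=3)
print([t.id for t in english])  # [0, 2, 4]
```

In the original script, the same function could be fed the scraper's generator instead of `sample`, for example `collect_by_language(sntwitter.TwitterSearchScraper('fantoken lang:en').get_items(), {'en'}, 50)`. Putting the language in the query, as Answer 3 suggests, means the in-loop check mostly acts as a safety net.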