/pcrawler

Persian Twitter crawler

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

PCrawler

This is a fork of the awesome Trenditter project, modified to crawl and preserve tweets in Persian.

TODO

  • Better language identification to differentiate between Persian and Arabic tweets; at the moment, Twitter's language detection cannot reliably tell the two apart.
  • Detect Twitter bots and remove their contributions (tweets, retweets).
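
As a rough illustration of the first TODO item, one possible starting point is a character-based heuristic. This is a minimal sketch, not something PCrawler or Trenditter implements: the function name `guess_language` and both letter sets are assumptions chosen for illustration. It relies on the fact that Persian uses a few letters absent from standard Arabic orthography (پ, چ, ژ, گ, and the Persian forms of kaf and yeh), while Arabic text commonly uses ة, ك, ي and ى. Short or mixed tweets will often be ambiguous, so the function returns "unknown" rather than guessing.

```python
# Hypothetical heuristic, not part of PCrawler: count script-specific letters.

# Letters typical of Persian: peh, tcheh, jeh, gaf, Persian kaf (keheh), Persian yeh.
PERSIAN_ONLY = set("\u067e\u0686\u0698\u06af\u06a9\u06cc")

# Letters typical of Arabic: teh marbuta, Arabic kaf, Arabic yeh, alef maksura.
ARABIC_ONLY = set("\u0629\u0643\u064a\u0649")


def guess_language(text):
    """Return "fa", "ar", or "unknown" based on script-specific letter counts."""
    persian = sum(ch in PERSIAN_ONLY for ch in text)
    arabic = sum(ch in ARABIC_ONLY for ch in text)
    if persian > arabic:
        return "fa"
    if arabic > persian:
        return "ar"
    # No distinguishing letters, or a tie: do not guess.
    return "unknown"


if __name__ == "__main__":
    print(guess_language("\u067e\u06cc\u0627\u0645"))        # "payam" written with Persian letters
    print(guess_language("\u0643\u062a\u0627\u0628\u064a"))  # "kitabi" written with Arabic letters
    print(guess_language("hello"))
```

A heuristic like this would probably work best as a second check on tweets that Twitter has already labelled as Persian or Arabic, since some users type Persian on Arabic keyboard layouts and vice versa.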