TweetAndGo

A web scraper that persists tweets to a database, to be used as a base for other projects.

Primary language: PHP

  • An archival program that processes tweets, with the aim of exploring semantic analysis and chatbot concepts.
  • Partially circumvents Twitter API rate and volume limits to fetch and persist up to ~3000 tweets per account.

Features

  • Avoids spamming HTTP requests by inserting randomized delays between them.
  • Uses Doctrine ORM to handle database interactions.
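
The randomized-delay idea can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `randomDelayMicros` and `fetchWithDelays`, the 1 to 3 second default range, and the `$fetch` callable are all assumptions made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Pick a random delay, in microseconds, within [$minSeconds, $maxSeconds].
 * Hypothetical helper; the bounds are illustrative assumptions.
 */
function randomDelayMicros(float $minSeconds, float $maxSeconds): int
{
    if ($minSeconds < 0 || $maxSeconds < $minSeconds) {
        throw new InvalidArgumentException('Expected 0 <= min <= max.');
    }
    $minMicros = (int) round($minSeconds * 1000000);
    $maxMicros = (int) round($maxSeconds * 1000000);

    return random_int($minMicros, $maxMicros);
}

/**
 * Call $fetch once per URL, pausing for a random interval between calls.
 * $fetch is a placeholder for whatever performs the real HTTP request.
 */
function fetchWithDelays(
    array $urls,
    callable $fetch,
    float $minSeconds = 1.0,
    float $maxSeconds = 3.0
): array {
    $results = [];
    $first = true;

    foreach ($urls as $url) {
        if (!$first) {
            $micros = randomDelayMicros($minSeconds, $maxSeconds);
            // Split the pause: usleep() may not support values above one second.
            sleep(intdiv($micros, 1000000));
            usleep($micros % 1000000);
        }
        $first = false;
        $results[$url] = $fetch($url);
    }

    return $results;
}
```

Passing the request logic in as a callable keeps the pacing separate from the HTTP client, so the delay range can be tuned without touching the scraping code.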

Demo

  • The current implementation can parse up to ~3000 tweets for any account in under 10 minutes. Twitter currently hard-caps this approach at roughly 3000 tweets. The run time can still be reduced. [image]
  • Here is a quick view of this program's results. [image]

TODOs

  • Ensure adherence to PSR code conventions (PSR-1 through PSR-4?).
  • Refactor to use Laravel and get rid of Doctrine.
  • Refactor using the web scraper design pattern.
  • Fix file path strings and settings so that the code can be deployed easily.
  • Implement and deploy automated testing to ensure the algorithm keeps working.
  • Automate fetching of accounts of interest.
  • The end goal of this project is to explore text semantic analysis and chatbot concepts I have in mind.