Tenvolin/TweetAndGo

A web scraper that persists tweets to a database, to be used as a base for other projects.

PHP

TweetAndGo

An archival program that fetches tweets and persists them to a database, with the aim of exploring semantic analysis and chatbot concepts.
It partially circumvents Twitter API rate and volume limits to fetch and persist up to ~3000 tweets per account.

Features

Avoids spamming HTTP requests by inserting randomized delays between them.
Uses Doctrine ORM for database interactions.
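
The randomized-delay idea can be sketched roughly as follows. This is a minimal illustration, not the code from this repository: the function names randomDelayMicros and fetchAllPolitely, the 500-2000 ms default range, and the injectable $sleep callback are all assumptions made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Pick a random pause, in microseconds, between $minMs and $maxMs milliseconds.
 * (Hypothetical helper; the bounds are illustrative, not taken from the project.)
 */
function randomDelayMicros(int $minMs, int $maxMs): int
{
    if ($minMs < 0 || $maxMs < $minMs) {
        throw new InvalidArgumentException('Expected 0 <= minMs <= maxMs');
    }
    return random_int($minMs, $maxMs) * 1000;
}

/**
 * Call $fetch once per URL and pause for a random interval between calls,
 * so requests are not sent in a tight loop.
 * $sleep defaults to usleep() and can be swapped out in tests.
 */
function fetchAllPolitely(
    array $urls,
    callable $fetch,
    int $minMs = 500,
    int $maxMs = 2000,
    ?callable $sleep = null
): array {
    $sleep ??= 'usleep';
    $results = [];
    $last = array_key_last($urls);

    foreach ($urls as $i => $url) {
        $results[$url] = $fetch($url);
        // No pause is needed after the final request.
        if ($i !== $last) {
            $sleep(randomDelayMicros($minMs, $maxMs));
        }
    }
    return $results;
}
```

In a real run, $fetch would wrap whatever HTTP client the scraper uses, and the delay range would be tuned to what the target site tolerates.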

Demo

The current implementation can parse up to ~3000 tweets for any account in under 10 minutes. Twitter currently hard-caps this approach at roughly 3000 tweets per account. The run time could still be reduced.
Here is a quick view of this program's results.

TODOs

Ensure adherence to PSR code conventions (PSR-1 to PSR-4?).
Refactor to use Laravel and remove Doctrine.
Refactor using the web scraper design pattern.
Fix file path strings and settings so that the code can be deployed easily.
Implement and deploy automated testing to ensure the algorithm keeps working.
Automate fetching of accounts of interest.
The end goal of this project is to explore text semantic analysis and chatbot concepts I have in mind.