tweet-collector: A Python repository from sjwhitmore

Tweet Collector

Initial stages -- just collects tweets pertaining to a given topic and stores them in a MongoDB database.

Next steps -- do some cool analysis.

Run by entering "node testnodetwitter.js" on the command line.

Remove words beginning with "@" (mentions) and URLs; delete "#" from hashtags
Use an emoticon dict to link emoticons with various levels of sentiment (http://en.wikipedia.org/wiki/List_of_emoticons)
Use an abbreviation dict to replace words like "lol" and "gr8" with their written-out versions (http://noslang.com)
Filter out "stop words" (those commonly ignored by search engines) (http://www.webconfs.com/stop-words.php)
Collapse repeated character sequences to 3 characters, e.g. "coooooool" becomes "coool", to standardize yet also retain emphasis.
Link negation words with the words that follow them, e.g. in "isn't good", "good" should be replaced by "NOT_good"
run the
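
The preprocessing steps listed above could be sketched in Python roughly as follows. This is only an illustration, not the repository's actual code: the function name `preprocess`, the tag names `EMO_POS` and `EMO_NEG`, and the tiny dictionaries are all hypothetical stand-ins for the full emoticon, abbreviation, and stop-word lists from the sources linked above.

```python
import re

# Hypothetical, deliberately tiny lookup tables; real ones would be built
# from the emoticon, abbreviation, and stop-word sources linked above.
EMOTICONS = {":)": "EMO_POS", ":(": "EMO_NEG"}
ABBREVIATIONS = {"lol": "laughing out loud", "gr8": "great"}
STOP_WORDS = {"the", "a", "is", "this", "so", "to", "and"}
NEGATIONS = {"not", "no", "never", "isn't", "don't", "can't"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
REPEAT_RE = re.compile(r"(.)\1{2,}")  # any character repeated 3 or more times


def preprocess(tweet):
    """Return a list of normalized tokens for one tweet (illustrative sketch)."""
    # Remove URLs and mentions, and drop "#" but keep the hashtag word.
    text = URL_RE.sub(" ", tweet)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", "")

    tokens = []
    for raw in text.split():
        # Map known emoticons to a sentiment tag before punctuation is stripped.
        if raw in EMOTICONS:
            tokens.append(EMOTICONS[raw])
            continue
        word = raw.lower().strip('.,!?;:"')
        if not word:
            continue
        # Expand abbreviations; an expansion may contain several words.
        tokens.extend(ABBREVIATIONS.get(word, word).split())

    # Collapse runs of 3+ identical characters down to exactly 3.
    tokens = [REPEAT_RE.sub(r"\1\1\1", t) for t in tokens]

    # Attach each negation to the next kept word, dropping stop words on the way.
    result = []
    negate = False
    for token in tokens:
        if token in NEGATIONS:
            negate = True
            continue
        if token in STOP_WORDS:
            continue
        result.append("NOT_" + token if negate else token)
        negate = False
    return result


print(preprocess("@bob this movie isn't good :( coooooool #fail http://t.co/abc lol"))
```

In this sketch negation is handled in the same pass as stop-word filtering, so that a negation word which also appears in a stop-word list is not discarded before it can mark the following word.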
