realtime Twitter trending hashtags computation using RedStorm / Storm

Tweitgeist v1.2.0

Tweitgeist analyses the Twitter Spritzer hose and computes the top trending hashtags in realtime using RedStorm/Storm. Besides being a cool Storm example, it is interesting because the same architecture will work at full Twitter Firehose scale without much modification.

There are three components:

  • The Twitter Spritzer stream reader, which pushes messages into a Redis queue
  • The RedStorm analyser, which reads the Twitter stream queue, computes the trending hashtags and outputs the top N list to a Redis queue every 5 seconds
  • The viewer UI for the visualization
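The data flow between the first two components can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the project's actual code: `InMemoryQueue`, `HashtagRanker` and `extract_hashtags` are hypothetical names, the in-memory queue stands in for the Redis queues, and the JSON message shape with a `"text"` field is assumed. The sketch also ignores the 5 second output interval and simply ranks everything it has seen.

```ruby
require "json"

# Hypothetical stand-in for a Redis list: push on one end, pop from the other.
class InMemoryQueue
  def initialize
    @items = []
  end

  def push(message)
    @items.unshift(message)
  end

  # Returns nil when the queue is empty.
  def pop
    @items.pop
  end
end

# Extracts hashtags from tweet text, lowercased and without the leading '#'.
def extract_hashtags(text)
  text.scan(/#(\w+)/).flatten.map(&:downcase)
end

# Counts hashtags and reports the N most frequent ones.
class HashtagRanker
  def initialize
    @counts = Hash.new(0)
  end

  def add(text)
    extract_hashtags(text).each { |tag| @counts[tag] += 1 }
  end

  # Highest count first; ties are broken alphabetically so the output is stable.
  def top(n)
    @counts.sort_by { |tag, count| [-count, tag] }.first(n)
  end
end

# Reader side: push raw messages into the tweet queue (assumed JSON shape).
tweets = InMemoryQueue.new
["Learning #Storm with #ruby", "#storm is fast", "hello #redis #Storm"].each do |text|
  tweets.push(JSON.generate("text" => text))
end

# Analyser side: drain the tweet queue and update the counts.
ranker = HashtagRanker.new
while (message = tweets.pop)
  ranker.add(JSON.parse(message)["text"])
end

# Publish the top N list to a second queue for the viewer to consume.
results = InMemoryQueue.new
results.push(JSON.generate(ranker.top(2)))
```

Decoupling the reader from the analyser through a queue is what lets each side be scaled or restarted on its own; in a real deployment the ranking step would run inside the Storm topology rather than in a single loop.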

Dependencies

This has been tested on OSX 10.6+ and Ubuntu Linux 11.10 & 12.04, using JRuby 1.6.x for the RedStorm topology and Ruby 1.9.x for the Twitter Spritzer hose reader.

Installation

  • Redis is required
  • RVM is highly recommended, as you will need to work with both Ruby and JRuby and with different gemsets.

RedStorm backend

  • requires JRuby 1.6.x

  • set JRuby in 1.9 mode by default

    $ export JRUBY_OPTS=--1.9
  • install the RedStorm gem using bundler with the supplied Gemfile

    $ bundle install
  • run RedStorm installation

    $ bundle exec redstorm install
  • package the topology required gems

    $ bundle exec redstorm bundle topology
  • if you plan on running the topology on a cluster, package the topology jar

    $ bundle exec redstorm jar lib/tweitgeist/
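Before running the RedStorm commands, it can help to confirm that the active interpreter really is JRuby in 1.9 mode. The following is a minimal sketch, assuming that JRuby reports `"java"` as `RUBY_PLATFORM` and a `1.9.x` value for `RUBY_VERSION` when `JRUBY_OPTS=--1.9` is set; `redstorm_env_problems` is a hypothetical helper, not part of RedStorm.

```ruby
# Returns a list of human-readable problems; an empty list means the
# interpreter looks like JRuby running in 1.9 mode.
def redstorm_env_problems(platform = RUBY_PLATFORM, version = RUBY_VERSION)
  problems = []
  problems << "not running on JRuby (RUBY_PLATFORM is #{platform})" unless platform == "java"
  problems << "Ruby language level is #{version}, expected 1.9.x" unless version.start_with?("1.9")
  problems
end

problems = redstorm_env_problems
if problems.empty?
  puts "JRuby in 1.9 mode detected"
else
  problems.each { |problem| puts "warning: #{problem}" }
end
```

Run it with the same shell and RVM gemset you intend to use for the topology, so that it sees the same `JRUBY_OPTS`.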

Twitter Spritzer stream reader

  • requires Ruby 1.9.x

  • install required gems using bundler with the supplied Gemfile

    $ bundle install

Viewer

  • requires Node.js

    $ sudo apt-get install nodejs
  • requires npm

    $ sudo apt-get install npm
  • install CoffeeScript if you want to modify the Node.js server

    $ npm install -g coffee-script
  • install other dependencies

    $ cd lib/viewer
    $ npm install .

Usage overview

RedStorm backend

  • requires JRuby 1.6.x

  • set JRuby in 1.9 mode by default

    $ export JRUBY_OPTS=--1.9

RedStorm backend in local mode:

    $ bundle exec redstorm local lib/tweitgeist/storm/tweitgeist_topology.rb

RedStorm backend in remote cluster mode:

    $ bundle exec redstorm cluster lib/tweitgeist/storm/tweitgeist_topology.rb

Twitter Spritzer stream reader

  • requires Ruby 1.9.x

  • edit config/twitter_reader.rb to add your credentials

    $ ruby lib/tweitgeist/twitter/twitter_reader.rb

Viewer

    $ coffee server.coffee --port 8080 --host 127.0.0.1 --redis-port 6379 --redis-host 127.0.0.1

or, with simulated data when Redis is not available:

    $ coffee server.coffee --port 8080 --host 127.0.0.1 --mock

Author

Colin Surprenant, @colinsurprenant, https://github.com/colinsurprenant, colin.surprenant@gmail.com

Contributors

Francois Lafortune, @quickredfox, https://github.com/quickredfox, code@quickredfox.at

Nicholas Brochu, @nbrochu, https://github.com/nbrochu, info@nicholasbrochu.com

License

Tweitgeist is distributed under the Apache License, Version 2.0.