- vagrant
- virtualbox
- ruby/gem
- librarian
- git
This environment requires vagrant and librarian to be installed so that the proper cookbooks can be downloaded.
- install virtualbox
- install vagrant
- install ruby and gem
gem install librarian
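Once everything is installed, it can help to confirm that each tool is actually on your PATH. The following is a minimal Python sketch, not part of this repository: missing_tools is a hypothetical helper, and the executable names (VBoxManage for virtualbox, librarian-chef for librarian) are assumptions about a typical install that may differ on your platform.

```python
import shutil

# Assumed executable names for each requirement; adjust for your platform.
REQUIRED_TOOLS = ["vagrant", "VBoxManage", "ruby", "gem", "librarian-chef", "git"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```

An empty result only means the executables were found; it says nothing about their versions.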
The vagrant up command will bring up the environment and install all the necessary software, including the Python interpreter, luigi, the Java interpreter, and the CDH4 distribution of hadoop.
Be sure to give your VM at least 1GB of memory so that memory isn't exhausted.
git clone git@github.com:IanLewis/luigi-demo.git
cd luigi-demo
librarian-chef install
vagrant up
The install can take quite a while, so I would get a cup of coffee after running vagrant up for the first time.
After you get the VM up and running you should be able to view the luigi visualization tool at http://192.168.30.10:8082/
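If the page does not load, a quick way to tell whether the VM is reachable at all is to test the TCP port directly. This is a rough Python sketch under the assumption that the host and port are exactly those in the URL above; port_is_open is a hypothetical helper, and a successful connection only shows that something is listening, not that luigi itself is healthy.

```python
import socket

# Host and port taken from the URL above.
LUIGI_HOST = "192.168.30.10"
LUIGI_PORT = 8082


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__":
    if port_is_open(LUIGI_HOST, LUIGI_PORT):
        print("The luigi visualizer port is accepting connections.")
    else:
        print("Could not connect; the VM may still be provisioning.")
```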
You can update the environment once it's been installed. Generally you won't have to do this, but if you need to make changes to the provided cookbooks, you can run these commands to update your environment.
cd luigi-demo
librarian-chef update
vagrant provision
First log into your VM using vagrant ssh. You can then run the luigi demos from the /var/vagrant_data/ directory. The following example can be run without hadoop and simply counts the words in a file.
$ cd /var/vagrant_data/
$ python bin/top_artists.py WordCount --date-interval 2013-10-29.txt --local-scheduler
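To give a feel for what a word-count job computes, here is a small standalone Python sketch. It is not the code in bin/top_artists.py and does not use luigi; count_words is a hypothetical helper, and splitting on whitespace is an assumption about how words are delimited.

```python
from collections import Counter


def count_words(lines):
    """Count whitespace-separated words across an iterable of text lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


if __name__ == "__main__":
    sample = ["the quick brown fox", "jumps over the lazy dog"]
    # Print the three most frequent words with their counts.
    for word, count in count_words(sample).most_common(3):
        print(word, count)
```

In a luigi task, logic like this would typically live in the task's run step, with the input and output files declared separately.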
You can run an example hadoop job like this:
$ cd /var/vagrant_data/
$ python bin/top_artists.py Top10Artists --date-interval 2012-08 --workers 3 --use_hadoop