
A delicious, configurable Hypertable build with AWS EMR and MapR.



1. Set up your build environment:

  • Install AWS CLI
  • Configure AWS CLI with a profile
  • Install Ansible: http://docs.ansible.com/intro_installation.html
  • Set ansible env: source ansible env
  • Turn host key checking off: export ANSIBLE_HOST_KEY_CHECKING=False
  • Ansible needs Jinja2: apt-get install python-jinja2
  • If your ssh is old (e.g. 5.5): export ANSIBLE_SSH_ARGS=""
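The two exports above could be collected into a small file that is sourced before running any playbook. This is only a sketch: the helper name ssh_is_old is hypothetical, and the 5.6 cutoff is an assumption, based on ControlPersist (which Ansible's default ssh arguments rely on) having been introduced in OpenSSH 5.6.

```shell
# Succeeds (exit 0) when an `ssh -V` banner reports a version older than 5.6.
# ssh_is_old is a hypothetical helper, not part of this repo.
ssh_is_old() {
    banner=$1
    # Pull "major minor" out of a banner such as "OpenSSH_5.5p1 Debian-6, ..."
    version=$(printf '%s\n' "$banner" |
        sed -n 's/^OpenSSH_\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2/p')
    # An unrecognised banner is treated as "not old".
    [ -n "$version" ] || return 1
    # Split "major minor" into $1 and $2.
    set -- $version
    [ "$1" -lt 5 ] || { [ "$1" -eq 5 ] && [ "$2" -lt 6 ]; }
}

# Skip host key prompts for freshly created cluster nodes.
export ANSIBLE_HOST_KEY_CHECKING=False

# Clear Ansible's ssh arguments only when the local client looks old.
# `ssh -V` writes its banner to stderr, hence the redirect.
if ssh_is_old "$(ssh -V 2>&1)"; then
    export ANSIBLE_SSH_ARGS=""
fi
```

The file has to be sourced (not executed) for the exports to reach the shell that later runs ansible-playbook.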

2. Edit build vars in /group_vars/build

  • profile: usp
  • name: panda
  • type: m1.large
  • masters: 1
  • cores: 1
  • tasks: 3
  • log_uri: s3://pandahard/
  • mapr_edition: m3
  • mapr_version: 3.1.1
  • ami_version: 2.4.2
  • hypertable_version:
  • data: vol-e1141fe5
  • newrelic: 9910108157933df624056a0f5c26f19df6090a28
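As a quick sanity check before a build, a value can be read back from the vars file. The sketch below assumes the file holds flat "key: value" lines like the ones listed above; get_build_var is a hypothetical helper, not something shipped with this repo.

```shell
# Print the value of a top-level "key: value" entry in a vars file.
# get_build_var is a hypothetical helper; it assumes flat, unindented keys.
get_build_var() {
    key=$1
    file=$2
    # Strip "key:" plus any following whitespace and print what remains.
    # Only the first match is used.
    sed -n "s/^${key}:[[:space:]]*//p" "$file" | head -n 1
}

# Example (run from app root):
#   get_build_var name group_vars/build
```

An unset entry such as hypertable_version prints an empty string, which makes it easy to spot values that still need filling in.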

3. Hypertable Playbooks

Commands are run from app root.

Playbook: Build hypertable

  • Command: ansible-playbook -v -i build build.yml
  • Notes: Build can take 20+ minutes depending on cluster size.

Playbook: Test hypertable build

  • Command: ansible-playbook -v -i builds/{{id}}/{{name}}.hosts test.yml
  • Notes: Creates namespace, loads schema, loads table and then backs it up.

Playbook: Load all data

  • Command: ansible-playbook -v -i builds/{{id}}/{{name}}.hosts load.yml
  • Notes: Uses an EBS-attached mount where import scripts and data live.
  • Mount point: /data
  • Path to data: /data/panda/source/$table_name
  • Path to scripts: /data/panda/source/$table_name/load.hql

Playbook: Backup all data

  • Command: ansible-playbook -v -i builds/{{id}}/{{name}}.hosts backup.yml
  • Notes: Uses an EBS-attached mount where import scripts and data live.
  • Path: /data/panda/backups

Playbook: Clean all data

  • Command: ansible-playbook -v -i builds/{{id}}/{{name}}.hosts clean.yml
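The post-build playbooks all share the same inventory pattern, so the command line can be assembled from the build id, the cluster name and the playbook name. The sketch below only prints the command; inventory_path and playbook_command are hypothetical helpers, and it assumes {{id}} and {{name}} are plain strings you already know from the build.

```shell
# Build the inventory path used by the test/load/backup/clean playbooks.
# Both helpers are hypothetical and only format strings; nothing is executed.
inventory_path() {
    # $1 = build id, $2 = cluster name (the "name" build var)
    printf 'builds/%s/%s.hosts\n' "$1" "$2"
}

# Print the full ansible-playbook command for a given playbook name.
playbook_command() {
    # $1 = build id, $2 = cluster name, $3 = playbook (test, load, backup, clean)
    printf 'ansible-playbook -v -i %s %s.yml\n' "$(inventory_path "$1" "$2")" "$3"
}

# Example, with a made-up build id of 42:
#   playbook_command 42 panda backup
```

Printing rather than running the command makes it easy to review the inventory path before touching the cluster.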

4. Monitoring

  • Hypertable monitoring is available at panda_masters[0].public_ip:15860
  • New Relic is available if you entered a licence key or used the one in this repo, which is totally OK.


  1. How do I increase debug output when running playbooks? For maximum Ansible debug verbosity use "-vvvv". The more v's, the more debug output.

  2. What if I don't want to use New Relic? If you don't have a New Relic key or don't want to use New Relic, comment out the New Relic roles in /build.yml on lines 21, 35 and 51.
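Commenting out specific lines can be scripted with sed. This is a sketch only: comment_out_lines is a hypothetical helper, and the line numbers above may have drifted if build.yml has changed, so check the lines before running it.

```shell
# Prefix the given line numbers of a file with "#", keeping a .bak copy.
# comment_out_lines is a hypothetical helper, not part of this repo.
comment_out_lines() {
    file=$1
    shift
    # Keep the untouched original next to the edited file.
    cp "$file" "$file.bak"
    # Build one sed script such as "21s/^/#/;35s/^/#/;51s/^/#/".
    script=""
    for n in "$@"; do
        script="${script:+$script;}${n}s/^/#/"
    done
    # Read from the backup and overwrite the original in one pass.
    sed "$script" "$file.bak" > "$file"
}

# Example (run from app root, after checking the line numbers):
#   comment_out_lines build.yml 21 35 51
```

Running diff build.yml.bak build.yml afterwards shows exactly which lines were commented out.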


  1. Support Ganglia Monitoring
  2. Support Hadoop EMR Builds
  3. Needs more tests