/cookbook-elasticsearch

Chef cookbook for ElasticSearch

Description

This cookbook installs and configures the elasticsearch search engine/database.

It requires a working Java installation on the target node; add your preferred java cookbook to the node run_list.

The cookbook downloads the elasticsearch tarball from GitHub (via the ark provider), unpacks it, and moves it to the directory specified in the node configuration (/usr/local/elasticsearch by default).

It installs a service which enables you to start, stop, restart and check status of the elasticsearch process.

If your node has the monit recipe in its run_list, the cookbook will also create a configuration file for Monit, which checks whether elasticsearch is running, whether it is reachable over HTTP, and whether the cluster is in the “green” state.

If you include the elasticsearch::plugin_aws recipe, the AWS Cloud Plugin will be installed on the node, allowing you to use the Amazon AWS features: node auto-discovery and S3 gateway. You may set your AWS credentials either in the “elasticsearch/aws” data bag, or directly in the node configuration.

You may want to include the elasticsearch::proxy_nginx recipe, which will configure Nginx as a reverse proxy for elasticsearch, so you can access it remotely with HTTP authentication. (Be sure to include an Nginx cookbook in your node setup in this case.)

The cookbook also provides a test case in the files/default/tests/minitest/ directory, which can be executed as a part of the Chef run (via the Minitest Chef Handler support). It checks the basic installation mechanics, populates the test_chef_cookbook index with some sample data and performs a simple search.

Usage

Include the elasticsearch recipe in the run_list of a node. Then, upload the cookbook to the Chef server:

    knife cookbook upload elasticsearch
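
If you keep node configuration in JSON (for example to paste into knife node edit or to pass to chef-solo with -j), the run_list can be assembled with a few lines of Ruby. This is only a sketch: the helper name is hypothetical, "recipe[java]" stands in for whichever Java cookbook you prefer, and the ordering of the optional recipes is an assumption rather than something the cookbook prescribes.

```ruby
require 'json'

# Build a run_list containing the elasticsearch recipe plus the optional
# recipes named in this README. "recipe[java]" is a placeholder for your
# preferred Java cookbook (an assumption, not a requirement of this name).
def elasticsearch_run_list(aws: false, proxy: false, monit: false)
  list = ['recipe[java]']
  list << 'recipe[monit]' if monit
  list << 'recipe[elasticsearch]'
  list << 'recipe[elasticsearch::plugin_aws]' if aws
  # The proxy recipe expects an Nginx cookbook to be present as well.
  list << 'recipe[nginx]' << 'recipe[elasticsearch::proxy_nginx]' if proxy
  list
end

# Print a node JSON document with the proxy enabled.
puts JSON.pretty_generate('run_list' => elasticsearch_run_list(proxy: true))
```

The printed JSON contains only the run_list key; any attributes you want to override would be added alongside it.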

To enable the Amazon AWS related features, include the elasticsearch::plugin_aws recipe. You will need to configure the AWS credentials, bucket names, etc.

You may do that in the node configuration (with knife node edit MYNODE or in the Chef Server console), but it is arguably more convenient to store the information in an "elasticsearch" data bag:

    mkdir -p ./data_bags/elasticsearch
    echo '{ 
      "id" : "aws",
      "discovery" : { "type": "ec2" },

      "gateway" : {
        "type"    : "s3",
        "s3"      : { "bucket": "YOUR BUCKET NAME" }
      },

      "cloud"   : {
        "aws"     : { "access_key": "YOUR ACCESS KEY", "secret_key": "YOUR SECRET ACCESS KEY" },
        "ec2"     : { "security_group": "elasticsearch" }
      }
    }' > ./data_bags/elasticsearch/aws.json
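
As an alternative to the echo command, the same item can be generated from Ruby, which avoids quoting mistakes and always produces valid JSON. The sketch below mirrors the keys of the example above; the helper name and the ES_* environment variable names are hypothetical and only serve the illustration.

```ruby
require 'json'

# Build the "aws" data bag item with the same structure as the shell example.
def aws_data_bag_item(bucket:, access_key:, secret_key:, security_group: 'elasticsearch')
  {
    'id'        => 'aws',
    'discovery' => { 'type' => 'ec2' },
    'gateway'   => { 'type' => 's3', 's3' => { 'bucket' => bucket } },
    'cloud'     => {
      'aws' => { 'access_key' => access_key, 'secret_key' => secret_key },
      'ec2' => { 'security_group' => security_group }
    }
  }
end

# Read the values from (hypothetical) environment variables, falling back
# to the placeholders used in this README, and print the item as JSON.
item = aws_data_bag_item(
  bucket:     ENV.fetch('ES_S3_BUCKET', 'YOUR BUCKET NAME'),
  access_key: ENV.fetch('ES_AWS_ACCESS_KEY', 'YOUR ACCESS KEY'),
  secret_key: ENV.fetch('ES_AWS_SECRET_KEY', 'YOUR SECRET ACCESS KEY')
)
puts JSON.pretty_generate(item)
```

Redirect the output of the script to ./data_bags/elasticsearch/aws.json to get the same file as above.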

Do not forget to upload the data bag to the Chef server:

    knife data bag from file elasticsearch aws.json

Usually, you will restrict access to elasticsearch with firewall rules. However, it's convenient to be able to connect to the elasticsearch cluster from curl or an HTTP client, or to use a management tool such as BigDesk or Paramedic.

To enable authorized access to elasticsearch, you need to include the elasticsearch::proxy_nginx recipe, which will install, configure and run Nginx as a reverse proxy, allowing users with proper credentials to connect.

As with AWS, you may store the usernames and passwords in the node configuration or in a data bag item:

    mkdir -p ./data_bags/elasticsearch
    echo '{
      "id" : "users",
      "users" : [
        {"username" : "USERNAME", "password" : "PASSWORD"},
        {"username" : "USERNAME", "password" : "PASSWORD"}
      ]
    }
    ' > ./data_bags/elasticsearch/users.json
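
Because a malformed item is easy to produce by hand, it may be worth checking the file before uploading it. The following sketch validates the shape used in the example above (an "id" of "users" and a list of username/password pairs). The checks reflect that example only; they are not a schema defined by the cookbook, and the function name is hypothetical.

```ruby
require 'json'

# Return a list of problems found in a "users" data bag item.
# An empty list means the item has the same shape as the README example.
def users_item_problems(json_text)
  item = JSON.parse(json_text)
  return ['top level must be a JSON object'] unless item.is_a?(Hash)

  problems = []
  problems << 'id must be "users"' unless item['id'] == 'users'
  users = item['users']
  unless users.is_a?(Array) && !users.empty?
    return problems << '"users" must be a non-empty array'
  end

  users.each_with_index do |user, i|
    %w[username password].each do |key|
      value = user.is_a?(Hash) ? user[key] : nil
      problems << "users[#{i}] is missing #{key}" if value.to_s.empty?
    end
  end
  problems
rescue JSON::ParserError => e
  ["invalid JSON: #{e.message}"]
end

# When a file name is given on the command line, validate that file.
if ARGV.first
  problems = users_item_problems(File.read(ARGV.first))
  problems.each { |problem| warn problem }
  exit(problems.empty? ? 0 : 1)
end
```

Run it with the path to users.json as the only argument; a non-zero exit status means at least one problem was printed.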

Again, do not forget to upload the data bag to the Chef server:

    knife data bag from file elasticsearch users.json

After you have configured the node and uploaded all the information to the Chef server, run chef-client on the node(s):

    knife ssh 'name:elasticsearch*' 'sudo chef-client'

Testing with Vagrant

The cookbook comes with a Vagrantfile, allowing you to test-drive the installation and configuration with Vagrant, a tool for building virtualized development infrastructures.

First, make sure you have both VirtualBox and Vagrant installed.

Then, clone this repository into an elasticsearch directory on your development machine:

    git clone https://github.com/karmi/cookbook-elasticsearch.git elasticsearch

Switch to the cloned repository:

    cd elasticsearch

Install the necessary gems:

    bundle install

You need to download the required third-party cookbooks (unless you already have them in ~/cookbooks).

The easiest way is to use the bundled Berkshelf support:

    berks install --shims ./tmp/cookbooks

Of course, you can install the cookbooks manually as well:

    curl -# -L -k http://s3.amazonaws.com/community-files.opscode.com/cookbook_versions/tarballs/1184/original/apt.tgz   | tar xz -C tmp/cookbooks
    curl -# -L -k http://s3.amazonaws.com/community-files.opscode.com/cookbook_versions/tarballs/1421/original/java.tgz  | tar xz -C tmp/cookbooks
    curl -# -L -k http://s3.amazonaws.com/community-files.opscode.com/cookbook_versions/tarballs/1098/original/vim.tgz   | tar xz -C tmp/cookbooks
    curl -# -L -k http://s3.amazonaws.com/community-files.opscode.com/cookbook_versions/tarballs/1413/original/nginx.tgz | tar xz -C tmp/cookbooks
    curl -# -L -k http://s3.amazonaws.com/community-files.opscode.com/cookbook_versions/tarballs/915/original/monit.tgz  | tar xz -C tmp/cookbooks
    curl -# -L -k http://s3.amazonaws.com/community-files.opscode.com/cookbook_versions/tarballs/1631/original/ark.tgz   | tar xz -C tmp/cookbooks

The Vagrantfile supports three Linux boxes so far:

  • Ubuntu Lucid 32 bit
  • Ubuntu Lucid 64 bit
  • CentOS 6 32 bit

Use the vagrant status command for more information.

We will use the Ubuntu Lucid 64 box for the purpose of this demo. You may want to test-drive this cookbook on a different distribution; check out the available boxes at http://vagrantbox.es.

Launch the virtual machine with Vagrant (it will download the box unless you already have it):

    time vagrant up lucid64

The machine will be started and automatically provisioned with chef-solo.

You'll see Chef debug messages flying by in your terminal, downloading, installing and configuring Java, Nginx, elasticsearch, and all the other components. The process should take about 15 minutes on a reasonable machine and internet connection.

After the process is done, you may connect to elasticsearch via the Nginx proxy from the outside:

    curl 'http://USERNAME:PASSWORD@33.33.33.10:8080/test_chef_cookbook/_search?pretty&q=*'
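
The same authenticated request can be made from Ruby with the standard library. This is a minimal sketch, assuming the proxy address and index name from the curl example; the function names are hypothetical, and the script only contacts the server when a username and password are passed as arguments.

```ruby
require 'net/http'
require 'uri'

# Build a GET request for a search on the given index, with HTTP Basic
# credentials attached. Nothing is sent until the request is executed.
def search_request(base_url, index, username, password, query = '*')
  uri = URI.join(base_url, "/#{index}/_search")
  uri.query = URI.encode_www_form(pretty: 'true', q: query)
  request = Net::HTTP::Get.new(uri)
  request.basic_auth(username, password)
  [uri, request]
end

# Send the request through the Nginx proxy and return the response.
def search(base_url, index, username, password, query = '*')
  uri, request = search_request(base_url, index, username, password, query)
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
end

# Usage: ruby search.rb USERNAME PASSWORD
if $PROGRAM_NAME == __FILE__ && ARGV.length == 2
  response = search('http://33.33.33.10:8080', 'test_chef_cookbook', *ARGV)
  puts response.code
  puts response.body
end
```

A 401 response code would suggest that the credentials do not match those configured for the proxy.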

Of course, you should connect to the box with SSH and check things out:

    vagrant ssh lucid64

    ps aux | grep elasticsearch
    service elasticsearch status --verbose
    curl 'http://localhost:9200/_cluster/health?pretty'
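
If you would rather script the health check than read the curl output, the response can be parsed in Ruby. The sketch below assumes the default address used above; the function names and the --live flag are hypothetical. It only contacts the node when --live is given, and its exit status reflects whether the cluster reports the "green" state.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Extract the cluster status ("green", "yellow" or "red") from the JSON
# body returned by the _cluster/health endpoint.
def cluster_status(health_json)
  JSON.parse(health_json).fetch('status')
end

# Fetch the health document from a running node (the local one by default).
def fetch_cluster_health(base_url = 'http://localhost:9200')
  Net::HTTP.get(URI.join(base_url, '/_cluster/health'))
end

# Usage (inside the virtual machine): ruby health.rb --live
if $PROGRAM_NAME == __FILE__ && ARGV.include?('--live')
  status = cluster_status(fetch_cluster_health)
  puts "cluster status: #{status}"
  exit(status == 'green' ? 0 : 1)
end
```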

Cookbook Organization

  • attributes/default.rb: version, paths, memory and naming settings for the node
  • attributes/plugin_aws.rb: AWS settings
  • attributes/proxy_nginx.rb: Nginx settings
  • templates/default/elasticsearch.init.erb: service init script
  • templates/default/elasticsearch.yml.erb: main elasticsearch configuration file
  • templates/default/elasticsearch-env.sh.erb: environment variables needed by the Java Virtual Machine and elasticsearch
  • templates/default/elasticsearch_proxy_nginx.conf.erb: the reverse proxy configuration
  • templates/default/elasticsearch.conf.erb: Monit configuration file
  • files/default/tests/minitest: integration tests

License

Author: Karel Minarik (karmi@karmi.cz)

MIT LICENSE