A modular orchestration engine

Blender

Blender is a modular remote command execution framework. It provides a few basic primitives to automate workflows across servers. Workflows can be expressed in a plain Ruby DSL and executed using the CLI.

The following is an example of a simple blender script that updates the package index of three Ubuntu servers.

# example.rb
ssh_task 'update' do
  execute 'sudo apt-get update -y'
  members ['ubuntu01', 'ubuntu02', 'ubuntu03']
end

It can be executed as:

blend -f example.rb

Output:

Run[example.rb] started
 3 job(s) computed using 'Default' strategy
  Job 1 [update on ubuntu01] finished
  Job 2 [update on ubuntu02] finished
  Job 3 [update on ubuntu03] finished
Run finished (42.228923876 s)

A workflow can have multiple tasks, and individual tasks can have different members, which can be run in parallel.

# example.rb
ssh_task 'update' do
  execute 'sudo apt-get update -y'
  members ['ubuntu01', 'ubuntu02', 'ubuntu03']
end

ssh_task 'install' do
  execute 'sudo apt-get install screen -y'
  members ['ubuntu01', 'ubuntu03']
end

concurrency 2

Output:

Run[blends/example.rb] started
 5 job(s) computed using 'Default' strategy
  Job 1 [update on ubuntu01] finished
  Job 2 [update on ubuntu02] finished
  Job 4 [install on ubuntu01] finished
  Job 3 [update on ubuntu03] finished
  Job 5 [install on ubuntu03] finished
Run finished (4.462043017 s)
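
In the output above, job 4 finishes before job 3, which is consistent with more than one job being in flight at a time. The following is a minimal plain-Ruby sketch of bounded concurrency, offered only as a mental model for `concurrency 2`. It is not Blender's scheduler, and `run_with_concurrency` and the job hashes are hypothetical names invented for this illustration.

```ruby
# Illustrative only: run jobs with at most `limit` of them in flight.
# `run_with_concurrency` is a hypothetical helper, not part of Blender.
def run_with_concurrency(jobs, limit)
  queue = Queue.new
  jobs.each { |job| queue << job }
  limit.times { queue << :stop } # one stop marker per worker thread

  finished = Queue.new # thread-safe record of completion order
  workers = Array.new(limit) do
    Thread.new do
      while (job = queue.pop) != :stop
        job[:work].call
        finished << job[:name]
      end
    end
  end
  workers.each(&:join)

  Array.new(finished.size) { finished.pop }
end

jobs = %w[ubuntu01 ubuntu02 ubuntu03].map do |host|
  { name: "update on #{host}", work: -> { sleep 0.01 } }
end
puts run_with_concurrency(jobs, 2)
```

Completion order depends on how long each job takes, so the printed order may differ between runs, just as job numbers can appear out of sequence in a real run.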

Blender provides various types of task execution (such as arbitrary Ruby code, commands over SSH, Serf handlers, etc.), which can ease automating large cluster maintenance, multi-stage provisioning, establishing cross-server feedback loops, and so on.

Installation

Blender is published as pd-blender on RubyGems. You can install it with:

gem install pd-blender

Or, if you are using Bundler, declare it as a dependency in your Gemfile:

gem 'pd-blender'

Concepts

Blender is composed of two components:

  • Tasks and drivers - Tasks encapsulate commands (or an equivalent abstraction). A blender script can have multiple tasks. Tasks are executed using drivers. Tasks can declare their target hosts.

  • Scheduling strategy - Determines the order of task execution across the hosts. Every blender script has one and only one scheduling strategy. A scheduling strategy takes the task list as input and produces a list of jobs, which are executed using drivers.
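
The flow described above (tasks go into a strategy, jobs come out, and drivers run the jobs) can be pictured with a small plain-Ruby model. This is a sketch under stated assumptions, not Blender's internals: `Task`, `Job`, `RecordingDriver` and `run_workflow` are hypothetical names, and the driver only records what it would run.

```ruby
# Hypothetical stand-ins for illustration; not Blender's classes.
Task = Struct.new(:name, :command, :hosts)
Job  = Struct.new(:tasks, :hosts)

# A driver knows *how* to run a job. This one only records what it would run.
class RecordingDriver
  attr_reader :log

  def initialize
    @log = []
  end

  def execute(job)
    job.tasks.product(job.hosts).each do |task, host|
      @log << "#{task.command} @ #{host}"
    end
  end
end

# A strategy is modelled as any callable that maps tasks to jobs.
def run_workflow(tasks, strategy, driver)
  strategy.call(tasks).each { |job| driver.execute(job) }
  driver.log
end

one_job_per_task = ->(tasks) { tasks.map { |t| Job.new([t], t.hosts) } }
tasks = [Task.new('update', 'sudo apt-get update -y', %w[ubuntu01 ubuntu02])]
puts run_workflow(tasks, one_job_per_task, RecordingDriver.new)
```

Keeping the strategy and the driver as separate arguments mirrors the separation of concerns in the list above: either one can be swapped without touching the task declarations.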

Tasks

Tasks and drivers complement each other. Tasks act as the front end, where we declare what needs to be done, while drivers interpret how those tasks are carried out. For example, ssh_task is used to declare tasks, while the ssh and ssh_multi drivers execute ssh_tasks. Blender core ships with the following tasks and drivers:

  • shell_task: executes commands on the current host. Shell tasks can only have 'localhost' as their member; the presence of any other host in the members list will raise an exception. shell_tasks are executed using the shell_out driver. Example:
shell_task 'foo' do
  execute 'sudo apt-get update -y'
end
  • ruby_task: executes Ruby blocks against the current host. Host names from the members list are passed to the block. ruby_tasks are executed using the Blender::Ruby driver. Example:
ruby_task 'baz' do
  execute do |host|
    puts "Host name is: #{host}"
  end
end
  • ssh_task: executes commands against remote hosts using SSH. Blender ships with two SSH drivers: one based on the vanilla Ruby net-ssh binding, and another based on net-ssh-multi (which supports parallel execution). Example:
ssh_task 'bar' do
  execute 'sudo apt-get update -y'
  members ['host1', 'host2']
end
  • scp_task: downloads or uploads files using scp. Example:
members ['host1', 'host2', 'host3']
scp_upload '/foo/bar' do
  from '/path/to/remote/file'
end
scp_download '/foo/bar' do
  to '/local/path'
end
  • blend_task: invokes a blender script as a task (nesting). Example:
members ['host1', 'host2', 'host3']
blend_task 'test-task' do
  file '/path/to/remote/file'
  concurrency 10
  strategy :per_host
end

As mentioned earlier, tasks are executed using drivers. Tasks can declare their preferred driver, or Blender will assign one automatically. Blender will reuse the global driver if it's compatible; otherwise it will create one. By default, the global driver is a shell_out driver. Drivers can expose host concurrency, stdout/stderr streaming and various other customizations specific to their own implementations.
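
The selection order described above can be written down as a small decision function. This is only a model of the rule as stated in this README, not Blender's actual code: `resolve_driver`, the `COMPATIBLE` table and both structs are hypothetical, and the compatibility mapping is an assumption derived from the task descriptions above.

```ruby
# Hypothetical model of driver selection; names are illustrative only.
Driver = Struct.new(:kind)
Task   = Struct.new(:name, :type, :preferred_driver)

# Assumed mapping of task types to the driver kinds that can run them.
COMPATIBLE = {
  shell_task: [:shell_out],
  ruby_task:  [:ruby],
  ssh_task:   [:ssh, :ssh_multi]
}.freeze

def resolve_driver(task, global_driver)
  # 1. A driver declared on the task itself wins.
  return task.preferred_driver if task.preferred_driver

  kinds = COMPATIBLE.fetch(task.type)
  # 2. Reuse the global driver when it can run this task type.
  return global_driver if kinds.include?(global_driver.kind)

  # 3. Otherwise create a new driver of the first compatible kind.
  Driver.new(kinds.first)
end

global = Driver.new(:shell_out)
puts resolve_driver(Task.new('foo', :shell_task, nil), global).kind
puts resolve_driver(Task.new('bar', :ssh_task, nil), global).kind
```

With a shell_out global driver, the shell task reuses it while the ssh task gets a newly created ssh driver.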

Scheduling strategies

Scheduling strategies are the most crucial part of a blender script. They decide the order of command execution across distributed nodes. Each blender script is invoked using one strategy. Consider a strategy as a transformation, where the input is tasks and the output is jobs. Tasks and jobs are quite similar in structure (both hold commands and hosts), except that a job can hold multiple tasks. We'll come to this later, but first, let's see how the default strategy works.

  • default strategy: takes the list of declared tasks (and the members associated with each task) and breaks them up into per-node jobs. For example:
members ['host1', 'host2', 'host3']

ruby_task 'test' do
  execute do |host|
    Blender::Log.info(host)
  end
end

will result in 3 jobs: ruby_task[test] on host1, ruby_task[test] on host2 and ruby_task[test] on host3. These three jobs are then executed serially. The following will create 6 jobs.

members ['host1', 'host2', 'host3']

ruby_task 'test 1' do
  execute do |host|
    Blender::Log.info("test 1 on #{host}")
  end
end

ruby_task 'test 2' do
  execute do |host|
    Blender::Log.info("test 2 on #{host}")
  end
end

The next one will create 4 jobs, because the second task declares a single member and therefore yields only one job.

members ['host1', 'host2', 'host3']

ruby_task 'test 1' do
  execute do |host|
    Blender::Log.info("test 1 on #{host}")
  end
end

ruby_task 'test 2' do
  execute do |host|
    Blender::Log.info("test 2 on #{host}")
  end
  members ['host3']
end

The default strategy is conservative, and allows drivers that work against a single remote host to be integrated with blender. It also allows the most fine-grained level of job control.
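
The job counts quoted above (3, 6 and 4) follow from pairing every task with each of its hosts. The sketch below reproduces that arithmetic in plain Ruby. It is an illustration of the rule, not Blender's implementation; `Task`, `Job` and `default_strategy` are hypothetical names, and the global `members` list is modelled simply by passing the same host array to each task.

```ruby
# Hypothetical stand-ins for illustration; not Blender's classes.
Task = Struct.new(:name, :hosts)
Job  = Struct.new(:id, :tasks, :hosts)

# Default-style strategy: one job per (task, host) pair, in declaration order.
def default_strategy(tasks)
  pairs = tasks.flat_map { |task| task.hosts.map { |host| [task, host] } }
  pairs.each_with_index.map { |(task, host), i| Job.new(i + 1, [task], [host]) }
end

all_hosts = %w[host1 host2 host3]
one_task  = [Task.new('test', all_hosts)]
two_tasks = [Task.new('test 1', all_hosts), Task.new('test 2', all_hosts)]
narrowed  = [Task.new('test 1', all_hosts), Task.new('test 2', ['host3'])]

puts default_strategy(one_task).size  # 3
puts default_strategy(two_tasks).size # 6
puts default_strategy(narrowed).size  # 4
```

Because every job carries exactly one task and one host, a driver never needs to know how to talk to more than one host at a time.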

Apart from the default strategy, Blender ships with two more strategies:

  • per task strategy: creates one job per task. The following example will create 2 jobs, each with three hosts and one of the ruby_tasks.
members ['host1', 'host2', 'host3']

strategy :per_task

ruby_task 'test 1' do
  execute do |host|
    Blender::Log.info("test 1 on #{host}")
  end
end

ruby_task 'test 2' do
  execute do |host|
    Blender::Log.info("test 2 on #{host}")
  end
end

The per task strategy allows drivers to optimize individual command execution across multiple hosts. For example, the ssh_multi driver allows parallel command execution across many hosts, and can be used as:

strategy :per_task
global_driver(:ssh_multi, concurrency: 50)
ssh_task 'run chef' do
  execute 'sudo chef-client --no-fork'
end

Note: if we use the default strategy, the ssh_multi driver won't be able to leverage its concurrency features, as each resulting job (which the driver receives) will have only one host.

  • per host strategy: creates one job per host. The following example will create 3 jobs, each with one host and 2 ruby tasks. Thus both tasks will be executed on one host, then on the next one, and so on. Think of scenarios such as deployments with rolling restarts. This also allows drivers to optimize the execution of multiple tasks/commands against individual hosts (session reuse, etc.).
strategy :per_host
members ['host1', 'host2', 'host3']

ruby_task 'test 1' do
  execute do |host|
    Blender::Log.info("test 1 on #{host}")
  end
end
ruby_task 'test 2' do
  execute do |host|
    Blender::Log.info("test 2 on #{host}")
  end
end

Note: this strategy does not work if different tasks have different hosts.
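
The two strategies above group the same tasks and hosts along different axes. The following plain-Ruby sketch shows both groupings side by side. It is a model of the behaviour described in this section, not Blender's code: all names are hypothetical, and raising an `ArgumentError` when hosts differ between tasks is an assumption standing in for "does not work".

```ruby
# Hypothetical stand-ins for illustration; not Blender's classes.
Task = Struct.new(:name, :hosts)
Job  = Struct.new(:id, :tasks, :hosts)

# One job per task. Each job carries all hosts of its task, which is what
# lets a multi-host driver fan a single command out in parallel.
def per_task_strategy(tasks)
  tasks.each_with_index.map { |task, i| Job.new(i + 1, [task], task.hosts) }
end

# One job per host. Each job carries every task, so all tasks finish on one
# host before the next host starts (rolling-restart style).
def per_host_strategy(tasks)
  return [] if tasks.empty?

  host_lists = tasks.map { |task| task.hosts.sort }.uniq
  # Assumed behaviour: reject task lists whose hosts differ.
  raise ArgumentError, 'per host needs identical hosts on every task' if host_lists.size > 1

  tasks.first.hosts.each_with_index.map { |host, i| Job.new(i + 1, tasks, [host]) }
end

hosts = %w[host1 host2 host3]
tasks = [Task.new('test 1', hosts), Task.new('test 2', hosts)]

puts per_task_strategy(tasks).size # 2 jobs, 3 hosts each
puts per_host_strategy(tasks).size # 3 jobs, 2 tasks each
```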

It's fairly easy to write custom scheduling strategies, and they can be used to rewrite or rearrange hosts/tasks as you wish. Examples include a null strategy that returns 0 jobs irrespective of the tasks or members you pass, or a custom strategy that takes the host lists of all tasks and dynamically picks only one of them for jobs, based on some metrics.
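
To make the two ideas above concrete, here is a plain-Ruby sketch of a null strategy and of one possible reading of the metric-based idea (keep a single host per task, chosen by lowest load). It does not use Blender's strategy API, which this README does not describe; `null_strategy`, `least_loaded_strategy` and the `load` hash are hypothetical.

```ruby
# Hypothetical stand-ins for illustration; not Blender's classes.
Task = Struct.new(:name, :hosts)
Job  = Struct.new(:id, :tasks, :hosts)

# Null strategy: ignores its input and schedules nothing.
def null_strategy(_tasks)
  []
end

# Metric-based strategy: for each task, keep only the host with the lowest
# value in `load`, a caller-supplied host => number lookup (an assumption).
def least_loaded_strategy(tasks, load)
  tasks.each_with_index.map do |task, i|
    host = task.hosts.min_by { |h| load.fetch(h) }
    Job.new(i + 1, [task], [host])
  end
end

tasks = [Task.new('test 1', %w[host1 host2 host3])]
load  = { 'host1' => 0.9, 'host2' => 0.2, 'host3' => 0.5 }

puts null_strategy(tasks).size                       # 0
puts least_loaded_strategy(tasks, load).first.hosts  # host2
```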

Host discovery

For workflows that depend on dynamic infrastructure, where host names change, Blender provides abstractions that facilitate discovering hosts. blender-chef and blender-serf use these abstractions to allow remote job orchestration for Chef or Serf managed infrastructure.

The following are some examples:

  • serf: discover hosts using serf membership
require 'blender/serf'

ruby_task 'print host name' do
  execute do |host|
    Blender::Log.info("Host: #{host}")
  end
  members search(:serf, name: '^lt-.*$')
end
  • chef: discover hosts using Chef search
require 'blender/discoveries/chef'

ruby_task 'print host name' do
  execute do |host|
    Blender::Log.info("Host: #{host}")
  end
  members search(:chef, 'roles:web')
end
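
Conceptually, a discovery call such as the ones above resolves a backend name plus a query into a plain list of host names, which then becomes the task's members. The sketch below models that idea with a fixed in-memory inventory. It is not how blender-serf or blender-chef work internally: the `:inventory` backend, the `DISCOVERIES` table, the host names and this top-level `search` method are all hypothetical stand-ins.

```ruby
# Hypothetical discovery registry; a real backend would query Serf or Chef.
DISCOVERIES = {
  # Stand-in backend: filter a fixed host list by a name pattern.
  inventory: ->(opts) { %w[lt-web01 lt-web02 db01].grep(Regexp.new(opts.fetch(:name))) }
}.freeze

# Resolve a backend by type and return the host names it discovers.
def search(type, opts = {})
  DISCOVERIES.fetch(type).call(opts)
end

p search(:inventory, name: '^lt-.*$') # ["lt-web01", "lt-web02"]
```

Because the result is an ordinary array of host names, the rest of the workflow does not need to know whether the members were listed by hand or discovered.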

Invoking blender periodically with Rufus scheduler

Blender is designed to be used as a standalone script that can be invoked on demand, or consumed as a library, i.e. workflows are written as plain Ruby objects and invoked from other tools or applications. Apart from these, Blender can also be used for periodic job execution. Underneath, it uses Rufus::Scheduler to trigger a Blender run after a fixed interval (which can also be expressed in cron syntax, thanks to Rufus).

The following will run the example.rb blender script every 4 hours.

schedule '/path/to/example.rb' do
  # minute 0 of every fourth hour; a '*' minute field would fire every minute of those hours
  cron '0 */4 * * *'
end
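
In standard five-field cron syntax the first field is the minute and the second is the hour, so the minute field decides whether a `*/4` hour schedule fires once or sixty times within each matching hour. The plain-Ruby sketch below counts firings per day for two expressions. It is a worked example only: `expand_field` and `firings_per_day` are hypothetical helpers that handle just `*`, `*/n` and literal numbers, and they do not use Rufus.

```ruby
# Expand one cron field ("*", "*/n" or a literal number) into the values it
# matches within 0...max. Only these three forms are supported here.
def expand_field(field, max)
  case field
  when '*'              then (0...max).to_a
  when %r{\A\*/(\d+)\z} then (0...max).step(Regexp.last_match(1).to_i).to_a
  when /\A\d+\z/        then [field.to_i]
  else raise ArgumentError, "unsupported field: #{field}"
  end
end

# Count firings per day from the minute and hour fields only, assuming the
# remaining three fields are all "*".
def firings_per_day(expr)
  minute, hour = expr.split
  expand_field(minute, 60).size * expand_field(hour, 24).size
end

p expand_field('*/4', 24)          # [0, 4, 8, 12, 16, 20]
p firings_per_day('0 */4 * * *')   # 6
p firings_per_day('* */4 * * *')   # 360
```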

Ignore failure

Blender will fail the execution immediately if any job fails. The ignore_failure attribute can be used to continue execution even after a failure. It can be declared per task as well as globally.

shell_task 'fail' do
  execute 'ls /does/not/exists'
  ignore_failure true
end
shell_task 'will be executed' do
  execute 'echo "Thrust is what we need"'
end
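
The fail-fast rule and its two overrides can be modelled in a few lines of plain Ruby. This is a sketch of the semantics described above, not Blender's runner: `Job`, `run_jobs` and the `ignore_failure:` keyword are hypothetical, and a raised exception stands in for a failing command.

```ruby
# Hypothetical stand-in for illustration; not Blender's job class.
Job = Struct.new(:name, :work, :ignore_failure)

# Run jobs in order. A failing job aborts the run unless the job itself or
# the global flag says to ignore it. Returns the names of jobs attempted.
def run_jobs(jobs, ignore_failure: false)
  attempted = []
  jobs.each do |job|
    attempted << job.name
    begin
      job.work.call
    rescue StandardError => e
      raise unless job.ignore_failure || ignore_failure

      warn "ignoring failure in '#{job.name}': #{e.message}"
    end
  end
  attempted
end

jobs = [
  Job.new('fail', -> { raise 'command exited with non-zero status' }, true),
  Job.new('will be executed', -> { :ok }, false)
]
p run_jobs(jobs) # ["fail", "will be executed"]
```

Setting the first job's flag to false and leaving the global flag off would re-raise the error, so the second job would never run.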

Event handlers

Blender provides an event dispatch facility (inspired by Chef), where arbitrary logic can be hooked into the event system (e.g. HipChat notification handlers, statsd handlers, etc.), and blender will automatically invoke it during key events. As of now, events are available before and after a run and for each job execution. The event dispatch system is likely to become more elaborate, and blender might ship a few common event handlers (metrics, notifications, etc.) in the near future.
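
The general shape of such an event system is a dispatcher that forwards named events to every registered handler that implements them. The plain-Ruby sketch below illustrates the pattern. It does not reproduce Blender's handler API, which this README does not specify: `EventDispatcher`, `LogHandler` and the event names `run_started`, `job_started` and `job_finished` are hypothetical.

```ruby
# Hypothetical dispatcher; class and event names are illustrative only.
class EventDispatcher
  def initialize
    @handlers = []
  end

  def register(handler)
    @handlers << handler
  end

  # Invoke `event` on every handler that implements it; others are skipped.
  def dispatch(event, *args)
    @handlers.each do |handler|
      handler.public_send(event, *args) if handler.respond_to?(event)
    end
  end
end

# A handler only defines the events it cares about.
class LogHandler
  attr_reader :lines

  def initialize
    @lines = []
  end

  def run_started(name)
    @lines << "run #{name} started"
  end

  def job_finished(job)
    @lines << "job #{job} finished"
  end
end

dispatcher = EventDispatcher.new
handler = LogHandler.new
dispatcher.register(handler)

dispatcher.dispatch(:run_started, 'example.rb')
dispatcher.dispatch(:job_started, 'update on ubuntu01') # not implemented, skipped
dispatcher.dispatch(:job_finished, 'update on ubuntu01')
puts handler.lines
```

Checking `respond_to?` before dispatching lets a notification handler and a metrics handler subscribe to different subsets of events without sharing a base class.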

Ancillary projects

Blender has a few ancillary projects for integration with other systems. Some of them are:

  • ZooKeeper-based locking for distributed blender deployments: blender-zk
  • Serf-based host discovery and command dispatch: blender-serf
  • Chef-based host discovery: blender-chef

Supported ruby versions

Blender currently supports the following Ruby versions:

  • Ruby 1.9.3
  • Ruby 2.1.0
  • Ruby 2.1.2

License

Apache 2

Contributing

  1. Fork it ( https://github.com/PagerDuty/blender/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request