Library and tool to provision secure clusters easily on Linode cloud

Primary language: Python. License: MIT.

Note: This project is still under active feature development, and a production-ready release is expected around July 26, 2017. Until then, please expect bugs and use it with caution.

Linode Cluster Toolkit

The Linode Cluster Toolkit project's goal is to make provisioning and configuration of large secure clusters on Linode cloud simple for users and applications.

The project consists of two main components:

  • Linode Cluster Toolkit or LCT

    LCT is a Python library that provides interfaces for provisioning, configuring and querying clusters. See the Architecture diagram for a list of interfaces it supports.

    LCT's interfaces and functionality are designed to be useful for a wide spectrum of client applications - from simple command-line tools and scripts to multi-tenant SaaS systems and web applications.

    This is achieved by providing multiple implementations of every service. One implementation can be extremely simple and suitable for a single user on a personal computer, while another may integrate with complex software that provides production-grade services, making it suitable for large multi-tenant web applications and SaaS systems.

    LCT uses and integrates with Linode's v4 and v3 APIs and StackScripts, and optionally with well-known tools like cloud-init and Ansible, purpose-built software like HashiCorp Vault for secrets management and Celery for task queuing, and SQL databases for cluster information storage and querying.

    It supports both Python 3 and Python 2 environments.

  • LinodeTool

    LinodeTool is a command-line tool that uses LCT to provision and configure clusters and single nodes.

The main features are:

  • Cluster Provisioning
    • all cluster resources and configurations are described in Cluster Plans
    • cross-region clusters
    • can provision Linodes, NodeBalancers, Disks, Block Stores
    • private cloud [Under implementation]
    • create clusters from cluster plans using Ansible module [Planned]
    • create clusters from shell scripts using shell scripts wrapper [Planned]
  • Cluster Configuration
    • oriented towards making big data deployments on Linode easy
    • configure using cloud-init [Under implementation]
    • configure using Ansible module
    • configure using StackScripts
    • Hostnames [Under implementation]
    • dynamic firewall configuration across multiple nodes based on deployed software [Planned]
    • advanced DNS topologies, split-horizon DNS provisioning [Planned]
    • bundled cluster plan templates for big data stacks [Planned]
  • Security
    • secure by default configurations for all provisioned nodes
    • all nodes configured with tight firewall rules - drop all incoming and outgoing traffic by default (except SSH)
    • all nodes have SSH password authentication disabled
    • integrate with secrets management providers like HashiCorp Vault [Under implementation]
  • Cluster Operations
    • clusters are treated as first-class concepts
    • start cluster, stop cluster
    • support cluster orchestration (such as shutting down in particular order) [Under implementation]
  • Inventory Operations
    • cluster state and node information are persisted to storage backends
    • support for multiple storage backends
    • tagging and querying support
  • Single Node Operations
    • one-liners to create and delete a node

Both the toolkit library and LinodeTool are part of the same Python package.

Python 3 or Python 2 must be installed on the machine where the toolkit or LinodeTool is executed. Most modern Linux distributions already include Python.

Until this package is published to PyPI, install it using pip to pull from this GitHub repo:

# Python 3
$ pip3 install git+https://github.com/pathbreak/linode-cluster-toolkit.git

# Python 2
$ pip install git+https://github.com/pathbreak/linode-cluster-toolkit.git

LCT does not install any of the third-party software it can integrate with. Depending on your application's requirements and the infrastructure already available to you, you can optionally install one or more of the following:

  • HashiCorp Vault for enterprise grade secrets management

    See Install Vault for installation procedure.

  • Celery for distributed task execution

    Cluster creation can be a time-consuming task. LCT can integrate with Celery's concurrent task execution capabilities to make the process faster, retry failed tasks with exponential back-off, and store a list of failed tasks for later retries.

    See Install Celery for installation.

  • A database for cluster inventory and state storage, and querying

    LCT can integrate with any of the following databases:

An important concept in the LCT project is the Cluster Plan. A Cluster Plan describes all the nodes, NodeBalancers and other resources in a cluster, along with the configurations to apply to them.

See Cluster Plans for examples and details of cluster plans.

The snippet below creates a simple cluster plan consisting of just 2 nodes in 1 region.

from lct import Toolkit, ToolkitContext
from lct.clusters.clusterplan import ClusterPlan

# Create a toolkit configuration to configure the
# providers the toolkit uses for providing its services.
# An empty configuration makes the toolkit select the simplest behavior
# for all services - secrets are handled by the simple secrets provider,
# cluster state and inventories are stored to local filesystem as JSON files
# via TinyDB, tasks are executed by a simple sequential or multithreaded
# queue.
tkconf = {}
tk = Toolkit(tkconf)

tk.initialize()

# Create a ToolkitContext to specify the application and customer context
# for any cluster operation. This is primarily stored as the context for
# storing cluster state and inventory information.
tkctx = ToolkitContext('testapp', 'me')

# Specify a cluster plan. This can be a simple dict or loaded from a YAML or JSON file.
plandict = {
    'name' : 'testcluster',
    'regions': [
        {
            'region' : 'us-east-1a',
            'nodes' : [
                {
                    'name': 'nodeplan1',
                    'type': 'Linode 1024',
                    'count': 2,
                    'distribution' : 'linode/ubuntu16.04lts'
                }
            ]
        }
    ]
}
plan = ClusterPlan(plandict)

# Create the cluster.
tk.cluster_service().create_cluster(tkctx, plan, 'My First Cluster', 'mycluster1')
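
Since a plan is an ordinary dict, it can be inspected before it is handed to the toolkit. The sketch below is a minimal illustration using only the standard library: it loads a plan with the same shape as the example above from JSON text and totals the nodes requested per region. The helper names `load_plan_dict` and `count_nodes` are hypothetical and not part of LCT, the only keys assumed are the ones shown in the example plan, and treating a missing 'count' as 1 is an assumption of this sketch.

```python
import json

# Hypothetical helpers for illustration only; they are not part of the LCT API.


def load_plan_dict(json_text):
    """Parse JSON text into a plan dict and check the keys this sketch relies on."""
    plan = json.loads(json_text)
    for key in ('name', 'regions'):
        if key not in plan:
            raise ValueError('cluster plan has no "%s" key' % key)
    return plan


def count_nodes(plan):
    """Return a dict mapping each region name to the number of nodes requested in it."""
    totals = {}
    for region in plan['regions']:
        # Assumption of this sketch: a node entry without 'count' means one node.
        requested = sum(node.get('count', 1) for node in region.get('nodes', []))
        totals[region['region']] = totals.get(region['region'], 0) + requested
    return totals


plan_json = '''
{
    "name": "testcluster",
    "regions": [
        {
            "region": "us-east-1a",
            "nodes": [
                {
                    "name": "nodeplan1",
                    "type": "Linode 1024",
                    "count": 2,
                    "distribution": "linode/ubuntu16.04lts"
                }
            ]
        }
    ]
}
'''

loaded_plan = load_plan_dict(plan_json)
print(count_nodes(loaded_plan))
```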

A cluster can also be created from a cluster plan file using LinodeTool:

$ linodetool cluster create 'ha-wordpress' ha-wordpress-plan.yaml

Creation of a secure node is as simple as:

$ linodetool node create newark '1gb' 'ubuntu 16.04 lts'

But before that can work, LinodeTool requires a one-time entry of two credentials:

  • A personal access token to use Linode's API

    You can obtain a personal access token by logging into https://cloud.linode.com with your Linode username and password, navigating to My Profile > Integrations > Personal Access Tokens > Create a Personal Access Token, setting Linodes access to one of Create/Modify/Delete, and pressing Create.

    The web application displays a personal access token. Copy that and store it in LinodeTool's secrets storage using this command:

    $ linodetool secret set personal-token <YOUR PERSONAL ACCESS TOKEN>

    Note that LinodeTool's default secrets store is unencrypted and insecure. To store secrets more securely, create a toolkit configuration and specify a more secure secrets provider.

  • An SSH public key.

    If you don't have an SSH public key (usually named ~/.ssh/id_rsa.pub), create one:

    $ ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa -N ''

    Then add it to LinodeTool's secrets store:

    $ linodetool secret set default-root-ssh-public-key ~/.ssh/id_rsa.pub
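
Because the private key (~/.ssh/id_rsa) and the public key (~/.ssh/id_rsa.pub) sit side by side, it is easy to pass the wrong file. As a hedged sketch, the standalone Python 3 check below tests whether some text looks like an OpenSSH public key line before you store it. The function name `looks_like_ssh_public_key` is hypothetical and not part of LinodeTool. It relies only on the OpenSSH public key layout, in which the base64 field decodes to a blob that starts with the length-prefixed key type. The sample key is synthetic and built just for the demonstration.

```python
import base64
import binascii
import struct


def looks_like_ssh_public_key(text):
    """Return True if text has the '<type> <base64> [comment]' public key layout."""
    parts = text.split()
    if len(parts) < 2:
        return False
    key_type, encoded = parts[0], parts[1]
    try:
        expected_name = key_type.encode('ascii')
        blob = base64.b64decode(encoded.encode('ascii'), validate=True)
    except (binascii.Error, ValueError):
        # Not ASCII or not valid base64, so it cannot be a public key line.
        return False
    if len(blob) < 4:
        return False
    # The blob starts with a 4-byte big-endian length followed by the key type name.
    (name_len,) = struct.unpack('>I', blob[:4])
    return blob[4:4 + name_len] == expected_name


# Synthetic sample: a blob that carries the 'ssh-rsa' type name plus filler bytes.
sample_blob = struct.pack('>I', 7) + b'ssh-rsa' + b'\x00' * 16
sample_public = 'ssh-rsa ' + base64.b64encode(sample_blob).decode('ascii') + ' me@host'
sample_private = '-----BEGIN RSA PRIVATE KEY-----\nMIIE\n-----END RSA PRIVATE KEY-----\n'

print(looks_like_ssh_public_key(sample_public))
print(looks_like_ssh_public_key(sample_private))
```

To check a real file, read its contents first, for example with `open(os.path.expanduser('~/.ssh/id_rsa.pub')).read()`, and pass the resulting text to the function.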

Two example cluster plans for large clusters:

  1. https://gist.github.com/pathbreak/59c638db0fd95c84c0f655df145ba0ac

    This is a cluster plan for a cross-region, highly-available, disaster-recoverable 82-node WordPress setup involving Apache web servers with WordPress, Memcached, MySQL cluster with NDB, Block Stores and NodeBalancers.

  2. https://gist.github.com/pathbreak/eb7242a48024b54101b432049116ae7e

    This is a cluster plan for a 52-node big data IoT system involving Spark Streaming, Kafka input pipelines in multiple regions, a PostgreSQL cluster, high memory instances and block stores.

More details about cluster plans are in the subsections below.

TODO

TODO

<TODO describe Toolkit, ToolkitConfiguration and ToolkitContext cardinalities with examples, such
as how to share the same database or same task queues, etc>

Architecture diagram: https://github.com/pathbreak/linode-cluster-toolkit/blob/master/docs/images/toolkit_architecture.png

  • The Toolkit class should be your starting point.
  • Toolkit provides a number of *_service() methods that return an appropriate *Service instance. For example, ClusterService provides cluster management services. InventoryService provides inventory storage and querying services.
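
To make the *_service() pattern concrete, here is a minimal, self-contained sketch of a facade that hands out service objects and falls back to the simplest provider when the configuration is empty. Every class and key in it (`SketchToolkit`, `InMemoryInventoryService`, the 'inventory' configuration key) is hypothetical and chosen only for illustration. The actual LCT classes and configuration format may differ.

```python
class InMemoryInventoryService(object):
    """Hypothetical simplest inventory provider: keeps node records in a list."""

    def __init__(self):
        self._nodes = []

    def add_node(self, name, tags):
        self._nodes.append({'name': name, 'tags': set(tags)})

    def find_by_tag(self, tag):
        # Return node names in insertion order.
        return [node['name'] for node in self._nodes if tag in node['tags']]


class SketchToolkit(object):
    """Hypothetical facade that maps configuration to service implementations."""

    # Registry of providers for the inventory service. A fuller system could
    # register database-backed providers under other names.
    _INVENTORY_PROVIDERS = {'memory': InMemoryInventoryService}

    def __init__(self, conf):
        self._conf = conf
        self._inventory = None

    def inventory_service(self):
        # Create the service lazily. An empty configuration selects the
        # simplest provider.
        if self._inventory is None:
            provider = self._conf.get('inventory', 'memory')
            self._inventory = self._INVENTORY_PROVIDERS[provider]()
        return self._inventory


sketch_tk = SketchToolkit({})
inventory = sketch_tk.inventory_service()
inventory.add_node('web1', ['web', 'newark'])
inventory.add_node('db1', ['db', 'newark'])
print(inventory.find_by_tag('newark'))
```

Keeping provider selection inside the facade means client code asks only for a service and never names a concrete implementation, so a simple provider can later be swapped for a production-grade one through configuration alone.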