Easily launch a cluster on Amazon EC2 configured with dask.distributed
,
Jupyter Notebooks, and Anaconda.
You also install dask-ec2
using pip:
$ pip install dask-ec2
You can also install dask-ec2
and its dependencies from the
conda-forge repository using
conda:
$ conda install dask-ec2 -c conda-forge
Note: dask-ec2
uses
boto3
to interact with
Amazon EC2. You can configure your AWS credentials using
Environment Variables
or Configuration Files.
The dask-ec2 up
command can be used to create and provision a cluster on Amazon EC2:
$ dask-ec2 up --help
Usage: dask-ec2 up [OPTIONS]
Options:
--keyname TEXT Keyname on EC2 console [required]
--keypair PATH Path to the keypair that matches the keyname
[required]
--name TEXT Tag name on EC2
--tags TEXT Additional EC2 tags. Comma separated K:V
pairs: K1:V1,K2:V2
--region-name TEXT AWS region [default: us-east-1]
--vpc-id TEXT EC2 VPC ID
--subnet-id TEXT EC2 Subnet ID on the VPC
--iaminstance-name TEXT IAM Instance Name
--ami TEXT EC2 AMI [default: ami-d05e75b8]
--username TEXT User to SSH to the AMI [default: ubuntu]
--type TEXT EC2 Instance Type [default: m3.2xlarge]
--count INTEGER Number of nodes [default: 4]
--security-group TEXT Security Group Name [default: dask-ec2-default]
--security-group-id TEXT Security Group ID (overwrites Security Group
Name)
--volume-type TEXT Root volume type [default: gp2]
--volume-size INTEGER Root volume size (GB) [default: 500]
--file PATH File to save the metadata [default:
cluster.yaml]
--provision / --no-provision Provision salt on the nodes [default: True]
--anaconda / --no-anaconda Bootstrap anaconda [default: True]
--dask / --no-dask Install Dask.Distributed in the cluster
[default: True]
--notebook / --no-notebook Start a Jupyter Notebook in the head node
[default: True]
--nprocs INTEGER Number of processes per worker [default: 1]
--source / --no-source Install Dask/Distributed from git master
[default: False]
-h, --help Show this message and exit.
The minimal required arguments for the dask-ec2 up
command are:
$ dask-ec2 up --keyname my_aws_key --keypair ~/.ssh/my_aws_key.pem
This will create a cluster.yaml
in the directory that it was executed, and
this file is required to use the other commands in the CLI.
Once a cluster is running, the dask-ec2
command can be used to create or destroy
a cluster, ssh into nodes, or other functions:
$ dask-ec2
Usage: dask-ec2 [OPTIONS] COMMAND [ARGS]...
Options:
--version Show the version and exit.
-h, --help Show this message and exit.
Commands:
anaconda Provision anaconda
dask-distributed dask.distributed option
destroy Destroy cluster
notebook Provision the Jupyter notebook
provision Provision salt instances
ssh SSH to one of the node. 0-index
up Launch instances