robpearce-flux/terraform-aws-gitlab-runner

Terraform module for GitLab auto scaling runners on AWS spot instances

"Added support for runners.machine.autoscaling parameters, which replace all deprecated off peak settings. In case you use any of the variables off_peak_*, please upgrade. The default example contains an example."

"Added support to download docker machine from a different location, e.g. https://gitlab.com/gitlab-org/ci-cd/docker-machine"

Terraform versions

Terraform 0.12

Module is available as Terraform 0.12 module, pin to version 4.x. Please submit pull-requests to the develop branch.

Migration from 0.11 to 0.12 is tested for the runner-default example. To migrate the runner, execute the following steps.

Update to Terraform 0.12
Migrate your Terraform code via terraform 0.12upgrade.
Update the module from 3.10.0 to 4.0.0, then run terraform init.
Run terraform apply. This should trigger only a re-creation of the auto launch configuration and a minor change in the auto-scaling group.
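
As a rough illustration of the module update step in the list above, the version pin might look like the sketch below. The registry source is taken from the usage example later in this README; the comment stands in for your existing arguments, which are assumed to stay unchanged.

```hcl
module "runner" {
  source = "npalm/gitlab-runner/aws"

  # Was "3.10.0" before the migration; 4.x is the Terraform 0.12 line.
  version = "4.0.0"

  # Existing arguments, as rewritten by `terraform 0.12upgrade`, go here.
}
```

After changing the pin, terraform init downloads the new module version before terraform apply is run.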

Terraform 0.11

Module is available as Terraform 0.11 module, pin module to version 3.x. Please submit pull-requests to the terraform011 branch.

The module

This Terraform module creates a GitLab CI runner. A blog post describes the original version of the runner. See the post at 040code. The original setup of the module is based on the blog post: Auto scale GitLab CI runners and save 90% on EC2 costs.

By default, the runners created by the module use spot instances to run the builds with the docker+machine executor.

Shared cache in S3 with life cycle management to clear objects after x days.
Logs streamed to CloudWatch.
Runner agents registered automatically.

The name of the runner agent and runner is set with the overrides variable. Adding an agent runner name tag does not work.

...
overrides  = {
  name_sg                     = ""
  name_runner_agent_instance  = "Gitlab Runner Agent"
  name_docker_machine_runners = "Gitlab Runner Terraform"
}

//this doesn't work
agent_tags = merge(local.my_tags, map("Name", "Gitlab Runner Agent"))

The runner supports 3 main scenarios:

GitLab CI docker-machine runner - one runner agent

In this scenario the runner agent is running on a single EC2 node and runners are created by docker machine using spot instances. Runners will scale automatically based on configuration. By default the module creates an S3 cache that is shared across runners (spot instances).

GitLab CI docker-machine runner - multiple runner agents

In this scenario multiple runner agents can be created with different configurations by instantiating the module multiple times. Runners will scale automatically based on configuration. The S3 cache can be shared across runners by managing the cache outside the module.

GitLab CI docker runner

In this scenario docker, not docker machine, is used to schedule the builds. Builds will run on the same EC2 instance as the agent. No auto scaling is supported.
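
A minimal sketch of the docker scenario, assuming the registry source from the usage example and the runners_executor and instance_type inputs listed in the Inputs table. The instance size is an illustrative assumption, not a recommendation from this module.

```hcl
module "runner" {
  source = "npalm/gitlab-runner/aws"

  # Required arguments (aws_region, environment, vpc_id, subnets,
  # runners_name, runners_gitlab_url) are omitted in this sketch.

  # Schedule builds with plain docker on the agent instance itself.
  runners_executor = "docker"

  # Builds share the agent instance, so it probably needs more capacity
  # than the t3.micro default. The size chosen here is an assumption.
  instance_type = "m5.large"
}
```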

Prerequisites

Terraform

Ensure you have Terraform installed. The module is based on Terraform 0.12; see .terraform-version for the version used. A handy tool to manage your Terraform version is tfenv.

On macOS it is simple to install tfenv using brew.

brew install tfenv

Next install a Terraform version.

tfenv install <version>

AWS

Ensure you have set up your AWS credentials. The module requires access to IAM, EC2, CloudWatch, S3 and SSM.

JQ & AWS CLI

In order to be able to destroy the module, you will need to run from a host with both jq and aws installed and accessible in the environment.

On macOS it is simple to install them using brew.

brew install jq awscli

Service linked roles

The GitLab runner EC2 instance requires the following service linked roles:

AWSServiceRoleForAutoScaling
AWSServiceRoleForEC2Spot

By default the EC2 instance is allowed to create the required roles, but this can be disabled by setting the option allow_iam_service_linked_role_creation to false. If disabled you must ensure the roles exist. You can create them manually or via Terraform.

resource "aws_iam_service_linked_role" "spot" {
  aws_service_name = "spot.amazonaws.com"
}

resource "aws_iam_service_linked_role" "autoscaling" {
  aws_service_name = "autoscaling.amazonaws.com"
}

GitLab runner token configuration

By default the runner is registered on initial deployment. In previous versions of this module this was a manual process. The manual process is still supported but will be removed in future releases. The runner token will be stored in the AWS SSM parameter store. See example for more details.

To register the runner automatically, set the variable gitlab_runner_registration_config["registration_token"]. This token value can be found in your GitLab project, group, or global settings. For a generic runner you can find the token in the admin section. By default the runner will be locked to the target project and will not run untagged jobs. Below is an example of the configuration map.

gitlab_runner_registration_config = {
  registration_token = "<registration token>"
  tag_list           = "<your tags, comma separated>"
  description        = "<some description>"
  locked_to_project  = "true"
  run_untagged       = "false"
  maximum_timeout    = "3600"
  access_level       = "<not_protected OR ref_protected, ref_protected runner will only run on pipelines triggered on protected branches. Defaults to not_protected>"
}

For migration to the new setup, simply add the runner token to the parameter store. Once the runner is started it will look up the required values via the parameter store. If the value is null, a new runner will be registered and a new token created and stored.

# Set the following variables; look up the values in your Terraform config.
# Shell variable names cannot contain hyphens, so underscores are used.
aws_region="<value of var.aws_region>"
token="<runner-token-see-your-gitlab-runner>"
parameter_name="<value of var.environment>-<value of var.secure_parameter_store_runner_token_key>"

aws ssm put-parameter --overwrite --type SecureString --name "${parameter_name}" --value "${token}" --region "${aws_region}"

Once you have created the parameter, you must remove the variable runners_token from your config. The next time your gitlab runner instance is created it will look up the token from the SSM parameter store.

Finally, the runner still supports the manual runner creation. No changes are required. Please keep in mind that this setup will be removed in future releases.

Access runner instance

A few options are provided to access the runner instance:

Provide a public ssh key to access the runner by setting ``.
Provide an EC2 key pair to access the runner by setting ``.
Access via the Session Manager (SSM) by setting enable_runner_ssm_access to true. The policy to allow access via SSM is not very restrictive.
By setting none of the above, no keys or extra policies will be attached to the instance. You can still configure your own policies by attaching them to runner_agent_role_arn.
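
Attaching your own policy to the runner agent role could look like the sketch below. It assumes the module is instantiated as module.runner and uses the runner_agent_role_name output listed under Outputs. The resource name and the managed policy chosen here are purely illustrative.

```hcl
# Attach an extra managed policy to the role of the runner agent instance.
# "runner_agent_extra" and the policy ARN are illustrative choices.
resource "aws_iam_role_policy_attachment" "runner_agent_extra" {
  role       = module.runner.runner_agent_role_name
  policy_arn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
}
```

The aws_iam_role_policy_attachment resource expects the role name rather than the ARN, which is why the name output is used.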

GitLab runner cache

By default the module creates a cache for the runner in S3. Old objects are automatically removed via a configurable life cycle policy on the bucket.

Creation of the bucket can be disabled and managed outside this module. A good use case is sharing the cache across multiple runners. For this purpose the cache is implemented as a sub module. For more details see the cache module. An example implementation of this use case can be found in the runner-public example.
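
The sketch below shows roughly how a runner could be pointed at an externally managed cache, using the cache_bucket and cache_shared inputs from the Inputs table. The two variables are hypothetical placeholders for a bucket and policy you manage yourself, and treating the policy value as an IAM policy ARN is an assumption; check the runner-public example for the authoritative wiring.

```hcl
# Hypothetical inputs describing a cache bucket managed outside the module.
variable "shared_cache_bucket" { type = string }
variable "shared_cache_policy_arn" { type = string }

module "runner_a" {
  source = "npalm/gitlab-runner/aws"

  # Required arguments are omitted in this sketch.

  cache_shared = true
  cache_bucket = {
    create = false                       # do not create a bucket in this module
    bucket = var.shared_cache_bucket     # name of the existing bucket
    policy = var.shared_cache_policy_arn # assumed: policy granting bucket access
  }
}
```

A second module instance with the same cache_bucket map would then read and write the same cache.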

Usage

Configuration

Update the variables in terraform.tfvars according to your needs and add the following variables. See the previous step for instructions on how to obtain the token.

runner_name  = "NAME_OF_YOUR_RUNNER"
gitlab_url   = "GITLAB_URL"
runner_token = "RUNNER_TOKEN"

The base image used to host the GitLab Runner agent is the latest available Amazon Linux 2 HVM EBS AMI. In previous versions of this module a hard coded list of AMIs per region was provided. This list has been replaced by a search filter to find the latest AMI. Setting the filter to amzn2-ami-hvm-2.0.20200207.1-x86_64-ebs will allow you to version lock the target AMI.
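
As a sketch of the version lock described above, the filter can be set through the ami_filter input, which the Inputs table defines as map(list(string)). The required arguments are omitted here.

```hcl
module "runner" {
  source = "npalm/gitlab-runner/aws"

  # Required arguments are omitted in this sketch.

  # Match exactly one AMI name instead of the default wildcard
  # "amzn2-ami-hvm-2.*-x86_64-ebs".
  ami_filter = {
    name = ["amzn2-ami-hvm-2.0.20200207.1-x86_64-ebs"]
  }
  ami_owners = ["amazon"]
}
```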

Usage module

Below is a basic example of how to use the module. For the dependencies, such as a VPC and SSH keys, have a look at the default example.

module "runner" {
  # https://registry.terraform.io/modules/npalm/gitlab-runner/aws/
  source  = "npalm/gitlab-runner/aws"

  aws_region  = "eu-west-1"
  environment = "spot-runners"

  ssh_public_key = local_file.public_ssh_key.content

  vpc_id                   = module.vpc.vpc_id
  subnet_ids_gitlab_runner = module.vpc.private_subnets
  subnet_id_runners        = element(module.vpc.private_subnets, 0)

  runners_name       = "docker-default"
  runners_gitlab_url = "https://gitlab.com"

  gitlab_runner_registration_config = {
    registration_token = "my-token"
    tag_list           = "docker"
    description        = "runner default"
    locked_to_project  = "true"
    run_untagged       = "false"
    maximum_timeout    = "3600"
  }

}
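
The runners_machine_autoscaling input, which replaces the deprecated off peak variables, could be added to a module block like the one above. The following is a sketch only: the object shape comes from the Inputs table, while the period expression, counts and timezone are illustrative values that follow the period syntax of the GitLab runner runners.machine configuration.

```hcl
module "runner" {
  source = "npalm/gitlab-runner/aws"

  # Required arguments as in the usage example are omitted in this sketch.

  runners_machine_autoscaling = [
    {
      # Keep two idle machines during weekday working hours (illustrative).
      periods    = ["* * 9-17 * * mon-fri *"]
      idle_count = 2
      idle_time  = 1800
      timezone   = "UTC"
    }
  ]
}
```

Outside the listed periods the runner falls back to runners_idle_count and runners_idle_time.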

Examples

A few examples are provided. Use the following steps to deploy. Ensure your AWS and Terraform environment is set up correctly. All commands below should be run from the terraform-aws-gitlab-runner/examples/<example-dir> directory.

SSH keys

SSH keys are generated by Terraform and stored in the generated directory of each example directory.

Versions

The version of Terraform is locked down via tfenv; see the .terraform-version file for the expected version. Providers are locked down as well in the providers.tf file.

Configure

The examples are configured with defaults that should work in general. The samples are generally configured for the region Ireland (eu-west-1). The only parameter that needs to be provided is the GitLab registration token. The token can be found in GitLab in the runner section (global, group or repo scope). Create a file terraform.tfvars and add the registration token.

registration_token = "MY_TOKEN"

Run

Run terraform init to initialize Terraform. Next you can run terraform plan to inspect the resources that will be created.

To create the runner run:

terraform apply

To destroy the runner:

terraform destroy

Requirements

Name	Version
terraform	>= 0.12

Providers

Name	Version
aws	n/a
null	n/a

Inputs

Name	Description	Type	Default	Required
agent_tags	Map of tags that will be added to agent EC2 instances.	`map(string)`	`{}`	no
allow_iam_service_linked_role_creation	Boolean used to control attaching the policy to a runner instance to create service linked roles.	`bool`	`true`	no
ami_filter	List of maps used to create the AMI filter for the Gitlab runner agent AMI. Must resolve to an Amazon Linux 1 or 2 image.	`map(list(string))`	{ "name": [ "amzn2-ami-hvm-2.*-x86_64-ebs" ] }	no
ami_owners	The list of owners used to select the AMI of Gitlab runner agent instances.	`list(string)`	[ "amazon" ]	no
arn_format	ARN format to be used. May be changed to support deployment in GovCloud/China regions.	`string`	`"arn:aws"`	no
aws_region	AWS region.	`string`	n/a	yes
aws_zone	Deprecated. Will be removed in the next major release.	`string`	`"a"`	no
cache_bucket	Configuration to control the creation of the cache bucket. By default the bucket will be created and used as shared cache. To use the same cache across multiple runners disable the creation of the cache and provide a policy and bucket name. See the public runner example for more details.	`map`	{ "bucket": "", "create": true, "policy": "" }	no
cache_bucket_name_include_account_id	Boolean to add current account ID to cache bucket name.	`bool`	`true`	no
cache_bucket_prefix	Prefix for s3 cache bucket name.	`string`	`""`	no
cache_bucket_versioning	Boolean used to enable versioning on the cache bucket, false by default.	`bool`	`false`	no
cache_expiration_days	Number of days before cache objects expires.	`number`	`1`	no
cache_shared	Enables cache sharing between runners, false by default.	`bool`	`false`	no
cloudwatch_logging_retention_in_days	Retention for cloudwatch logs. Defaults to unlimited	`number`	`0`	no
docker_machine_download_url	Full url pointing to a linux x64 distribution of docker machine. Once set `docker_machine_version` will be ignored. For example the GitLab version, https://gitlab-docker-machine-downloads.s3.amazonaws.com/v0.16.2-gitlab.2/docker-machine.	`string`	`""`	no
docker_machine_instance_type	Instance type used for the instances hosting docker-machine.	`string`	`"m5.large"`	no
docker_machine_options	List of additional options for the docker machine config. Each element of this list must be a key=value pair. E.g. '["amazonec2-zone=a"]'	`list(string)`	`[]`	no
docker_machine_role_json	Docker machine runner instance override policy, expected to be in JSON format.	`string`	`""`	no
docker_machine_spot_price_bid	Spot price bid.	`string`	`"0.06"`	no
docker_machine_version	Version of docker-machine. The version will be ignored once `docker_machine_download_url` is set.	`string`	`"0.16.2"`	no
enable_asg_recreation	Enable automatic redeployment of the Runner ASG when the Launch Configs change.	`bool`	`true`	no
enable_cloudwatch_logging	Boolean used to enable or disable the CloudWatch logging.	`bool`	`true`	no
enable_docker_machine_ssm_access	Add IAM policies to the docker-machine instances to connect via the Session Manager.	`bool`	`false`	no
enable_eip	Enable the assignment of an EIP to the gitlab runner instance	`bool`	`false`	no
enable_forced_updates	DEPRECATED! Replaced by `enable_asg_recreation`. Setting this variable to true will do the opposite of what is expected. For backward compatibility the variable will remain for some releases. Old description: Enable automatic redeployment of the Runner ASG when the Launch Configs change.	`string`	`null`	no
enable_gitlab_runner_ssh_access	Enables SSH Access to the gitlab runner instance.	`bool`	`false`	no
enable_kms	Let the module manage a KMS key, logs will be encrypted via KMS. Be aware of the costs of a custom key.	`bool`	`false`	no
enable_manage_gitlab_token	Boolean to enable the management of the GitLab token in SSM. If `true` the token will be stored in SSM, which means the SSM property is a terraform managed resource. If `false` the Gitlab token will be stored in the SSM by the user-data script during creation of the instance. However the SSM parameter is not managed by terraform and will remain in SSM after a `terraform destroy`.	`bool`	`true`	no
enable_ping	Allow ICMP Ping to the ec2 instances.	`bool`	`false`	no
enable_runner_ssm_access	Add IAM policies to the runner agent instance to connect via the Session Manager.	`bool`	`false`	no
enable_runner_user_data_trace_log	Enable bash xtrace for the user data script that creates the EC2 instance for the runner agent. Be aware this could log sensitive data such as your GitLab runner token.	`bool`	`false`	no
enable_schedule	Flag used to enable/disable auto scaling group schedule for the runner instance.	`bool`	`false`	no
environment	A name that identifies the environment, used as prefix and for tagging.	`string`	n/a	yes
gitlab_runner_registration_config	Configuration used to register the runner. See the README for an example, or reference the examples in the examples directory of this repo.	`map(string)`	{ "access_level": "", "description": "", "locked_to_project": "", "maximum_timeout": "", "registration_token": "", "run_untagged": "", "tag_list": "" }	no
gitlab_runner_security_group_ids	A list of security group ids that are allowed to access the gitlab runner agent	`list(string)`	`[]`	no
gitlab_runner_ssh_cidr_blocks	List of CIDR blocks to allow SSH Access to the gitlab runner instance.	`list(string)`	`[]`	no
gitlab_runner_version	Version of the GitLab runner.	`string`	`"13.1.1"`	no
instance_role_json	Default runner instance override policy, expected to be in JSON format.	`string`	`""`	no
instance_type	Instance type used for the GitLab runner.	`string`	`"t3.micro"`	no
kms_deletion_window_in_days	Key rotation window, set to 0 for no rotation. Only used when `enable_kms` is set to `true`.	`number`	`7`	no
kms_key_id	KMS key id to encrypt the CloudWatch logs. Ensure CloudWatch has access to the provided KMS key.	`string`	`""`	no
log_group_name	Option to override the default name (`environment`) of the log group, requires `enable_cloudwatch_logging = true`.	`string`	`null`	no
metrics_autoscaling	A list of metrics to collect. The allowed values are GroupDesiredCapacity, GroupInServiceCapacity, GroupPendingCapacity, GroupMinSize, GroupMaxSize, GroupInServiceInstances, GroupPendingInstances, GroupStandbyInstances, GroupStandbyCapacity, GroupTerminatingCapacity, GroupTerminatingInstances, GroupTotalCapacity, GroupTotalInstances.	`list(string)`	`null`	no
overrides	This map provides the possibility to override some defaults. The following attributes are supported: `name_sg` overrides the `Name` tag for all security groups created by this module. `name_runner_agent_instance` overrides the `Name` tag for the ec2 instance defined in the auto launch configuration. `name_docker_machine_runners` overrides the `Name` tag of spot instances created by the runner agent.	`map(string)`	{ "name_docker_machine_runners": "", "name_runner_agent_instance": "", "name_sg": "" }	no
permissions_boundary	Name of permissions boundary policy to attach to AWS IAM roles	`string`	`""`	no
runner_ami_filter	List of maps used to create the AMI filter for the Gitlab runner docker-machine AMI.	`map(list(string))`	{ "name": [ "ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*" ] }	no
runner_ami_owners	The list of owners used to select the AMI of Gitlab runner docker-machine instances.	`list(string)`	[ "099720109477" ]	no
runner_iam_policy_arns	List of policy ARNs to be added to the instance profile of the runners.	`list(string)`	`[]`	no
runner_instance_ebs_optimized	Enable the GitLab runner instance to be EBS-optimized.	`bool`	`true`	no
runner_instance_spot_price	By setting a spot price bid price the runner agent will be created via a spot request. Be aware that spot instances can be stopped by AWS.	`string`	`null`	no
runner_root_block_device	The EC2 instance root block device configuration. Takes the following keys: `delete_on_termination`, `volume_type`, `volume_size`, `encrypted`, `iops`	`map(string)`	`{}`	no
runner_tags	Map of tags that will be added to runner EC2 instances.	`map(string)`	`{}`	no
runners_additional_volumes	Additional volumes that will be used in the runner config.toml, e.g Docker socket	`list`	`[]`	no
runners_concurrent	Concurrent value for the runners, will be used in the runner config.toml.	`number`	`10`	no
runners_ebs_optimized	Enable runners to be EBS-optimized.	`bool`	`true`	no
runners_environment_vars	Environment variables during build execution, e.g. KEY=Value, see runner-public example. Will be used in the runner config.toml	`list(string)`	`[]`	no
runners_executor	The executor to use. Currently supports `docker+machine` or `docker`.	`string`	`"docker+machine"`	no
runners_gitlab_url	URL of the GitLab instance to connect to.	`string`	n/a	yes
runners_iam_instance_profile_name	IAM instance profile name of the runners, will be used in the runner config.toml	`string`	`""`	no
runners_idle_count	Idle count of the runners, will be used in the runner config.toml.	`number`	`0`	no
runners_idle_time	Idle time of the runners, will be used in the runner config.toml.	`number`	`600`	no
runners_image	Image to run builds, will be used in the runner config.toml	`string`	`"docker:18.03.1-ce"`	no
runners_limit	Limit for the runners, will be used in the runner config.toml.	`number`	`0`	no
runners_machine_autoscaling	Set autoscaling parameters based on periods, see https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runnersmachine-section	list(object({ periods = list(string) idle_count = number idle_time = number timezone = string }))	`[]`	no
runners_max_builds	Max builds for each runner after which it will be removed, will be used in the runner config.toml. By default set to 0, no maxBuilds will be set in the configuration.	`number`	`0`	no
runners_monitoring	Enable detailed cloudwatch monitoring for spot instances.	`bool`	`false`	no
runners_name	Name of the runner, will be used in the runner config.toml.	`string`	n/a	yes
runners_off_peak_idle_count	Deprecated, please use `runners_machine_autoscaling`. Off peak idle count of the runners, will be used in the runner config.toml.	`string`	`-1`	no
runners_off_peak_idle_time	Deprecated, please use `runners_machine_autoscaling`. Off peak idle time of the runners, will be used in the runner config.toml.	`string`	`-1`	no
runners_off_peak_periods	Deprecated, please use `runners_machine_autoscaling`. Off peak periods of the runners, will be used in the runner config.toml.	`string`	`null`	no
runners_off_peak_timezone	Deprecated, please use `runners_machine_autoscaling`. Off peak idle time zone of the runners, will be used in the runner config.toml.	`string`	`null`	no
runners_output_limit	Sets the maximum build log size in kilobytes, by default set to 4096 (4MB)	`number`	`4096`	no
runners_post_build_script	Commands to be executed on the Runner just after executing the build, but before executing after_script.	`string`	`""`	no
runners_pre_build_script	Script to execute in the pipeline just before the build, will be used in the runner config.toml	`string`	`""`	no
runners_pre_clone_script	Commands to be executed on the Runner before cloning the Git repository. this can be used to adjust the Git client configuration first, for example.	`string`	`""`	no
runners_privileged	Runners will run in privileged mode, will be used in the runner config.toml	`bool`	`true`	no
runners_pull_policy	pull_policy for the runners, will be used in the runner config.toml	`string`	`"always"`	no
runners_request_concurrency	Limit number of concurrent requests for new jobs from GitLab (default 1)	`number`	`1`	no
runners_request_spot_instance	Whether or not to request spot instances via docker-machine	`bool`	`true`	no
runners_root_size	Runner instance root size in GB.	`number`	`16`	no
runners_services_volumes_tmpfs	n/a	list(object({ volume = string options = string }))	`[]`	no
runners_shm_size	shm_size for the runners, will be used in the runner config.toml	`number`	`0`	no
runners_token	Token for the runner, will be used in the runner config.toml.	`string`	`"__REPLACED_BY_USER_DATA__"`	no
runners_use_private_address	Restrict runners to the use of a private IP address	`bool`	`true`	no
runners_volumes_tmpfs	n/a	list(object({ volume = string options = string }))	`[]`	no
schedule_config	Map containing the configuration of the ASG scale-in and scale-up for the runner instance. Will only be used if enable_schedule is set to true.	`map`	{ "scale_in_count": 0, "scale_in_recurrence": "0 18 * * 1-5", "scale_out_count": 1, "scale_out_recurrence": "0 8 * * 1-5" }	no
secure_parameter_store_runner_token_key	The key name used store the Gitlab runner token in Secure Parameter Store	`string`	`"runner-token"`	no
ssh_key_pair	Set this to use existing AWS key pair	`string`	`null`	no
subnet_id_runners	Subnet used for hosting the gitlab-runners.	`string`	n/a	yes
subnet_ids_gitlab_runner	List of subnets used for hosting the GitLab runner.	`list(string)`	n/a	yes
tags	Map of tags that will be added to created resources. By default resources will be tagged with name and environment.	`map(string)`	`{}`	no
userdata_post_install	User-data script snippet to insert after GitLab runner install	`string`	`""`	no
userdata_pre_install	User-data script snippet to insert before GitLab runner install	`string`	`""`	no
vpc_id	The target VPC for the docker-machine and runner instances.	`string`	n/a	yes

Outputs

Name	Description
runner_agent_role_arn	ARN of the role used for the ec2 instance for the GitLab runner agent.
runner_agent_role_name	Name of the role used for the ec2 instance for the GitLab runner agent.
runner_agent_sg_id	ID of the security group attached to the GitLab runner agent.
runner_as_group_name	Name of the autoscaling group for the gitlab-runner instance
runner_cache_bucket_arn	ARN of the S3 bucket for the build cache.
runner_cache_bucket_name	Name of the S3 bucket for the build cache.
runner_eip	EIP of the Gitlab Runner
runner_role_arn	ARN of the role used for the docker machine runners.
runner_role_name	Name of the role used for the docker machine runners.
runner_sg_id	ID of the security group attached to the docker machine runners.