Documentation

Description and Architecture

This module simplifies configuring log collection and aggregation with Vector, using AWS S3 as an intermediate cache and AWS OpenSearch (formerly ElasticSearch) as the final destination.

In the above diagram, you can see the components and their relations.

Resources you need to provide (prerequisites)

OpenSearch cluster into which Vector Aggregator will ingest processed logs.
Secrets Manager Secret with vector_username and vector_password, which will be used as Basic Auth credentials for the OpenSearch cluster. This is optional; if it is not provided, Vector Aggregator will use its IAM Role to put data into OpenSearch. If you are accessing the OpenSearch cluster via a VPN Endpoint, make sure you provide the appropriate host name via the header. See the example in examples/iam-auth-opensearch.
EKS cluster where Vector Agent and Vector Aggregator will be deployed.
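
The optional Basic Auth secret could be created with Terraform roughly as follows. This is a minimal sketch, not part of the module: the secret name `vector/opensearch` and both variables are hypothetical, and it assumes the module expects a JSON secret with the keys `vector_username` and `vector_password`.

```hcl
# Hypothetical input variables holding the OpenSearch Basic Auth credentials.
variable "vector_username" {
  type      = string
  sensitive = true
}

variable "vector_password" {
  type      = string
  sensitive = true
}

# Hypothetical secret name; it would be passed to the module as `secret_name`.
resource "aws_secretsmanager_secret" "vector_opensearch" {
  name = "vector/opensearch"
}

# Assumption: the secret value is a JSON object with exactly these two keys.
resource "aws_secretsmanager_secret_version" "vector_opensearch" {
  secret_id = aws_secretsmanager_secret.vector_opensearch.id
  secret_string = jsonencode({
    vector_username = var.vector_username
    vector_password = var.vector_password
  })
}
```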

Resources created by the module

Two components will be installed into the existing EKS cluster using the Helm chart:
- Vector Agent, as a DaemonSet, collects logs and metrics from the node (including application logs) and sends them to the S3 bucket using the ServiceAccount and the connected IAM Role. This decouples log collection from OpenSearch: the cluster can be under maintenance or overloaded, but we don't want to miss any logs. An S3 bucket is cheap and reliable storage, so we use it as a buffer.
- Vector Aggregator, as a StatefulSet, reads the SQS queue for new logs, gets them from the S3 bucket, processes and enriches them, and sends them to the OpenSearch cluster.
S3 bucket with a lifecycle policy that stores logs for a limited period of time (7 days by default). It has notifications configured on the CreateObject event, routed to the SQS queue. A resource policy restricts access to the bucket to the Vector Agent IAM Role and the Vector Aggregator IAM Role only.
SQS queue, the destination for S3 CreateObject notifications, used by the Vector Aggregator to get information about log messages that have to be processed.
Vector Agent IAM Role and Vector Aggregator IAM Role, created to provide granular access to the AWS resources.
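
To illustrate how the bucket, lifecycle rule, notification, and queue fit together, here is a rough sketch using plain AWS provider resources. It is not the module's actual code (the module builds these pieces from the `terraform-aws-modules` S3 and SQS modules), and all names are hypothetical.

```hcl
# Hypothetical names throughout; the module creates its own equivalents.
resource "aws_s3_bucket" "logs" {
  bucket = "vector-example-logs"
}

resource "aws_sqs_queue" "logs" {
  name = "vector-example-logs"
}

# Allow S3 to send messages to the queue, but only on behalf of this bucket.
data "aws_iam_policy_document" "queue" {
  statement {
    actions   = ["sqs:SendMessage"]
    resources = [aws_sqs_queue.logs.arn]

    principals {
      type        = "Service"
      identifiers = ["s3.amazonaws.com"]
    }

    condition {
      test     = "ArnEquals"
      variable = "aws:SourceArn"
      values   = [aws_s3_bucket.logs.arn]
    }
  }
}

resource "aws_sqs_queue_policy" "logs" {
  queue_url = aws_sqs_queue.logs.id
  policy    = data.aws_iam_policy_document.queue.json
}

# Expire buffered log objects after 7 days (the module's default retention).
resource "aws_s3_bucket_lifecycle_configuration" "logs" {
  bucket = aws_s3_bucket.logs.id

  rule {
    id     = "expire-logs"
    status = "Enabled"

    filter {}

    expiration {
      days = 7
    }
  }
}

# Publish a message to the queue whenever a new log object is written.
resource "aws_s3_bucket_notification" "object_created" {
  bucket = aws_s3_bucket.logs.id

  queue {
    queue_arn = aws_sqs_queue.logs.arn
    events    = ["s3:ObjectCreated:*"]
  }

  depends_on = [aws_sqs_queue_policy.logs]
}
```

The queue policy has to exist before the notification is created, because S3 validates that it can deliver to the destination; hence the explicit `depends_on`.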

Explanation of some architectural decisions

The S3 bucket is used as temporary storage so that no logs are lost when the OpenSearch cluster is in the "Red" health status. Messages wait in the SQS queue for later processing. Once logs are ingested into the OpenSearch cluster, we remove the message from the SQS queue.
When we have multiple AWS accounts (one per environment), we use a single OpenSearch cluster. We believe that the Vector Aggregator must be part of each cluster where logs are generated, so that we can test changes to the Vector Aggregator configuration in the lower environments before promoting them to Production.

Requirements

Name	Version
terraform	>= 1.0
aws	>= 5.0.0
helm	>= 2.11.0
utils	>= 1.12.0
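
These constraints might be declared in the calling configuration roughly as shown below. The provider source addresses are not stated in this document; in particular, `cloudposse/utils` is an assumption based on the `utils_deep_merge_yaml` data source.

```hcl
terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = ">= 2.11.0"
    }
    # Assumption: the `utils` provider is the Cloud Posse one.
    utils = {
      source  = "cloudposse/utils"
      version = ">= 1.12.0"
    }
  }
}
```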

Providers

Name	Version
aws	5.44.0
helm	2.13.0
utils	1.19.2

Modules

Name	Source	Version
vector_agent_role	terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc	v5.38.0
vector_aggregator_role	terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc	v5.38.0
vector_s3_bucket	terraform-aws-modules/s3-bucket/aws	4.1.0
vector_sqs	terraform-aws-modules/sqs/aws	4.1.0

Resources

Name	Type
aws_iam_policy.vector_agent	resource
aws_iam_policy.vector_aggregator	resource
aws_s3_bucket_notification.object_created	resource
helm_release.vector_agent	resource
helm_release.vector_aggregator	resource
aws_eks_cluster.eks	data source
aws_iam_policy_document.s3_bucket_policy	data source
aws_iam_policy_document.vector_agent	data source
aws_iam_policy_document.vector_aggregator	data source
aws_region.current	data source
aws_secretsmanager_secret.opensearch_credentials	data source
aws_secretsmanager_secret_version.current	data source
utils_deep_merge_yaml.values_agent_merged	data source
utils_deep_merge_yaml.values_aggregator_merged	data source

Inputs

Name	Description	Type	Default	Required
agent_template_filename	Filename with custom agent configuration.	`string`	`null`	no
agent_template_variables	By default, the agent template has the following variables: `region`, `bucket`, and `cluster_name`. The module substitutes them automatically. If you define additional variables in the template provided via `agent_template_filename`, provide their values here.	`map(string)`	`{}`	no
agent_values_override	Overrides or extends the Agent Helm default values and/or those provided in the custom template `agent_template_filename`.	`any`	`{}`	no
aggregator_template_filename	Filename with custom aggregator configuration.	`string`	`"aggregator.yaml"`	no
aggregator_template_variables	By default, the aggregator template has the following variables: `queue_url`, `endpoint`, `region`, and `eks_cluster_name`. The module substitutes them automatically. If you define additional variables in the template provided via `aggregator_template_filename`, provide their values here.	`map(string)`	`{}`	no
aggregator_values_override	Overrides or extends the Aggregator Helm values provided in the custom template `aggregator_template_filename`.	`any`	`{}`	no
eks_cluster_name	Name of the EKS cluster where Vector will be installed.	`string`	n/a	yes
elasticsearch_endpoint	Endpoint of OpenSearch (ElasticSearch) to which logs are sent, in the format `https://elasticsearch.example.com:443`.	`string`	n/a	yes
helm_chart_config	Helm chart config. Applies to both agent and aggregator. See https://registry.terraform.io/providers/hashicorp/helm/latest/docs	`any`	`{}`	no
name	Name or prefix for resources that will be created.	`string`	`"vector"`	no
s3_bucket_name	By default, the S3 bucket name is generated as `$var.name-$var.eks_cluster_name-logs`. You can override it here by providing a custom name.	`string`	`null`	no
s3_bucket_policy	By default, the module creates a least-privilege S3 bucket policy (access limited to the `Agent IAM Role` and `Aggregator IAM Role`). You can override it by providing an S3 bucket policy JSON document here.	`string`	`null`	no
s3_expiration_days	Number of days to keep log files in the S3 bucket.	`number`	`7`	no
s3_force_destroy	A boolean that indicates all objects should be deleted from the bucket first to destroy the bucket without error.	`bool`	`false`	no
secret_name	Secret containing the `vector_username` and `vector_password` used for Basic Authentication in OpenSearch (ElasticSearch).	`string`	`null`	no
tags	A mapping of tags to assign to the resources.	`map(string)`	`{}`	no
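
A minimal invocation using the inputs above might look like the following sketch. The registry address is an assumption derived from the module name, and the cluster name, secret name, and tag values are hypothetical; only `eks_cluster_name` and `elasticsearch_endpoint` are required.

```hcl
module "vector" {
  # Assumed registry address; check the module's registry page.
  source = "opsworks-co/vector-eks-s3-opensearch/aws"

  # Required inputs (hypothetical values).
  eks_cluster_name       = "example-eks"
  elasticsearch_endpoint = "https://elasticsearch.example.com:443"

  # Optional: use Basic Auth credentials from Secrets Manager.
  # Omit `secret_name` to let the aggregator authenticate with its IAM Role.
  secret_name = "vector/opensearch"

  # Optional: keep buffered logs in S3 for 14 days instead of the default 7.
  s3_expiration_days = 14

  tags = {
    Environment = "staging"
  }
}
```

The `aws` and `helm` providers still need to be configured in the calling configuration so that the Helm releases target the intended EKS cluster.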

Outputs

Name	Description
vector_agent_role	IAM Role ARN created for Vector agent
vector_aggregator_role	IAM Role ARN created for Vector aggregator
vector_s3_bucket_id	S3 bucket created to store logs before they are parsed
vector_sqs_name	SQS queue created to collect events from S3 and pass them to the Vector aggregator
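
The outputs can be re-exported or passed to other resources. A short sketch, assuming the module is instantiated under the hypothetical name `module "vector"`:

```hcl
# Expose the log buffer bucket and queue to the rest of the configuration.
output "vector_log_bucket" {
  value = module.vector.vector_s3_bucket_id
}

output "vector_log_queue" {
  value = module.vector.vector_sqs_name
}

# Role ARNs, e.g. for mapping in OpenSearch access policies.
output "vector_role_arns" {
  value = {
    agent      = module.vector.vector_agent_role
    aggregator = module.vector.vector_aggregator_role
  }
}
```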

opsworks-co/vector-eks-s3-opensearch