/vector-eks-s3-opensearch

This Terraform module simplifies the setup of log collection and aggregation with Vector on AWS EKS, using AWS S3 as an intermediate cache and AWS OpenSearch as the final destination.

Primary language: HCL. License: Apache License 2.0 (Apache-2.0).

Documentation

Description and Architecture

This module was created to simplify the configuration of log collection and aggregation using Vector, with AWS S3 as an intermediate cache and AWS OpenSearch (formerly Elasticsearch) as the final destination.

Architectural diagram

In the above diagram, you can see the components and their relations.

Resources you need to provide (prerequisites)
  • An OpenSearch cluster to which Vector Aggregator will send processed logs.
  • A Secrets Manager Secret with vector_username and vector_password, which will be used as Basic Auth credentials for the OpenSearch cluster. This is optional: if it is not provided, Vector Aggregator will use its IAM Role to put data into OpenSearch. If you access the OpenSearch cluster via a VPN Endpoint, make sure you provide the appropriate host name via the Header. See the example in examples/iam-auth-opensearch.
  • An EKS cluster where Vector Agent and Vector Aggregator will be deployed.
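
The optional secret can be created with Terraform as well. The following is a minimal sketch, not part of this module: it assumes the secret value is stored as a JSON object with the keys vector_username and vector_password, and the secret name and variable names are hypothetical.

```hcl
# Hypothetical input variables holding the Basic Auth credentials.
variable "opensearch_username" {
  type      = string
  sensitive = true
}

variable "opensearch_password" {
  type      = string
  sensitive = true
}

# Secret container. The name is an illustrative assumption.
resource "aws_secretsmanager_secret" "opensearch_credentials" {
  name = "vector/opensearch"
}

# Secret value. Assumption: a JSON object with the two keys named in the
# prerequisites above.
resource "aws_secretsmanager_secret_version" "opensearch_credentials" {
  secret_id = aws_secretsmanager_secret.opensearch_credentials.id
  secret_string = jsonencode({
    vector_username = var.opensearch_username
    vector_password = var.opensearch_password
  })
}
```

The secret name would then be passed to the module through the secret_name input.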
Resources created by the module
  • Two components are installed into the existing EKS cluster using the Helm chart:
    • Vector Agent, deployed as a DaemonSet, collects logs and metrics from the node (including application logs) and sends them to the S3 bucket using its ServiceAccount and the associated IAM Role. This decouples log collection from OpenSearch: the OpenSearch cluster may be under maintenance or overloaded, but no logs should be lost. S3 is cheap and reliable storage, so it is used as a buffer.
    • Vector Aggregator, deployed as a StatefulSet, reads the SQS queue for new logs, fetches them from the S3 bucket, processes and enriches them, and sends them to the OpenSearch cluster.
  • An S3 bucket with a lifecycle policy that stores logs for a limited period of time (7 days by default). Notifications for ObjectCreated events are routed to the SQS queue. A resource policy restricts access to the bucket to the Vector Agent IAM Role and the Vector Aggregator IAM Role.
  • An SQS queue, the destination for S3 ObjectCreated notifications, which the Vector Aggregator uses to learn which log objects have to be processed.
  • A Vector Agent IAM Role and a Vector Aggregator IAM Role, created to provide granular access to the AWS resources.
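
To make the S3-to-SQS wiring described above more concrete, here is a standalone sketch using plain AWS provider resources. It is not the module's actual code (the module builds on the terraform-aws-modules S3 and SQS modules listed below), and the bucket and queue names are hypothetical.

```hcl
# Buffer bucket for raw logs. The name is an illustrative assumption.
resource "aws_s3_bucket" "logs" {
  bucket = "vector-example-cluster-logs"
}

# Expire log objects after 7 days, mirroring the module's default.
resource "aws_s3_bucket_lifecycle_configuration" "logs" {
  bucket = aws_s3_bucket.logs.id

  rule {
    id     = "expire-logs"
    status = "Enabled"

    filter {}

    expiration {
      days = 7
    }
  }
}

# Queue that receives one message per newly created log object.
resource "aws_sqs_queue" "logs" {
  name = "vector-example-cluster-logs"
}

# Allow S3 to send messages to the queue, but only for this bucket.
data "aws_iam_policy_document" "allow_s3" {
  statement {
    actions   = ["sqs:SendMessage"]
    resources = [aws_sqs_queue.logs.arn]

    principals {
      type        = "Service"
      identifiers = ["s3.amazonaws.com"]
    }

    condition {
      test     = "ArnEquals"
      variable = "aws:SourceArn"
      values   = [aws_s3_bucket.logs.arn]
    }
  }
}

resource "aws_sqs_queue_policy" "logs" {
  queue_url = aws_sqs_queue.logs.id
  policy    = data.aws_iam_policy_document.allow_s3.json
}

# Route object-created events from the bucket to the queue.
resource "aws_s3_bucket_notification" "object_created" {
  bucket = aws_s3_bucket.logs.id

  queue {
    queue_arn = aws_sqs_queue.logs.arn
    events    = ["s3:ObjectCreated:*"]
  }

  depends_on = [aws_sqs_queue_policy.logs]
}
```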

Explanation of some architectural decisions

  • The S3 bucket is used as temporary storage so that no logs are lost when the OpenSearch cluster is in "Red" health status. Messages wait in the SQS queue for later processing. Once logs have been ingested into the OpenSearch cluster, the corresponding message is removed from the SQS queue.
  • With multiple AWS accounts (one per environment), we use a single OpenSearch cluster. We believe the Vector Aggregator must be part of each cluster where logs are generated, so that changes to the Vector Aggregator configuration can be tested in the lower environments before being promoted to Production.

Requirements

| Name | Version |
|------|---------|
| terraform | >= 1.0 |
| aws | >= 5.0.0 |
| helm | >= 2.11.0 |
| utils | >= 1.12.0 |

Providers

| Name | Version |
|------|---------|
| aws | 5.44.0 |
| helm | 2.13.0 |
| utils | 1.19.2 |

Modules

| Name | Source | Version |
|------|--------|---------|
| vector_agent_role | terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc | v5.38.0 |
| vector_aggregator_role | terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc | v5.38.0 |
| vector_s3_bucket | terraform-aws-modules/s3-bucket/aws | 4.1.0 |
| vector_sqs | terraform-aws-modules/sqs/aws | 4.1.0 |

Resources

| Name | Type |
|------|------|
| aws_iam_policy.vector_agent | resource |
| aws_iam_policy.vector_aggregator | resource |
| aws_s3_bucket_notification.object_created | resource |
| helm_release.vector_agent | resource |
| helm_release.vector_aggregator | resource |
| aws_eks_cluster.eks | data source |
| aws_iam_policy_document.s3_bucket_policy | data source |
| aws_iam_policy_document.vector_agent | data source |
| aws_iam_policy_document.vector_aggregator | data source |
| aws_region.current | data source |
| aws_secretsmanager_secret.opensearch_credentials | data source |
| aws_secretsmanager_secret_version.current | data source |
| utils_deep_merge_yaml.values_agent_merged | data source |
| utils_deep_merge_yaml.values_aggregator_merged | data source |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| agent_template_filename | Filename with custom agent configuration. | string | null | no |
| agent_template_variables | By default, the agent template has the following variables: region, bucket, and cluster_name. The module replaces them automatically. If you define additional variables in the template provided via agent_template_filename, you need to provide values for them here. | map(string) | {} | no |
| agent_values_override | Overrides or extends the Agent Helm default values and/or those provided in the custom template agent_template_filename. | any | {} | no |
| aggregator_template_filename | Filename with custom aggregator configuration. | string | "aggregator.yaml" | no |
| aggregator_template_variables | By default, the aggregator template has the following variables: queue_url, endpoint, region, and eks_cluster_name. The module replaces them automatically. If you define additional variables in the template provided via aggregator_template_filename, you need to provide values for them here. | map(string) | {} | no |
| aggregator_values_override | Overrides or extends the Aggregator Helm values provided in the custom template aggregator_template_filename. | any | {} | no |
| eks_cluster_name | Name of the EKS cluster where Vector is going to be installed. | string | n/a | yes |
| elasticsearch_endpoint | Endpoint of OpenSearch (Elasticsearch) to which logs are sent, in the format https://elasticsearch.example.com:443. | string | n/a | yes |
| helm_chart_config | Helm chart config. Applies to both agent and aggregator. See https://registry.terraform.io/providers/hashicorp/helm/latest/docs | any | {} | no |
| name | Name or prefix for resources that will be created. | string | "vector" | no |
| s3_bucket_name | By default, the S3 bucket name is generated as $var.name-$var.eks_cluster_name-logs. You can override it here by providing a custom name. | string | null | no |
| s3_bucket_policy | By default, a least-privilege S3 bucket policy is created (access limited to the Agent IAM Role and the Aggregator IAM Role). You can override it by providing an S3 bucket policy JSON document here. | string | null | no |
| s3_expiration_days | How many days to keep log files in the S3 bucket. | number | 7 | no |
| s3_force_destroy | A boolean that indicates all objects should be deleted from the bucket first to destroy the bucket without error. | bool | false | no |
| secret_name | Secret containing vector_username and vector_password, used for Basic Authentication in OpenSearch (Elasticsearch). | string | null | no |
| tags | A mapping of tags to assign to the resources. | map(string) | {} | no |
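
As an illustration of how these inputs fit together, a module call might look like the sketch below. The source path, cluster name, secret name, and tag values are placeholders, and the resources key under agent_values_override assumes that the Vector Helm chart accepts a standard Kubernetes resources block; check the chart's values before relying on it.

```hcl
module "vector" {
  # Placeholder: point this at wherever you consume the module from.
  source = "./modules/vector-eks-s3-opensearch"

  # Required inputs.
  eks_cluster_name       = "example-cluster"
  elasticsearch_endpoint = "https://elasticsearch.example.com:443"

  # Optional: use Basic Auth credentials from Secrets Manager.
  # Leave unset to let the Aggregator authenticate with its IAM Role.
  secret_name = "vector/opensearch"

  # Optional: keep buffered logs in S3 for 14 days instead of the default 7.
  s3_expiration_days = 14

  # Optional: override Helm values for the Agent (assumed chart key).
  agent_values_override = {
    resources = {
      requests = {
        cpu    = "100m"
        memory = "128Mi"
      }
    }
  }

  tags = {
    Environment = "staging"
  }
}
```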

Outputs

| Name | Description |
|------|-------------|
| vector_agent_role | IAM Role ARN created for Vector agent |
| vector_aggregator_role | IAM Role ARN created for Vector aggregator |
| vector_s3_bucket_id | S3 bucket created to store logs before they are parsed |
| vector_sqs_name | SQS queue created to collect events from S3 and pass them to Vector aggregator |
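
The outputs can be re-exported or passed to other resources from the calling configuration. A minimal sketch, assuming the module is instantiated under the hypothetical name vector with a placeholder source path:

```hcl
module "vector" {
  # Placeholder source and values, required inputs only.
  source                 = "./modules/vector-eks-s3-opensearch"
  eks_cluster_name       = "example-cluster"
  elasticsearch_endpoint = "https://elasticsearch.example.com:443"
}

# Re-export the buffer bucket so other stacks can reference it.
output "vector_log_bucket" {
  value = module.vector.vector_s3_bucket_id
}

# Role ARN of the Aggregator, for example to grant it access in
# OpenSearch when IAM authentication is used.
output "vector_aggregator_role_arn" {
  value = module.vector.vector_aggregator_role
}
```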