/insights-storage-broker

Moves files from stage bucket to rejected bucket

Primary LanguagePython

Insights Storage Broker

The Insights Storage Broker microservice handles interaction between the platform and remote stores for storage of payloads that pass through the platform.

How it Works

Storage Workflow UML

Insights Storage Broker can be configured to consume from multiple topics by supplying a configuration file in YAML format with the specified topic, bucket, and formatter for the resulting object in cloud storage.

Example Configuration:

platform.upload.validation:                                     # topic
  normalizer: Validation                                        # normalizer (to be covered further down)
platform.upload.buckit:
  normalizer: Openshift
  services:                                                     # list of services
    openshift:                                                  # service (defined by content type)
      format: "{org_id}/{cluster_id}/{timestamp}-{request_id}"  # format of resulting file object
      bucket: "insights-buck-it-openshift"                      # storage bucket
    ansible:
      format: "{org_id}/{cluster_id}/{timestamp}-{request_id}"
      bucket: "insights-buck-it-ansible"

The configuration file allows for new buckets, topics, and formatters to be added to the service without changing the underlying code.

Support for Validation Messages

Insights Storage Broker consumes from the platform.upload.validation topic. Storage broker expects to recieve all the data in the message that the validation service originally recieved in addition to the the validation key.

If a failure message is recieved, Storage Broker will copy the file to the rejected bucket and not advertise the availability of the payload to the platform.

Validation Workflow: UML

Normalizers

In order to make Storage Broker more dynamic, it supports the ability for a user to define their own data normalizer to provide the required keys to the service. Normalizers can be found in normalizers.py and more can be added if need be, or a service can choose to use an existing normalizer.

The only requirmeent for Storage Broker to determine which normalizer to use is that the service key is available in the root of the JSON being parsed. This is provided by ingress messages by default. Any other keys to be used must be added as attributes of your normalizer class.

The only required keys in a normalizer at this point are size, service, and request_id. Any other keys you establish can be used for formatting the resulting object filename.

Local Development

Prerequisites

  • Python 3.11
  • docker-compose

Step 1: Spin-up dependencies

docker-compose -f compose.yml up

Step 2: Start storage broker

pip install .

BOOTSTRAP_SERVERS=localhost:29092 BUCKET_MAP_FILE=default_map.yaml storage_broker

(Optional) Start the storage broker API

storage_broker_api

Step 3: Produce a sample validation message

make produce_validation_message

Local AWS S3 interaction testing using Minio

You can test the AWS interaction of storage broker using Minio. To do so, you will need to set the following environment variables to the storage broker consumer or API.

S3_ENDPOINT_URL=<minio_api_endpoint/>
AWS_ACCESS_KEY_ID=<minio_access_key_id/>
AWS_SECRET_ACCESS_KEY=<minio_secret_access_key/>
STAGE_BUCKET=<bucket_name/>

For example

AWS_ACCESS_KEY_ID=$MINIO_ACCESS_KEY AWS_SECRET_ACCESS_KEY=$MINIO_SECRET_KEY S3_ENDPOINT_URL=minio_api_endpoint:9000 STAGE_BUCKET=insights-dev-upload-perm storage_brokerer_api

Local Minio access keys are provided here.

Authors

Stephen Adams - Initial Work - SteveHNH