KEDA External Scaler for Azure Cosmos DB

Event-based autoscaler for your Azure Cosmos DB change feed consumer applications running inside a Kubernetes cluster.

Architecture

The following diagram shows the components involved in scaling the application and the relationships between them.

Scenario

  • Monitored Container - The Azure Cosmos DB container that the application needs to monitor for new changes. A Cosmos DB container may contain several logical partitions, one for each distinct partition key value. Logical partitions are grouped under the same Partition Range if they are stored on the same physical partition. For more information, read the documentation on partitioning overview. In general, a container that does not hold a large amount of data has no more than one physical partition.

  • Lease Container - Another Azure Cosmos DB container that keeps track of changes happening on the monitored container. It stores the lease documents that record how far the Change Feed has been processed. The change feed design pattern supports multiple parallel listeners by keeping an independent feed for each partition range. The listener application instances acquire leases on these individual feeds before processing them, which ensures that a change is not processed by multiple application instances. The monitored and lease containers may be in the same Cosmos DB account or in different accounts.

  • KEDA - KEDA runs as a separate service in the Kubernetes cluster. It enables auto-scaling of applications based on internal and, more importantly, external events. Check the KEDA documentation to learn more.

  • External Scaler - While KEDA ships with a set of built-in scalers, it also allows users to extend KEDA through support for external scalers. In this scheme, KEDA queries the user's gRPC service to fetch metrics of an event source and scales the applications accordingly. This is where 'KEDA external scaler for Azure Cosmos DB' plugs itself in. For information on how an external scaler can be implemented, check KEDA external scaler concept.

  • Listener Application(s) - The application Deployment or StatefulSet that you would like to scale in and out using KEDA and the external scaler. For information on how to set up the change feed processor in your application to process changes in a Cosmos DB container, read the documentation on change feed processing.

  • ScaledObject Spec - The specification contains information about the scale target (i.e. the application Deployment that needs to be scaled) and the trigger metadata. The external scaler fetches information about the Cosmos DB lease container from the trigger metadata defined in the ScaledObject resource.

The external scaler calls Cosmos DB APIs to estimate the number of changes pending to be processed. More specifically, the scaler counts the number of partition ranges that have changes remaining to be processed, and requests KEDA to scale the application to that many instances.
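The counting rule described above can be sketched in a few lines. This is a minimal illustration in Python, not the scaler's actual implementation; the function name `desired_replicas` and the input mapping of partition range IDs to estimated pending changes are hypothetical.

```python
from typing import Mapping


def desired_replicas(pending_by_range: Mapping[str, int]) -> int:
    """Count the partition ranges that still have unprocessed changes."""
    return sum(1 for pending in pending_by_range.values() if pending > 0)


# Hypothetical estimates: partition range ID -> estimated pending changes.
estimates = {"0": 120, "1": 0, "2": 7}
print(desired_replicas(estimates))  # prints 2
```

Note that the size of the backlog on a partition range does not affect the result. Since each feed is leased by one listener instance at a time, only the number of ranges with pending work matters in this sketch.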

Note: The architectural diagram above shows KEDA, the external scaler and the target application in different Kubernetes namespaces. This is possible but not necessary. However, the ScaledObject and the application Deployment must reside in the same namespace.

Setup Instructions

⚠️ Caution: The Java SDK v2 client library uses a different naming convention for lease documents inside the lease container. This makes it incompatible with .NET SDK v3, which the external scaler depends on to estimate the pending changes on change feeds. Hence, if you have a Java-based target consumer application, your change feeds will have lease documents with incompatible IDs, and the external scaler will be unable to detect any pending changes remaining to be consumed. Consequently, it will scale down your application to minReplicaCount if it is defined in the ScaledObject, or to zero instances otherwise.

Deploy KEDA and External Scaler

  1. Add and update the Helm chart repo.

    helm repo add kedacore https://kedacore.github.io/charts
    helm repo update

  2. Install the KEDA Helm chart (or follow one of the other installation methods in the KEDA documentation).

    helm install keda kedacore/keda --namespace keda --create-namespace

  3. Install the Azure Cosmos DB external scaler Helm chart.

    helm install external-scaler-azure-cosmos-db kedacore/external-scaler-azure-cosmos-db --namespace keda --create-namespace

Create ScaledObject Resource

Create a ScaledObject resource that contains the information about your application (the scale target), the external scaler service, the Cosmos DB containers, and other scaling configuration values. Check the ScaledObject specification and the External trigger specification for the properties that a ScaledObject supports and their allowed values.

You can use the file deploy/deploy-scaledobject.yaml as a template for creating the ScaledObject. The trigger metadata properties required to use the external scaler for Cosmos DB are described in the Trigger Specification section below.
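As a rough illustration of how such a resource is put together, the sketch below assembles a ScaledObject manifest as a Python dictionary and prints it as JSON. It is not a copy of deploy/deploy-scaledobject.yaml. The helper `build_scaled_object` and the resource names `cosmosdb-consumer` and `cosmosdb-consumer-scaledobject` are hypothetical, and the placeholder values in angle brackets must be replaced with your own.

```python
import json


def build_scaled_object(name, namespace, deployment, trigger_metadata):
    """Return a ScaledObject manifest with a single external trigger."""
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # The scale target: the Deployment running the listener application.
            "scaleTargetRef": {"name": deployment},
            "triggers": [
                {"type": "external", "metadata": dict(trigger_metadata)}
            ],
        },
    }


manifest = build_scaled_object(
    name="cosmosdb-consumer-scaledobject",  # hypothetical resource name
    namespace="default",                    # must match the Deployment's namespace
    deployment="cosmosdb-consumer",         # hypothetical Deployment name
    trigger_metadata={
        "scalerAddress": "external-scaler-azure-cosmos-db.keda:4050",
        "connection": "<connection>",
        "databaseId": "<database-id>",
        "containerId": "<container-id>",
        "leaseConnection": "<lease-connection>",
        "leaseDatabaseId": "<lease-database-id>",
        "leaseContainerId": "<lease-container-id>",
        "processorName": "<processor-name>",
    },
)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output could be piped to `kubectl apply -f -` once the placeholders are filled in.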

Note: If you are having trouble setting up the external scaler or the listener application, the step-by-step instructions for deploying the sample application might help.

Trigger Specification

The specification below describes the trigger metadata in the ScaledObject resource for using 'KEDA external scaler for Cosmos DB' to scale your application.

  triggers:
    - type: external
      metadata:
        scalerAddress: external-scaler-azure-cosmos-db.keda:4050 # Mandatory. Address of the external scaler service.
        connection: <connection>               # Mandatory. Connection string of Cosmos DB account with monitored container.
        databaseId: <database-id>              # Mandatory. ID of Cosmos DB database containing monitored container.
        containerId: <container-id>            # Mandatory. ID of monitored container.
        leaseConnection: <lease-connection>    # Mandatory. Connection string of Cosmos DB account with lease container.
        leaseDatabaseId: <lease-database-id>   # Mandatory. ID of Cosmos DB database containing lease container.
        leaseContainerId: <lease-container-id> # Mandatory. ID of lease container.
        processorName: <processor-name>        # Mandatory. Name of change-feed processor used by listener application.

Parameter List

  • scalerAddress - Address of the external scaler service, in the format <scaler-name>.<scaler-namespace>:<port>. If you installed the Azure Cosmos DB external scaler Helm chart in the keda namespace and did not specify custom values, the metadata value would be external-scaler-azure-cosmos-db.keda:4050.

  • connection - Connection string of the Cosmos DB account that contains the monitored container.

  • databaseId - ID of Cosmos DB database that contains the monitored container.

  • containerId - ID of the monitored container.

  • leaseConnection - Connection string of the Cosmos DB account that contains the lease container. This can be the same as or different from the value of the connection metadata.

  • leaseDatabaseId - ID of Cosmos DB database that contains the lease container. This can be the same as or different from the value of the databaseId metadata.

  • leaseContainerId - ID of the lease container that holds the leases for the change feeds.

  • processorName - Name of the change-feed processor used by the listener application. For more information, refer to the Implementing the change feed processor section.
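The parameter list above lends itself to a quick pre-flight check before the ScaledObject is applied. The following Python sketch composes the scaler address in the documented format and reports any mandatory metadata keys that are absent or empty. The helper names `scaler_address` and `missing_keys` are hypothetical and are not part of KEDA or the external scaler.

```python
# Mandatory trigger metadata keys, as listed in the trigger specification.
MANDATORY_KEYS = (
    "scalerAddress",
    "connection",
    "databaseId",
    "containerId",
    "leaseConnection",
    "leaseDatabaseId",
    "leaseContainerId",
    "processorName",
)


def scaler_address(name: str, namespace: str, port: int) -> str:
    """Compose <scaler-name>.<scaler-namespace>:<port>."""
    return f"{name}.{namespace}:{port}"


def missing_keys(metadata: dict) -> list:
    """Return the mandatory keys that are absent or empty, in listed order."""
    return [key for key in MANDATORY_KEYS if not metadata.get(key)]


# Default Helm installation in the keda namespace.
address = scaler_address("external-scaler-azure-cosmos-db", "keda", 4050)
print(address)  # prints external-scaler-azure-cosmos-db.keda:4050

# An incomplete metadata block: only the scaler address has been set.
print(missing_keys({"scalerAddress": address}))
```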

Note: Ideally, we would have created a TriggerAuthentication resource to avoid adding the connection strings in plain text to the ScaledObject trigger metadata. However, this is not possible at the moment, since triggers of external type do not support referencing a TriggerAuthentication resource (link).