Service Scaler

Introducing "Service Scaler", a Kubernetes operator that proactively monitors and controls the HPA object of a corresponding deployment, enabling gradual scaling of workloads based on a time-based configuration.

The Configuration (CRD)

“Time-based” scaling is controlled by a custom configuration which looks like:

    apiVersion: scaler.udaan.io/v1
    kind: ServiceScaler
    metadata:
      name: dummy-acorn-service
      namespace: prod
    spec:
      hpa:
        maxReplicas: 8
        minReplicas: 4
        targetCPUUtilization: 50
        targetMemoryUtilization: 75
      timeRangeSpec:
      - kind: ZonedTime
        from: 16:00+05:30
        to: 00:00+05:30
        replicaSpec:
          hpa:
            minReplicas: 3
            targetMemoryUtilization: 0
      - kind: ZonedTime
        from: 00:00+05:30
        to: 08:00+05:30
        replicaSpec:
          hpa:
            minReplicas: 2
            targetMemoryUtilization: 0

What does the above configuration mean?
- Between 16:00 IST and 00:00 IST, minReplicas is overridden to 3 and targetMemoryUtilization is removed.
- Between 00:00 IST and 08:00 IST, minReplicas is overridden to 2 and targetMemoryUtilization is removed.
- Defaults under hpa: are applied if no time range matches.
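
The override behaviour described above can be pictured as a merge of a partial override onto the defaults. The following is a minimal sketch, not the operator's actual code: the type names HpaSpec and HpaOverride and the function apply_override are hypothetical, chosen only to mirror the keys in the configuration.

```rust
/// HPA settings; fields mirror the keys under `hpa:` in the configuration.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, PartialEq)]
struct HpaSpec {
    min_replicas: u32,
    max_replicas: u32,
    target_cpu_utilization: u32,    // 0 means no CPU-based scaling
    target_memory_utilization: u32, // 0 means no memory-based scaling
}

/// Partial override from a `replicaSpec.hpa` block.
/// `None` means "keep the default for this field".
#[derive(Debug, Clone, Default)]
struct HpaOverride {
    min_replicas: Option<u32>,
    max_replicas: Option<u32>,
    target_cpu_utilization: Option<u32>,
    target_memory_utilization: Option<u32>,
}

/// Returns the effective settings: each overridden field wins,
/// every other field falls back to the default.
fn apply_override(defaults: &HpaSpec, o: &HpaOverride) -> HpaSpec {
    HpaSpec {
        min_replicas: o.min_replicas.unwrap_or(defaults.min_replicas),
        max_replicas: o.max_replicas.unwrap_or(defaults.max_replicas),
        target_cpu_utilization: o
            .target_cpu_utilization
            .unwrap_or(defaults.target_cpu_utilization),
        target_memory_utilization: o
            .target_memory_utilization
            .unwrap_or(defaults.target_memory_utilization),
    }
}
```

With the defaults from the configuration (4, 8, 50, 75) and the 16:00 to 00:00 override (minReplicas 3, targetMemoryUtilization 0), this sketch yields minReplicas 3, maxReplicas 8, CPU target 50 and no memory target.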

The Control knobs

hpa parameters
- minReplicas
- maxReplicas
- targetCPUUtilization (0 removes CPU-based scaling)
- targetMemoryUtilization (0 removes memory-based scaling)

Defaults are specified under the hpa: section.

Overrides are specified under timeRangeSpec: and may set any of the above parameters; they are applied during the specified time range.

Time range formats for from: and to:
- ZonedTime: HH:MM<tz-offset> Ex: 08:00+05:30
- ZonedDateTime: RFC 3339 format Ex: 2023-01-11T08:00:00+05:30

Defaults are applied when no time range matches.
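
To illustrate how a ZonedTime range could be matched, here is a minimal standard-library sketch. It is not taken from the operator: the function names are hypothetical, and it assumes that from: is inclusive, to: is exclusive, and that a range whose end is earlier than its start wraps past midnight.

```rust
/// Parses a strict "HH:MM" string into minutes since midnight.
fn parse_hh_mm(s: &str) -> Option<i32> {
    let (h, m) = s.split_once(':')?;
    let two_digits = |p: &str| p.len() == 2 && p.bytes().all(|b| b.is_ascii_digit());
    if !two_digits(h) || !two_digits(m) {
        return None;
    }
    let (h, m): (i32, i32) = (h.parse().ok()?, m.parse().ok()?);
    if h < 24 && m < 60 {
        Some(h * 60 + m)
    } else {
        None
    }
}

/// Parses "HH:MM+HH:MM" or "HH:MM-HH:MM" (for example "08:00+05:30")
/// into minutes since midnight UTC, in the range 0..1440.
fn parse_zoned_time(s: &str) -> Option<u32> {
    if !s.is_ascii() || s.len() != 11 {
        return None;
    }
    let (time, offset) = s.split_at(5);
    let sign = match offset.as_bytes()[0] {
        b'+' => 1,
        b'-' => -1,
        _ => return None,
    };
    let local = parse_hh_mm(time)?;
    let off = parse_hh_mm(&offset[1..])?;
    // Local time minus the offset gives UTC; wrap into a single day.
    Some((local - sign * off).rem_euclid(1440) as u32)
}

/// Half-open range check on minutes of the day.
/// Assumption: from is inclusive, to is exclusive, and from > to wraps midnight.
fn in_range(now: u32, from: u32, to: u32) -> bool {
    if from <= to {
        from <= now && now < to
    } else {
        now >= from || now < to
    }
}
```

Under these assumptions, 00:00+05:30 to 08:00+05:30 becomes the UTC range 18:30 to 02:30, which wraps past midnight, so 19:00 UTC matches and 03:00 UTC does not.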

The Kill Switch

For those rare instances when things might not go as planned, a kill switch has been crafted. By adding a simple annotation to the HPA, the Service Scaler can be bypassed, putting control back in the hands of the user.

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    service-scaler.kubernetes.io/managed: "false" # <-- THIS LINE
  name: dummy-acorn-service
  namespace: prod
spec:
  maxReplicas: 8
  minReplicas: 4
  scaleTargetRef:
    apiVersion: apps/v2beta2
    kind: Deployment
    name: dummy-acorn-service
  targetCPUUtilizationPercentage: 50

Once the above annotation is added, time-based scaling is disabled for dummy-acorn-service, and users are expected to set the HPA parameters of their choice manually.
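
A check for this annotation might look like the sketch below. It is illustrative only: the function is_managed is hypothetical, and it assumes that only the exact value "false" disables management while a missing annotation leaves the HPA managed. The annotation map is modelled as an optional BTreeMap<String, String>.

```rust
use std::collections::BTreeMap;

/// Annotation key shown in the HPA manifest above.
const MANAGED_ANNOTATION: &str = "service-scaler.kubernetes.io/managed";

/// Returns false only when the kill-switch annotation is present
/// with the exact value "false" (assumption of this sketch).
fn is_managed(annotations: Option<&BTreeMap<String, String>>) -> bool {
    annotations
        .and_then(|a| a.get(MANAGED_ANNOTATION))
        .map_or(true, |v| v.as_str() != "false")
}
```

A reconcile loop could call such a check first and skip any HPA for which it returns false, leaving the object exactly as the user set it.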

The “status” sub resource

The status block of the service scaler object shows the following:

- What was the last active configuration of the scaler object?
- When was the scaler object last updated?
- Is there a time range spec match (considering the current timestamp)?

status:
  lastKnownConfig:
    maxReplicas: 8
    minReplicas: 4
    targetCPUUtilization: 50
    targetMemoryUtilization: 75
  lastObservedGeneration: 1
  lastUpdatedTime: 2024-01-19T11:40Z+0530
  timeRangeMatch: false

Installation

Have a Kubernetes cluster up and running.

Install the CRD

kubectl --context=<context> create -f servicescaler.scaler.udaan.io.yaml

Ensure that RBAC is set up (refer to the rbac template).
Build using cargo build.
Run using RUST_LOG=info cargo run.
The LABEL_SELECTOR environment variable can be used to watch only a subset of HPAs.

Example

After installing the CRD and running the operator, to see the service scaler in action, let's create a sample deployment called dummy-bee-service with a service scaler object with the following specification:

default - 3 replicas
16:00 - 00:00 - 2 replicas
00:00 - 08:00 - 1 replica

Apply the example and check whether the replicas of dummy-bee-service follow the overrides.

kubectl --context=<context> apply -f example.yaml

Points to note

Do not specify overlapping time ranges, as this results in undefined behaviour.
Refer to the architecture diagram to understand the mechanics of the operator.
Battle-tested on Kubernetes 1.16 and 1.22.
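
Since overlapping time ranges lead to undefined behaviour, a configuration could be checked before it is applied. The sketch below is one possible check and is not part of the operator: the function names are hypothetical, ranges are given as (from, to) pairs in minutes of the day (0..1440), from is assumed inclusive and to exclusive, and a range with from greater than to is assumed to wrap past midnight.

```rust
/// Splits a possibly midnight-wrapping range into non-wrapping,
/// half-open segments. A range with from == to is treated as empty.
fn segments(from: u32, to: u32) -> Vec<(u32, u32)> {
    if from < to {
        vec![(from, to)]
    } else if from > to {
        vec![(from, 1440), (0, to)]
    } else {
        vec![]
    }
}

/// True when the two ranges share at least one minute.
fn overlaps(a: (u32, u32), b: (u32, u32)) -> bool {
    segments(a.0, a.1).iter().any(|&(s1, e1)| {
        segments(b.0, b.1)
            .iter()
            .any(|&(s2, e2)| s1 < e2 && s2 < e1)
    })
}

/// Returns the indices of the first overlapping pair, if any.
fn find_overlap(ranges: &[(u32, u32)]) -> Option<(usize, usize)> {
    for i in 0..ranges.len() {
        for j in (i + 1)..ranges.len() {
            if overlaps(ranges[i], ranges[j]) {
                return Some((i, j));
            }
        }
    }
    None
}
```

Two ranges that merely touch, such as one ending at 00:00 and the next starting at 00:00, are not reported as overlapping under the half-open assumption.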

For newer Kubernetes clusters (Ex: 1.30)

Pin the following versions for kube and k8s-openapi:

kube = { version = "0.93.1", default-features = true, features = ["derive", "runtime", "config"]}
k8s-openapi = { version = "0.22.0", features = ["latest"]}

Migrate from autoscaling/v2beta2 to autoscaling/v2.
Migrate from apps/v2beta2 to apps/v1.

Deployment Strategy (k8s)

Build the docker image.
Push the image to a container registry.
Set up a service account with the corresponding role binding objects and the required permissions.
Create a deployment object with the pushed image.

Future Work

Helmify the operator for easier deployment.
Capability to "hibernate" services.

udaan-com/service-scaler