apache/superset

[SIP-149] Proposal for Kubernetes Operator for Apache Superset

villebro opened this issue · 10 comments

Motivation

Apache Superset's Helm chart [1] [2] is widely used and receives regular contributions, reflecting the popularity of Kubernetes-based deployments within the community. However, Helm's reliance on static templates, duplicated code, lack of a built-in testing framework, and limited support for advanced lifecycle management make the chart opaque and error-prone to maintain, and expose large-scale deployments that rely on it to significant downtime risk.

This proposal introduces a Kubernetes operator [3], offering a Kubernetes-native approach to managing Superset deployments. The operator will provide similar configuration options to the Helm chart, while addressing its limitations and introducing features like better testing, observability and automation. This proposal aligns with the approach taken by other Apache projects, such as Apache Flink [4] [5] and Apache Druid [6] [7], whose communities have embraced operators to manage their deployments more effectively.

Proposed Change

The operator will introduce a Custom Resource Definition (CRD) [8] for managing Superset deployments declaratively. Key features include:

  1. Helm-Aligned Configuration: A configuration model similar to the Helm chart, exposing commonly needed configuration options.
  2. Enhanced Observability: Built-in support for metrics collection, making it easier to monitor key operator-related metrics (reconciliation successes and failures, durations, etc.).
  3. Improved Lifecycle Management: Laying the groundwork for advanced features like staged upgrades, rollbacks, and downgrades, which are currently not possible using the Helm chart.
  4. Enhanced Testing: The operator will leverage the Operator SDK [9] testing framework, making it easier to validate bug fixes and improvements while ensuring greater reliability and maintainability over time.
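
To make the observability item more concrete, here is a minimal sketch of recording reconciliation outcomes and durations. It uses only the Go standard library; `ReconcileMetrics`, `Observe`, and `errApplyFailed` are hypothetical names for illustration, and a real operator would more likely export such values through a Prometheus client rather than an in-memory struct.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errApplyFailed is a hypothetical error used only to simulate a failed reconcile.
var errApplyFailed = errors.New("apply failed")

// ReconcileMetrics is a hypothetical in-memory stand-in for exported metrics.
// It is not safe for concurrent use; a real operator would need synchronization.
type ReconcileMetrics struct {
	Successes int           // reconcile attempts that returned nil
	Failures  int           // reconcile attempts that returned an error
	Total     time.Duration // cumulative time spent reconciling
}

// Observe runs one reconcile attempt and records its outcome and duration.
func (m *ReconcileMetrics) Observe(reconcile func() error) error {
	start := time.Now()
	err := reconcile()
	m.Total += time.Since(start)
	if err != nil {
		m.Failures++
	} else {
		m.Successes++
	}
	return err
}

func main() {
	m := &ReconcileMetrics{}
	_ = m.Observe(func() error { return nil })
	_ = m.Observe(func() error { return errApplyFailed })
	fmt.Printf("successes=%d failures=%d total=%s\n", m.Successes, m.Failures, m.Total)
}
```

Wrapping the reconcile call, rather than instrumenting each code path inside it, keeps the counters accurate even when new failure paths are added later.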

New or Changed Public Interfaces

  1. Kubernetes CRD: A Superset CRD for declarative configuration. This structure will be similar to the values.yaml in the current Helm chart.
  2. Operator Image: Docker image for the operator, built using the Go-based Operator SDK.
  3. Deployment Artifacts: YAML manifests for deploying the operator, with optional OLM support.
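
As a rough illustration of what a declarative spec with defaults and validation could look like, here is a sketch in plain Go. The `SupersetSpec` type, its fields, and the default values are assumptions made for this example; they are not taken from the proposal or from the Helm chart's values.yaml, and an actual CRD would express defaults and validation in its schema.

```go
package main

import (
	"errors"
	"fmt"
)

// SupersetSpec is a hypothetical desired-state spec. Field names are
// illustrative only and do not reflect the proposed CRD.
type SupersetSpec struct {
	Image    string // container image for the Superset web server
	Replicas int    // desired number of web server pods
	Workers  int    // desired number of worker pods
}

// ApplyDefaults fills unset fields, mimicking CRD defaulting.
// Treating 0 as "unset" is a simplification; a real API type would
// typically use a pointer to tell "unset" apart from an explicit zero.
func (s *SupersetSpec) ApplyDefaults() {
	if s.Image == "" {
		s.Image = "apache/superset:latest" // assumed default for illustration
	}
	if s.Replicas == 0 {
		s.Replicas = 1
	}
}

// Validate rejects specs that could never be reconciled.
func (s SupersetSpec) Validate() error {
	if s.Replicas < 0 {
		return errors.New("replicas must not be negative")
	}
	if s.Workers < 0 {
		return errors.New("workers must not be negative")
	}
	return nil
}

func main() {
	spec := SupersetSpec{Workers: 2}
	spec.ApplyDefaults()
	if err := spec.Validate(); err != nil {
		fmt.Println("invalid spec:", err)
		return
	}
	fmt.Printf("%+v\n", spec)
}
```

Rejecting an invalid resource at admission time is the main difference from a values.yaml file, where a bad value typically surfaces only when the rendered manifests are applied.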

Figure 1. A Superset deployment based on the current Helm chart, where Helm renders manifests based on the values.yaml file and Helm chart, and applies them to the target namespace.

Figure 2. Diagram depicting the proposed operator-based flow, where the operator is deployed in its own namespace, and continuously reconciles the desired state in the custom Superset resources. The CRD ensures that the Superset manifests are valid and applies defaults as needed.
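
The reconciliation flow in Figure 2 can be sketched as a comparison between desired and observed state that repeats until nothing is left to change. This is a simplified model using only the Go standard library, not the Operator SDK API; the `State` type, the `Reconcile` function, the action strings, and the image tags are all hypothetical.

```go
package main

import "fmt"

// State is a hypothetical, simplified view of a Superset deployment.
type State struct {
	Image    string
	Replicas int
}

// Reconcile compares desired and observed state and returns the actions
// needed to converge. A nil observed state means nothing is deployed yet.
func Reconcile(desired State, observed *State) []string {
	if observed == nil {
		return []string{"create deployment"}
	}
	var actions []string
	if observed.Image != desired.Image {
		actions = append(actions, "update image")
	}
	if observed.Replicas != desired.Replicas {
		actions = append(actions, "scale replicas")
	}
	return actions
}

func main() {
	desired := State{Image: "superset:v2", Replicas: 3}
	var observed *State // nothing deployed yet

	for {
		actions := Reconcile(desired, observed)
		if len(actions) == 0 {
			break // observed state matches desired state
		}
		fmt.Println("actions:", actions)
		// Pretend the cluster applied every action and now matches desired.
		current := desired
		observed = &current
	}
	fmt.Println("converged")
}
```

Because each pass starts from the observed state rather than from a record of past actions, the same function handles first-time creation, upgrades, and manual drift alike.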

New dependencies

The operator will rely on the Go-based Operator SDK [10] for its implementation and testing framework. Beyond this, it will share the same core dependencies as the existing Helm chart, such as Kubernetes APIs and configurations, but without requiring Helm as a dependency.

Migration Plan and Compatibility

Migrating from the Helm chart to the operator will be straightforward, as the operator’s CRD will closely align with the structure of the current values.yaml used in the Helm chart. Additionally, the resources created by the operator will closely mimic those generated by the Helm chart, ensuring consistency and familiarity. Administrators already familiar with managing Superset via Helm will find the transition intuitive.

Benefits

  1. Kubernetes-Native Management: A clean CRD and continuous reconciliation provide a more natural Kubernetes experience.
  2. Dynamic Lifecycle Features: The operator lays the foundation for advanced features like staged upgrades and automated recovery. These are difficult to achieve using the current Helm-based approach.
  3. Enhanced Observability: Prometheus-compatible metrics make it easy to monitor the operator and Superset deployments.
  4. Improved Testing: The Operator SDK enables comprehensive testing, from full integration tests to lightweight unit tests, improving reliability.
  5. Helm Independence: Users can deploy Superset without relying on Helm.

Proposed Operator Scope and Deprecation of Helm Chart

We propose deprecating the Helm chart once the Kubernetes operator is deemed stable to avoid the burden of maintaining both. The operator will also exclude reconciliation support for PostgreSQL and Redis. Users can continue using Helm for these services or adopt dedicated operators [11] [12], ensuring a more focused approach for managing Superset.

Rejected Alternatives

  1. Enhancing the Helm Chart: Helm is limited in its ability to support advanced lifecycle features, testing, dynamic reconciliation, and observability.
  2. Standalone Scripts: Scripts lack maintainability and alignment with Kubernetes-native workflows.
  3. Existing Operators: No existing open-source operator provides a clean CRD or aligns with the configuration of Superset’s Helm chart.