SusQL is a Kubernetes operator that aggregates energy and estimated carbon dioxide emission data for pods tagged with SusQL-specific labels. The energy measurements are taken from Kepler, which should be deployed in the cluster before using SusQL. Watch a video with a demonstration by clicking on the image below.
SusQL is an operator that can be deployed in a Kubernetes/OpenShift cluster. You can use kind or minikube to get a local cluster for testing, or run against a remote cluster.
By default, SusQL calculates carbon dioxide emissions in grams of CO2 using a carbon intensity value from the US EPA.
Details on configuring the CO2 emission calculation are available in the SusQL carbon calculation documentation.
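As a rough illustration of the kind of conversion involved, the sketch below turns an energy reading in joules into grams of CO2. The function name and the carbon intensity figure are hypothetical placeholders chosen for this example; they are not SusQL's defaults, which are described in the carbon calculation documentation.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules

# Hypothetical carbon intensity in g CO2 per kWh, for illustration only.
EXAMPLE_INTENSITY_G_PER_KWH = 400.0


def co2_grams(energy_joules: float, intensity_g_per_kwh: float) -> float:
    """Estimate emissions as energy (converted to kWh) times carbon intensity."""
    if energy_joules < 0 or intensity_g_per_kwh < 0:
        raise ValueError("energy and carbon intensity must be non-negative")
    return energy_joules / JOULES_PER_KWH * intensity_g_per_kwh


if __name__ == "__main__":
    # 7.2 MJ corresponds to 2 kWh of energy.
    print(co2_grams(7.2e6, EXAMPLE_INTENSITY_G_PER_KWH))
```

The only fixed quantity here is the joule-to-kWh conversion; the intensity depends on the grid region and data source.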
Kepler is assumed to be installed in the cluster.
- Follow these instructions for easy SusQL installation from the Red Hat Community Operator catalog on an OpenShift cluster.
- Follow these instructions to install the SusQL Operator from OperatorHub.io on a Kubernetes cluster, including OpenShift.
- Follow these instructions to install the SusQL Operator with Helm on a Kubernetes cluster, including OpenShift.
To begin using SusQL, define a LabelGroup. It specifies the set of labels that the controller uses to identify pods that belong to the same energy aggregation. An example of a LabelGroup could be:
    apiVersion: susql.ibm.com/v1
    kind: LabelGroup
    metadata:
      name: labelgroup-name
      namespace: default
    spec:
      labels:
        - my-label-1
        - my-label-2
A pod that belongs to this energy aggregation specifies the LabelGroup labels as:
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-name
      labels:
        susql.label/1: my-label-1
        susql.label/2: my-label-2
    spec:
      containers:
        - name: container
          image: ubuntu
          command: ["sleep"]
          args: ["infinity"]
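The two manifests above suggest that the N-th entry in the LabelGroup's label list corresponds to the pod label key `susql.label/N`. A minimal sketch of that mapping follows, assuming the pattern holds for any number of labels (the source only shows two); `susql_pod_labels` is a hypothetical helper name, not part of SusQL.

```python
def susql_pod_labels(group_labels: list[str]) -> dict[str, str]:
    """Map the N-th LabelGroup label (1-based) to the pod label key susql.label/N.

    Assumption: the positional mapping is inferred from the example manifests.
    """
    return {
        f"susql.label/{index}": value
        for index, value in enumerate(group_labels, start=1)
    }


if __name__ == "__main__":
    # Prints the label dictionary to place under metadata.labels of the pod.
    print(susql_pod_labels(["my-label-1", "my-label-2"]))
```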
The energy of the group of pods is exposed in two ways:
- Through Prometheus at http://prometheus-susql.openshift-kepler-operator.svc.cluster.local:9090 using the query susql_total_energy_joules{susql_label_1="my-label-1",susql_label_2="my-label-2"}
- From the status of the LabelGroup CRD, given as labelgroup.status.totalEnergy
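The Prometheus route can be sketched in Python using the standard Prometheus HTTP API endpoint `/api/v1/query`. The helper names below are hypothetical, the mapping of label positions to `susql_label_N` is inferred from the query shown above, and the service URL is only expected to resolve from inside the cluster.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Service address taken from the text above; reachable only in-cluster.
PROMETHEUS_URL = "http://prometheus-susql.openshift-kepler-operator.svc.cluster.local:9090"


def build_query(group_labels):
    """Build the PromQL selector; label values must be quoted strings."""
    matchers = ",".join(
        f'susql_label_{i}="{value}"' for i, value in enumerate(group_labels, start=1)
    )
    return f"susql_total_energy_joules{{{matchers}}}"


def query_url(base_url, promql):
    """Return the instant-query URL with the PromQL expression URL-encoded."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def total_energy_joules(response):
    """Extract the first sample value from a Prometheus instant-query response."""
    if response.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {response}")
    results = response["data"]["result"]
    if not results:
        return None  # no series matched the selector
    # Each sample is [unix_timestamp, "value-as-string"].
    return float(results[0]["value"][1])


def fetch_total_energy(group_labels, base_url=PROMETHEUS_URL, timeout=10.0):
    """Query Prometheus over HTTP and return the aggregated energy in joules."""
    url = query_url(base_url, build_query(group_labels))
    with urlopen(url, timeout=timeout) as resp:
        return total_energy_joules(json.load(resp))
```

Calling `fetch_total_energy(["my-label-1", "my-label-2"])` from a pod inside the cluster would issue the query for the example LabelGroup; from outside the cluster, a port-forward or route to the Prometheus service would be needed first.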
- A step-by-step explanation of how to aggregate a GPU-based Jupyter Notebook workload on OpenShift AI.
Copyright 2023, 2024.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.