/service-level-reporting

Calculate SLI/SLO metrics from ZMON's timeseries data

Primary LanguagePythonOtherNOASSERTION

ZMON source code on GitHub is no longer in active development. Zalando will no longer actively review issues or merge pull-requests.

ZMON is still being used at Zalando and serves us well for many purposes. We are now deeper into our observability journey and understand better that we need other telemetry sources and tools to elevate our understanding of the systems we operate. We support the OpenTelemetry initiative and recommended others starting their journey to begin there.

If members of the community are interested in continuing developing ZMON, consider forking it. Please review the licence before you do.

ZMON Service Level Reporting

WORK IN PROGRESS

Calculate SLI/SLO metrics using ZMON's KairosDB timeseries database.

Idea:

  • Retrieve metrics such as latencies and error counts from KairosDB
  • Aggregate metrics, weighted by requests/s
  • Push metrics truncated to full minute timestamps into PostgreSQL
  • Generate reliability reports (weekly, monthly, ..)

Server local setup

Development

Prepare the database

docker run --name slo-pg -d -p 5432:5432 postgres:9.5
echo 'CREATE DATABASE slr' | psql -h localhost -U postgres
export DATABASE_URI=postgresql://postgres@localhost/slr
export KAIROSDB_URL=https://kairosdb.example.org

Run migration

export FLASK_APP=app/main.py
export SLR_LOCAL_ENV=true
pip3 install -r requirements.txt
flask db upgrade -d app/migrations/

Run the server

python3 -m app

Configuration parameters:

OAUTH2_ACCESS_TOKENS_URL
Token endpoint URL.
CREDENTIALS_DIR
Folder with OAuth application credentials (client.json and user.json).
DATABASE_URI
PostgreSQL database connection string.
KAIROSDB_URL
KairosDB base URL.

Docker compose

You can deploy a server environment with docker-compose

$ docker-compose up

Generating Reports

You will need to install gnuplot as a system dependency. Running the following command will generate a report for the specified project in output directory. You will need zmon-slr CLI to be installed (next section)

$ zmon-slr report create myproduct

Command Line Interface

You can interact with API service using CLI tool zmon-slr.

Examples:

$ python setup.py install

$ zmon-slr -h

Usage: zmon-slr [OPTIONS] COMMAND [ARGS]...

  Service Level Reporting command line interface

Options:
  -h, --help  Show this message and exit.

Commands:
  configure  Configure CLI
  group      SLR product groups
  product    SLR products
  sli        Service level indicators
  slo        Service level objectives
  target     Service level objectives Targets

$ zmon-slr group create "Monitoring Inc." "Tech Infrastructure"
Creating product_group: Monitoring Inc.
{
  "created": "2017-06-19T12:31:44.665459Z",
  "department": "Tech Infrastructure",
  "updated": "2017-06-19T12:31:44.665473Z",
  "slug": "monitoring-inc",
  "name": "Monitoring Inc.",
  "uri": "http://localhost:8080/api/product-groups/1",
  "username": "username"
}
 OK

$ zmon-slr group list
[
  {
    "created": "2017-06-19T12:31:44.665459Z",
    "department": "Tech Infrastructure",
    "updated": "2017-06-19T12:31:44.665473Z",
    "slug": "monitoring-inc",
    "name": "Monitoring Inc.",
    "uri": "http://localhost:8080/api/product-groups/1",
    "username": "username"
  }
]

$ zmon-slr product create ZMON monitoring-inc
Creating product: ZMON
{
  "product_reports_uri": "http://localhost:8080/api/products/1/reports",
  "product_reports_weekly_uri": "http://localhost:8080/api/products/1/reports/weekly",
  "username": "username",
  "slug": "zmon",
  "product_slo_uri": "http://localhost:8080/api/products/1/slo",
  "updated": "2017-06-19T12:34:51.818225Z",
  "product_group_uri": "http://localhost:8080/api/product-groups/1",
  "product_group_name": "Monitoring Inc.",
  "name": "ZMON",
  "product_sli_uri": "http://localhost:8080/api/products/1/sli",
  "uri": "http://localhost:8080/api/products/1",
  "created": "2017-06-19T12:34:51.818210Z"
}
 OK

$ zmon-slr product delete zmon
Deleting product: zmon
 OK

$ zmon-slr group delete monitoring-inc
Deleting product_group: monitoring-inc
 OK