
Grafana panel plugin to visualize status of multiple objects over time


Statusmap panel for Grafana


Panel to show discrete statuses of multiple targets over time.

Statusmap sample panel

Statusmap sample panel with dark theme

Run docker compose up and open http://localhost:3000 in a browser to see a simple demo.

Features

  • Grouping values into rows and buckets using legend from query
  • User defined color mapping
  • Multiple values in bucket are displayed via tooltip
  • Configurable tooltip items
  • Pagination for rows
  • Increasing rows/buckets' interval for better visual representation
  • Representing null values as empty bucket or zero value

Supported environment

Motivation

We had a desperate need to visualize a set of timeseries statuses over a time period, so that we could see the history of changes in objects' statuses. Since we maintain a lot of Kubernetes clusters (and related infrastructure), our main use cases are visualizing the health states of servers and Kubernetes pods, as well as HTTP service health checks. We tried a variety of the available Grafana plugins (they are listed in Acknowledgements below), but none of them provided the features and visualization we were looking for.

NB: You can find more details about our journey of creating the plugin in this post.

The objects visualized with this plugin can be of any kind: not only IT components (e.g. server hosts and Kubernetes pods) but anything you can imagine, such as the coffee makers in the picture above. These objects should have discrete statuses taken from a set of predefined values, e.g. ok = 0, off = 1, fail = 2, etc.

Configuration

Datasource notes

To produce neat graphs, your datasource should return suitably aggregated data. The plugin adjusts the $__interval variable depending on the bucket width set in the panel options, so your queries should aggregate statuses over $__interval.

To make multiple values mode work as expected, you should define multiple queries: one query for each possible status.

For now, the plugin doesn't aggregate data over time; it only renders the input data as buckets. Because of this, the data should contain a point for each timestamp in the time range, and the timestamps should be identical for every target (y-axis label). This limitation is addressed by issue #53.
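A quick way to check the equal-timestamps requirement is to compare the timestamps of every series against the first one. The following is a minimal TypeScript sketch; the Series shape and the findMisalignedTargets helper are hypothetical names used for illustration and are not part of the plugin.

```typescript
// Hypothetical shape of one query result: a target label plus its points.
interface Series {
  target: string;
  points: Array<[number, number | null]>; // [timestampMs, statusValue]
}

// Returns the targets whose timestamps differ from those of the first series.
function findMisalignedTargets(series: Series[]): string[] {
  if (series.length === 0) {
    return [];
  }
  const reference = series[0].points.map(([ts]) => ts).join(",");
  return series
    .filter((s) => s.points.map(([ts]) => ts).join(",") !== reference)
    .map((s) => s.target);
}

const aligned: Series[] = [
  { target: "maker-1", points: [[0, 0], [60000, 1]] },
  { target: "maker-2", points: [[0, 0], [60000, null]] },
];
const shifted: Series[] = aligned.concat([
  { target: "maker-3", points: [[0, 0], [90000, 2]] },
]);

console.log(JSON.stringify(findMisalignedTargets(aligned))); // []
console.log(JSON.stringify(findMisalignedTargets(shifted))); // ["maker-3"]
```

A null value still counts as a point here: only the timestamps have to match, not the values.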

Prometheus

To work with data from Prometheus, you will need to set up discrete statuses for your objects. The requirements for storing these statuses in metrics are as follows:

  • metrics should have two values: 0 and 1;
  • there should be a label with status' value.

When it's done, you can collect all the data via query, e.g.:

(max_over_time(coffee_maker_status{status="<STATUS_VALUE>"}[$__interval]) == 1) * <STATUS_VALUE>

If there was no such status (<STATUS_VALUE>) during the query's interval, Prometheus returns nothing. Otherwise, the status value is returned.
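Since one query is needed per status value, the expressions can be generated from a list of statuses. The sketch below is only an illustration of the query pattern above; buildStatusQueries is a hypothetical helper, not a plugin or Grafana API, and the resulting strings still have to be entered as separate panel queries.

```typescript
// Builds one PromQL expression per status value, following the query pattern above.
// `metric` and `statuses` are illustrative inputs, not plugin options.
function buildStatusQueries(metric: string, statuses: number[]): string[] {
  return statuses.map(
    (status) =>
      `(max_over_time(${metric}{status="${status}"}[$__interval]) == 1) * ${status}`
  );
}

// Prints three expressions, one per line, for statuses 0, 1 and 2.
console.log(buildStatusQueries("coffee_maker_status", [0, 1, 2]).join("\n"));
```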

For example, if you have 5 types of statuses and a metric (coffee_maker_status) with 5 allowed values (0, 1, 2, 3, 4), you should transform this metric using the following recording rule:

- record: coffee_maker_status:discrete
  expr: |
    count_values("status", coffee_maker_status)

This is how a coffee_maker_status metric with value 3 is transformed into the new metric:

coffee_maker_status:discrete{status="3"} 1

Now that Prometheus has 0 and 1 values for each status, all these metrics can be aggregated, so you will get all available statuses of your objects over time.
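To see what the transformation produces, the following TypeScript sketch mimics count_values for the samples of a single aggregation group. The Sample shape and the countValues function are illustrative names only; they are not Prometheus or plugin APIs. Keep in mind that in PromQL count_values counts how many series in a group share a value, so a result of exactly 1 per object assumes one series per group.

```typescript
// Illustrative model of a labeled sample; not a Prometheus client type.
interface Sample {
  labels: { [name: string]: string };
  value: number;
}

// Mimics count_values(labelName, <values>) within one aggregation group:
// one output sample per distinct input value, labeled with that value.
function countValues(labelName: string, values: number[]): Sample[] {
  const counts: { [value: string]: number } = {};
  for (let i = 0; i < values.length; i++) {
    const key = String(values[i]);
    counts[key] = (counts[key] || 0) + 1;
  }
  return Object.keys(counts).map((key) => {
    const labels: { [name: string]: string } = {};
    labels[labelName] = key;
    return { labels: labels, value: counts[key] };
  });
}

// A single coffee maker reporting status 3 yields {status="3"} 1.
console.log(JSON.stringify(countValues("status", [3])));
// [{"labels":{"status":"3"},"value":1}]
```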

InfluxDB

Choose 'Time series' for 'Format as' and use GROUP BY time($__interval) in the query. $tag_<tag name> can be used in 'Alias by' to define y-axis labels.

MySQL

An example query with aggregation over $__interval looks like this (you need one query for each possible status value):

SELECT
  $__timeGroupAlias(date_insert,$__interval),
  name AS metric,
  min(statusi) AS "statusi"
FROM coffee_makers
WHERE
  $__timeFilter(date_insert) AND statusi=1
GROUP BY 1,2
ORDER BY $__timeGroup(date_insert,$__interval)

The metric column is used as the y-axis label.

Panel

First of all, create an individual query for each possible status value. All queries should use the same legend format so that their values are grouped into the same rows:

Query setup

Then define the color mapping for status values in Discrete color mode:

Color mapping

You can use presets to define traffic light colors or 8 colors from the Solarized palette:

Color mapping empty

Color mapping traffic lights

Note: Spectrum and Opacity color modes function the same way they do in the Heatmap plugin.

More options

Bucket

Bucket options

The Multiple values checkbox specifies how several values in one bucket are displayed:

  • If it's off, multiple values in one bucket are treated as an error;
  • If it's on, the color of such a bucket is determined by the value with the lowest index in the color mapping.
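The two modes can be sketched as a small function. This is an illustration under stated assumptions, not the plugin's code: ColorRule, BucketColor and resolveBucketColor are hypothetical names, and treating duplicate values as a single value is an assumption.

```typescript
// Hypothetical color mapping entry: the order of entries defines the priority.
interface ColorRule {
  value: number;
  color: string;
}

interface BucketColor {
  color: string | null; // null when no mapping entry matches
  error: boolean;       // true when several values arrive but the mode is off
}

function resolveBucketColor(
  values: number[],
  mapping: ColorRule[],
  multipleValues: boolean
): BucketColor {
  // Assumption: duplicates collapse, so [1, 1] counts as a single value.
  const distinct = values.filter((v, i) => values.indexOf(v) === i);
  if (distinct.length > 1 && !multipleValues) {
    return { color: null, error: true };
  }
  // The first mapping entry (lowest index) present in the bucket wins.
  for (let i = 0; i < mapping.length; i++) {
    if (distinct.indexOf(mapping[i].value) !== -1) {
      return { color: mapping[i].color, error: false };
    }
  }
  return { color: null, error: false };
}

const trafficLights: ColorRule[] = [
  { value: 2, color: "red" },
  { value: 1, color: "yellow" },
  { value: 0, color: "green" },
];

console.log(resolveBucketColor([0, 2], trafficLights, true).color);  // red
console.log(resolveBucketColor([0, 2], trafficLights, false).error); // true
```

Putting the most severe status first in the color mapping makes it win whenever a bucket holds several values.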

Color mapping

Display nulls specifies how null values are shown: either as empty buckets or with the color of the 0 value.

Color mapping

Min width and spacing are used to specify minimal bucket width and spacing between buckets. Rounding may be used to round edges.

Min width, spacing, rounding 1

Min width, spacing, rounding 2

Set Values index to a positive number to display only the values from the specified timeseries.

Display

Display options

Show legend checkbox toggles legend at the bottom of the panel.

Rows sort can be used to sort labels on the Y axis. Metrics keeps y labels in the order they are defined on the Metrics tab. a→z and z→a sort labels in ascending or descending natural order.
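The three sort modes can be illustrated with a short TypeScript sketch. The mode identifiers and sortRowLabels are hypothetical, and reading "natural order" as numeric-aware comparison (so that maker2 comes before maker10) is an assumption about the intended behavior.

```typescript
// Hypothetical identifiers for the three Rows sort choices.
type RowsSort = "metrics" | "az" | "za";

function sortRowLabels(labels: string[], mode: RowsSort): string[] {
  const sorted = labels.slice(); // copy, so the query order is left untouched
  if (mode === "metrics") {
    return sorted;
  }
  // numeric: true makes "maker2" sort before "maker10".
  sorted.sort((a, b) => a.localeCompare(b, undefined, { numeric: true }));
  return mode === "za" ? sorted.reverse() : sorted;
}

const rowLabels = ["maker10", "maker2", "maker1"];
console.log(JSON.stringify(sortRowLabels(rowLabels, "metrics"))); // ["maker10","maker2","maker1"]
console.log(JSON.stringify(sortRowLabels(rowLabels, "az")));      // ["maker1","maker2","maker10"]
console.log(JSON.stringify(sortRowLabels(rowLabels, "za")));      // ["maker10","maker2","maker1"]
```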

Pagination

Pagination controls

Enable pagination toggles pagination controls on graph.

Rows per page sets the number of rows to display on the graph.
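Conceptually, pagination slices the list of rows by page. A minimal sketch, assuming zero-based page numbers; rowsForPage and pageCount are hypothetical helpers, not plugin functions.

```typescript
// Returns the rows visible on a zero-based page.
function rowsForPage<T>(rows: T[], rowsPerPage: number, page: number): T[] {
  const start = page * rowsPerPage;
  return rows.slice(start, start + rowsPerPage);
}

// Number of pages needed to show every row.
function pageCount(totalRows: number, rowsPerPage: number): number {
  return Math.ceil(totalRows / rowsPerPage);
}

const allRows = ["maker1", "maker2", "maker3", "maker4", "maker5"];
console.log(pageCount(allRows.length, 2));               // 3
console.log(JSON.stringify(rowsForPage(allRows, 2, 0))); // ["maker1","maker2"]
console.log(JSON.stringify(rowsForPage(allRows, 2, 2))); // ["maker5"]
```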

Tooltip

Tooltip in frozen state

Show tooltip toggles tooltip display on mouse over buckets.

Freeze on click toggles tooltip "freezing" on click. Frozen tooltip can be used to compare data with floating tooltip or to follow URLs.

Show items toggles display of additional items in tooltip.

Items is a list of definitions to display URLs in tooltip.

Each URL has a template, icon, label and formatting options: lowercase and date format for variables.

Tooltip items editor

Percentual bucket spans

In some scenarios, the status panel is used to go from one dashboard to a more specific one, e.g. when performing root cause analysis. In such cases, the user may want to narrow the time span while keeping the desired event centered (to be able to analyse the preceding and following buckets of time). These values are available through __bucket_from and __bucket_to.

  • __bucket_from: the value of bucket.from minus the percentual bucket.
  • __bucket_to: the value of bucket.to plus the percentual bucket.
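As a worked example, the sketch below computes both values for one bucket. It assumes that the "percentual bucket" is a percentage of the bucket's own duration, which the text above does not spell out; the Bucket shape and extendBucket are hypothetical names.

```typescript
// Hypothetical bucket shape with epoch-millisecond bounds.
interface Bucket {
  from: number;
  to: number;
}

// Widens the bucket on both sides by `percent` of its own duration (assumption).
function extendBucket(
  bucket: Bucket,
  percent: number
): { bucketFrom: number; bucketTo: number } {
  const margin = (bucket.to - bucket.from) * (percent / 100);
  return { bucketFrom: bucket.from - margin, bucketTo: bucket.to + margin };
}

// A 1000 ms bucket widened by 50% gains 500 ms on each side.
console.log(JSON.stringify(extendBucket({ from: 1000, to: 2000 }, 50)));
// {"bucketFrom":500,"bucketTo":2500}
```

Because the same margin is added on both sides, the original bucket stays centered in the resulting time span.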

Learn more

Acknowledgements

The first public release of this plugin was made entirely by Flant engineers. The idea came from Dmitry Stolyarov (@distol), the initial version was written by Sergey Gnuskov (@gsmetal), and the final changes were made by Ivan Mikheykin (@diafour).

This plugin is based on "Heatmap" panel by Grafana and partly inspired by ideas from Carpet plot, Discrete panel, Status Panel, Status Dot, Status By Group.