This is a tool to query multiple metrics from a prometheus database through the REST API, and save them into a csv file.
Needless to say, we have lots of requirements towards docker monitoring. And what's more, we want to dig into metrics exporting methods. In order to do some data analysis, we want to extract metrics in a usable/simple format (e.g. CSV). It's easy to query some metrics, but the tools paid less attention to export multiple metrics joined by timestamp(this can be some scenarios about data analysis, or dev daily report), then prometheus2csv was born. More info about 2 basic docker monitoring solutions, you can visit my blog here.
prometheus2csv
is a command line tool for Python 3. Basic useage is as follows:
python prometheus2csv.py -h http://prometheus:9090 -c blc_server -o test.csv -s 10s --period=120
http://prometheus:9090
is your Prometheus server's address, blc_server
is the name of the container which you want to query multiple metrics, test.csv
is the target csv file's name(default is result.csv), 10s
indicates query resolution step width in Prometheus query API, 120
means that you will get 120 minutes data which is in the most recent period.
Required arguments:
- -h / --host: Prometheus server address
- -c / --container: The name of the container which you want to query multiple metrics
Optional arguments:
- -o / --outfile: Query result's CSV filename, default is
result.csv
- -s / --step: Query resolution step width in Prometheus query API, default is
10s
- --period: Indicate this to get most recent period's data, it's an integer, in minute. For example, when you use
--period=120
, you will get the data from 120 mins ago till now - --start: Start time in the query, use timestamp or rfc3339
- --end: End time in the query, use timestamp or rfc3339
- --help: Get the basic help info
Attention: you can use start&end OR period to query data, if these 3 arguments all exist, period
will be the priority.
-------------------------------------------------
| timestame + metric1 + metric2 + metric3 + ... |
-------------------------------------------------
| xxxxxxxxx + value1 + value2 + value3 + ... |
-------------------------------------------------
| ... + ... + ... + ... + ... |
-------------------------------------------------
Mainly use Prometheus HTTP API to get the result (query and query_range).
It's a little tricky here, because Prometheus doesn't provide any query functions to get this info, but we can query like this:
http://prometheus:9090/api/v1/query?query=sum by(__name__)({name="blc_server"})
Then you will get all the metrics' names(get them by __name__
) relevant to container 'blc_server'. Hence timestamp + these_metrics_names
will be our csv file's header.
Construct series of query urls based on the names of metrics, then we can get timestamp-value info for every metric. Use a dictionary to store these data, timestamp can be keys and each key corresponds to a list.
As far as I know, Prometheus doesn't provide such method to get multiple metrics' values joined by timestamp, so we have to make series of queries as well. Sometimes you can get similar csv file in Grafana, this is an end-user method talked in my blog
-
Assume that you have run some containers which you want to monitor
-
Run cadvisor in a container
sudo docker run \
--volume=/:/rootfs:ro \
--volume=/var/run:/var/run:rw \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
-p 8080:8080 \
--detach=true \
--name=cadvisor \
google/cadvisor:latest
- Run Prometheus and set up a job about fetching metrics from cAdvisor
- First of all, we touch a new configure file
## prometheus.yml ##
global:
scrape_interval: 15s # By default, scrape targets every 15 seconds.
evaluation_interval: 15s # By default, scrape targets every 15 seconds.
# scrape_timeout is set to the global default (10s).
# Attach these labels to any time series or alerts when communicating with
# external systems (federation, remote storage, Alertmanager).
external_labels:
monitor: 'blc-monitor'
scrape_configs:
- job_name: 'prometheus'
scrape_interval: 5s
static_configs:
- targets: ['localhost:9090']
- job_name: 'cadvisor'
# Override the global default and scrape targets from this job every 5 seconds.
scrape_interval: 5s
static_configs:
- targets: ['cadvisor:8080']
- Start Prometheus as container service, and link it to cadvisor
sudo docker run -d -p 9090:9090 \
-v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml \
--link cadvisor:cadvisor \
--name=prometheus \
prom/prometheus \
--config.file=/etc/prometheus/prometheus.yml
Now you can use prometheus2csv to export your monitoring data, cheers!
- Generate a config file for the first run: then you can update the query easily, and you can also choose some of the metrics you are interested in, instead of query all metrics.
- Multiple query jobs: maybe we need to query series of containers' monitoring data, so it's better to run prometheus2csv one time and get all the info you want (maybe with multiple csv files, classified by containers' names).
- For some metrics in Prometheus (acturally from cAdvisor), they have the same
__name__
, but some other labels' names differ. For example, you might get series of results forcontainer_fs_io_time_seconds_total
, because the container might have manydevice
values. prometheus2csv should handle this circumstances.