rycus86/prometheus_flask_exporter

Is it possible to give a metric an initial value?

schelv opened this issue · 3 comments

I want to monitor (the rate of) API calls to a microservice.
The problem I run into is that a metric only becomes available after the first HTTP request.
This means that the first time Prometheus sees the metric for an endpoint, its value is already 1.
As a consequence, Prometheus cannot 'correctly' calculate the rate or increase for the first HTTP request, because as far as Prometheus knows the value has always been 1.

Is it possible to make the metric available before the first request (e.g. start counting at zero)?
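To illustrate the effect, here is a deliberately simplified sketch in plain Python. It assumes that the increase over a window is just the last sample minus the first one; real Prometheus also extrapolates and handles counter resets, so this is not its actual algorithm. The function name and the sample values are made up for illustration.

```python
def simple_increase(samples):
    """Last sample minus first sample in the window (no extrapolation, no reset handling)."""
    if len(samples) < 2:
        return 0.0
    return samples[-1] - samples[0]


# Scraped values of a counter that only appears after the first request:
# the series is absent in earlier scrapes and then shows up with the value 1.
late_series = [1.0, 1.0, 1.0]

# The same counter when it is exported with an initial value of 0.
initialized_series = [0.0, 0.0, 1.0, 1.0, 1.0]

print(simple_increase(late_series))         # the first request is invisible
print(simple_increase(initialized_series))  # the first request is counted
```

With the late series the first request never shows up as an increase, while the zero-initialized series reports it.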

/metrics output before any API call:

# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 413.0
python_gc_objects_collected_total{generation="1"} 0.0
python_gc_objects_collected_total{generation="2"} 0.0
# HELP python_gc_objects_uncollectable_total Uncollectable object found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 77.0
python_gc_collections_total{generation="1"} 7.0
python_gc_collections_total{generation="2"} 0.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="10",patchlevel="4",version="3.10.4"} 1.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 7.2278016e+07
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 3.3284096e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.66747764407e+09
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 2.54
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 10.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP exporter_info Information about the Prometheus Flask exporter
# TYPE exporter_info gauge
exporter_info{version="0.20.3"} 1.0
# HELP http_request_duration_seconds Flask HTTP request duration in seconds
# TYPE http_request_duration_seconds histogram
# HELP http_request_total Total number of HTTP requests
# TYPE http_request_total counter
# HELP http_request_exceptions_total Total number of HTTP requests which resulted in an exception
# TYPE http_request_exceptions_total counter
# HELP by_path_counter_total Request count by request paths
# TYPE by_path_counter_total counter

/metrics output after the first API call:

# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 413.0
python_gc_objects_collected_total{generation="1"} 0.0
python_gc_objects_collected_total{generation="2"} 0.0
# HELP python_gc_objects_uncollectable_total Uncollectable object found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 77.0
python_gc_collections_total{generation="1"} 7.0
python_gc_collections_total{generation="2"} 0.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="10",patchlevel="4",version="3.10.4"} 1.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 7.2278016e+07
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 3.3284096e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.66747764407e+09
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 2.54
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 10.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP exporter_info Information about the Prometheus Flask exporter
# TYPE exporter_info gauge
exporter_info{version="0.20.3"} 1.0
# HELP http_request_duration_seconds Flask HTTP request duration in seconds
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.005",method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
http_request_duration_seconds_bucket{le="0.01",method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
http_request_duration_seconds_bucket{le="0.025",method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
http_request_duration_seconds_bucket{le="0.05",method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
http_request_duration_seconds_bucket{le="0.075",method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
http_request_duration_seconds_bucket{le="0.1",method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
http_request_duration_seconds_bucket{le="0.25",method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
http_request_duration_seconds_bucket{le="0.5",method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
http_request_duration_seconds_bucket{le="0.75",method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
http_request_duration_seconds_bucket{le="1.0",method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
http_request_duration_seconds_bucket{le="2.5",method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
http_request_duration_seconds_bucket{le="5.0",method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
http_request_duration_seconds_bucket{le="7.5",method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
http_request_duration_seconds_bucket{le="10.0",method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
http_request_duration_seconds_bucket{le="+Inf",method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
http_request_duration_seconds_count{method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
http_request_duration_seconds_sum{method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 0.0003939999733120203
# HELP http_request_duration_seconds_created Flask HTTP request duration in seconds
# TYPE http_request_duration_seconds_created gauge
http_request_duration_seconds_created{method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.6674777541179776e+09
# HELP http_request_total Total number of HTTP requests
# TYPE http_request_total counter
http_request_total{method="GET",microservice_type="placeholder",model_name="devmodel",status="503"} 1.0
# HELP http_request_created Total number of HTTP requests
# TYPE http_request_created gauge
http_request_created{method="GET",microservice_type="placeholder",model_name="devmodel",status="503"} 1.6674777541180773e+09
# HELP http_request_exceptions_total Total number of HTTP requests which resulted in an exception
# TYPE http_request_exceptions_total counter
# HELP by_path_counter_total Request count by request paths
# TYPE by_path_counter_total counter
by_path_counter_total{method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.0
# HELP by_path_counter_created Request count by request paths
# TYPE by_path_counter_created gauge
by_path_counter_created{method="GET",microservice_type="placeholder",model_name="devmodel",path="/api/example",status="503"} 1.6674777541178892e+09

Hm, I'm not super sure if it's doable, but it's definitely not currently exposed.
https://github.com/prometheus/client_python#labels has a bit on this:

Metrics with labels are not initialized when declared, because the client can't know what values the label can have. It is recommended to initialize the label values by calling the .labels() method alone
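As a minimal sketch of what that recommendation looks like with plain prometheus_client (outside of this exporter), the snippet below declares a labelled counter and then calls .labels() without incrementing. The metric name demo_http_request and the label values are made up for illustration, and a separate registry is used so the example does not touch the default one.

```python
from prometheus_client import CollectorRegistry, Counter, generate_latest

# Separate registry so the example is isolated from the global default.
registry = CollectorRegistry()

requests_total = Counter(
    "demo_http_request",              # hypothetical metric name
    "Total number of HTTP requests",
    ["method", "status"],
    registry=registry,
)

# Only HELP/TYPE lines are exposed at this point, no samples.
before = generate_latest(registry).decode()

# Calling .labels() alone creates the child series with a value of 0.
requests_total.labels(method="GET", status="200")

after = generate_latest(registry).decode()

sample = 'demo_http_request_total{method="GET",status="200"} 0.0'
print(sample in before)
print(sample in after)
```

The sample line is missing from the first scrape and present with value 0.0 in the second, which is the behaviour the issue asks for.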

I suppose we could try adding a flag to poke into this initialization function when the metric is registered and see what happens.
I'd be open to seeing a PR for this if you're keen.

I created a merge request (#145).
An initial value is now created when all label values are strings.
This was the easiest option =)
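The rule described above can be sketched roughly as follows, using plain prometheus_client. This is not the code from the merge request: the helper init_if_static, the metric name, and the label dictionaries are hypothetical and only illustrate the "all label values are strings" check.

```python
from prometheus_client import CollectorRegistry, Counter, generate_latest


def init_if_static(metric, labels):
    """Create the labelled child up front, but only if every label value is a plain string.

    Hypothetical helper: returns True when the series was initialized.
    """
    if all(isinstance(value, str) for value in labels.values()):
        metric.labels(**labels)  # creates the series with a value of 0
        return True
    return False


registry = CollectorRegistry()
counter = Counter(
    "demo_static_labels",             # hypothetical metric name
    "Counter used to illustrate static label initialization",
    ["microservice_type", "status"],
    registry=registry,
)

# All values are strings, so the series can be created immediately.
static_labels = {"microservice_type": "placeholder", "status": "200"}
# One value is a callable that needs a response object, so nothing is created.
dynamic_labels = {"microservice_type": "placeholder", "status": lambda resp: resp.status_code}

static_done = init_if_static(counter, static_labels)
dynamic_done = init_if_static(counter, dynamic_labels)

exposed = generate_latest(registry).decode()
print(static_done, dynamic_done)
```

Only the fully static label set ends up in the exposed output; the set with a callable is left to be created on the first matching request.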

I think this is also possible for some labels whose values are determined by a callable.
The only requirement is that all label values are known in advance.

For most labels this should not be a problem, since the endpoint contains information about the label values that can be encountered.
For example, the method, the path (if not generic), and the status code.
Maybe this can be done with an approach similar to register_default:
after all routes have been set up, iterate over the routes/endpoints, the metrics of each endpoint, and the possible values of that metric's labels.
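A rough sketch of that idea, again with plain prometheus_client rather than this exporter's internals: the route table, the list of expected status codes, and the metric name below are all assumptions made for illustration. In a real Flask app the routes would presumably come from the application's URL map instead of a hard-coded dictionary.

```python
from itertools import product

from prometheus_client import CollectorRegistry, Counter, generate_latest

# Hypothetical route table: path -> allowed HTTP methods.
routes = {
    "/api/example": ["GET"],
    "/api/items": ["GET", "POST"],
}
# Hypothetical set of status codes each endpoint is expected to return.
expected_statuses = ["200", "503"]

registry = CollectorRegistry()
by_path = Counter(
    "demo_by_path_counter",           # hypothetical metric name
    "Request count by request paths",
    ["method", "path", "status"],
    registry=registry,
)

# Pre-create one zero-valued series per known label combination.
for path, methods in routes.items():
    for method, status in product(methods, expected_statuses):
        by_path.labels(method=method, path=path, status=status)

exposed = generate_latest(registry).decode()
zero_samples = [
    line for line in exposed.splitlines()
    if line.startswith("demo_by_path_counter_total{")
]
print(len(zero_samples))
```

The trade-off is cardinality: every combination is exported from startup, even those that never occur, so the lists of possible values should stay small.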

I could not really find if, and where, the metrics of each endpoint can be accessed.
So I'm leaving the opportunity to implement this to someone else 😁

This is now released in 0.21.0, thanks again @schelv!