Losing data after prometheus restart

Question

momania opened this issue 7 years ago · 5 comments

I seem to lose most historical data whenever I restart my Prometheus server.

I'm running both Prometheus and the Postgres Adapter as Docker containers, so I assume it has something to do with locally cached data vs. when Prometheus actually uses remote read through the Postgres Adapter.

Has this behaviour been observed before, and are there more docs/insights on how to set up Prometheus with the Postgres Adapter (other than just the write/read URLs)?

I'm not sure where to start debugging this as I don't know if this is either a Prometheus problem, a Postgres Adapter problem, or even a TimescaleDB problem.

Answer 1 · 2018-07-16T16:35:49.000Z

Hi @momania,

Since Docker containers are ephemeral, you need to mount a host volume into the container so that all the data is saved on your local machine: https://docs.docker.com/storage/volumes/

There is a tutorial that shows basic steps on how to get started
http://docs.timescale.com/v0.10/tutorials/prometheus-adapter
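
For illustration, here is a minimal Compose sketch of what mounting a volume for Prometheus could look like. The service name, the volume name `prometheus-data`, and the host path `./prometheus.yml` are assumptions made for this example, not taken from the tutorial. The container path `/prometheus` is where the official `prom/prometheus` image keeps its local TSDB by default.

```yaml
# docker-compose.yml (sketch; service, volume, and host path names are illustrative)
version: "3"
services:
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      # Config file from the host (the host path is an assumption)
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      # Named volume so the local TSDB survives container restarts and re-creation
      - prometheus-data:/prometheus
volumes:
  prometheus-data:
```

With a named volume like this, removing and re-creating the container should leave the data in place, as long as the volume itself is not deleted.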

Answer 2 · 2018-07-16T19:26:45.000Z

Ah I missed that tutorial, just went with the readme on this repo.

Figured out in the end as well that Prometheus needs at least a mounted volume to keep the index and some of the stats in its local db.

Still a shame though. It would be so nice if it could just rebuild its index from the remote storage somehow, so the container running Prometheus could be completely stateless (and thus easy to move around a cluster).

Hope they will improve on this in future releases. I'll check if this is already in the pipeline, otherwise shoot it in as a feature/improvement request on the Prometheus repo.

Thanks for the info and the direction @niksajakovljevic 👍

Answer 3 · 2018-07-17T08:00:05.000Z

You're welcome. I recently published a blog post on how to integrate Prometheus with PostgreSQL/TimescaleDB: https://blog.timescale.com/sql-nosql-data-storage-for-prometheus-devops-monitoring-postgresql-timescaledb-time-series-3cde27fd1e07

Answer 4 · 2019-02-09T16:01:41.000Z

I have the same issue. After restarting Prometheus I have gaps in the graphs:
https://upload.wiuwiu.de/share.php/566e520d-aa03-45b3-b956-7ed4d502a70e/Screenshot%20from%202019-02-09%2016-59-13.png
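
One setting that may be worth checking for gaps like this is `read_recent` under `remote_read`. By default Prometheus skips the remote read for time ranges that its local storage is expected to cover, so recent data missing locally is not necessarily filled in from the remote store. The fragment below is only a sketch: the host name `pg-adapter` is hypothetical, and the port and paths are assumptions that should be checked against the adapter's README. Whether it closes the gaps in a given setup is not confirmed in this thread.

```yaml
# prometheus.yml fragment (sketch; "pg-adapter" is a hypothetical host name,
# port 9201 and the /write and /read paths are assumptions to verify)
remote_write:
  - url: "http://pg-adapter:9201/write"
remote_read:
  - url: "http://pg-adapter:9201/read"
    # Default is false, which skips remote reads for ranges local storage
    # should already have; true also queries the remote store for recent data
    read_recent: true
```

Enabling this sends more queries to the remote store, so it trades some query load for completeness.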

Answer 5 · 2019-04-04T11:05:10.000Z

Same problem. After restarting the server I lost the last ~2 hours of logs (no problem with older logs). Nothing important was lost, since it's a test machine for Prometheus. The config is a typical basic prometheus.yml file from the tutorials.

root@lcx:~/promo# uname -a
Linux lcx 4.15.0-47-generic #50-Ubuntu SMP Wed Mar 13 10:44:52 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

root@lcx:~/promo# ./prometheus --version
prometheus, version 2.8.1 (branch: HEAD, revision: 4d60eb36dcbed725fcac5b27018574118f12fffb)
build user: root@bfdd6a22a683
build date: 20190328-18:04:08
go version: go1.11.6

I'm getting these warnings:
First restart:
level=warn ts=2019-04-04T10:50:46.677339638Z caller=head.go:450 component=tsdb msg="unknown series references" count=1821739
Second restart:
level=warn ts=2019-04-04T10:54:32.894463993Z caller=head.go:450 component=tsdb msg="unknown series references" count=1858139

The issue looks similar to prometheus/prometheus#2390.