Losing data after Prometheus restart
momania opened this issue · 5 comments
I seem to lose most historical data whenever I restart my Prometheus server.
I'm running both Prometheus and the Postgres Adapter as Docker containers, so I assume it has something to do with locally cached data vs. when Prometheus will actually use the remote read through the Postgres Adapter.
Has this behaviour been observed before, and are there any more docs/insights on how to set up Prometheus with the Postgres Adapter (other than just the write/read URLs)?
I'm not sure where to start debugging this, as I don't know whether this is a Prometheus problem, a Postgres Adapter problem, or even a TimescaleDB problem.
Hi @momania,
Since Docker containers are ephemeral, you need to mount a volume from the local host into the container so that all the data is saved on your local machine: https://docs.docker.com/storage/volumes/
There is a tutorial that shows basic steps on how to get started
http://docs.timescale.com/v0.10/tutorials/prometheus-adapter
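As a minimal sketch of the volume suggestion above: the volume name `prometheus-data`, the container name `prometheus`, and the `start_prometheus` helper are hypothetical, and the example assumes the official `prom/prometheus` image with its default data directory `/prometheus` and default config path `/etc/prometheus/prometheus.yml`. Adjust these to your own setup.

```shell
# Hypothetical names; change them to match your environment.
VOLUME_NAME="prometheus-data"

# Start Prometheus with a named volume mounted on its data directory so the
# local TSDB survives container restarts.
# The optional first argument is a command prefix, e.g. "echo" for a dry run.
start_prometheus() {
  ${1:-} docker run -d --name prometheus -p 9090:9090 \
    -v "${VOLUME_NAME}:/prometheus" \
    -v "${PWD}/prometheus.yml:/etc/prometheus/prometheus.yml" \
    prom/prometheus
}

# Dry run: print the command instead of executing it.
start_prometheus echo
```

Calling `start_prometheus` without an argument actually starts the container. A named volume is used here rather than a host directory because it avoids having to fix file ownership for the user the container runs as.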
Ah I missed that tutorial, just went with the readme on this repo.
Figured out in the end as well that Prometheus needs at least a mounted volume to keep the index and some of the stats in its local db.
Still a shame though, would be so nice if it can just rebuild its index from the remote storage somehow, so the container running Prometheus can be completely stateless (and thus easy to move around a cluster)
Hope they will improve on this in future releases. I'll check if this is already in the pipeline, otherwise shoot it in as a feature/improvement request on the Prometheus repo.
Thanks for the info and the direction @niksajakovljevic 👍
You're welcome. There is a blog post I've published recently on how you integrate Prometheus with PostgreSQL/TimescaleDB: https://blog.timescale.com/sql-nosql-data-storage-for-prometheus-devops-monitoring-postgresql-timescaledb-time-series-3cde27fd1e07
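For reference, a rough sketch of what the Prometheus side of such an integration can look like. The adapter address `http://localhost:9201` is an assumption (use whatever host and port your adapter actually listens on, which will not be `localhost` if it runs in a separate container), and the scrape job is only a placeholder. `read_recent` is a standard `remote_read` option that defaults to `false`; whether enabling it helps with gaps after a restart is not confirmed in this thread.

```shell
# Assumed adapter address; replace with your own.
ADAPTER_URL="http://localhost:9201"

# Write a basic prometheus.yml with remote write/read pointing at the adapter.
cat > prometheus.yml <<EOF
global:
  scrape_interval: 15s

scrape_configs:
  # Placeholder job: Prometheus scraping itself.
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]

remote_write:
  - url: "${ADAPTER_URL}/write"

remote_read:
  - url: "${ADAPTER_URL}/read"
    # Also query remote storage for time ranges that the local storage
    # is expected to have complete data for (default: false).
    read_recent: true
EOF
```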
I have the same issue. After restarting Prometheus I have gaps in the graphs:
https://upload.wiuwiu.de/share.php/566e520d-aa03-45b3-b956-7ed4d502a70e/Screenshot%20from%202019-02-09%2016-59-13.png
Same problem. After restarting the server I lost the last ~2 hours of logs (no problem with older logs). Nothing important; it's a testing machine for Prometheus. I'm using a typical basic prometheus.yml file from the tutorials.
root@lcx:~/promo# uname -a
Linux lcx 4.15.0-47-generic #50-Ubuntu SMP Wed Mar 13 10:44:52 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
root@lcx:~/promo# ./prometheus --version
prometheus, version 2.8.1 (branch: HEAD, revision: 4d60eb36dcbed725fcac5b27018574118f12fffb)
build user: root@bfdd6a22a683
build date: 20190328-18:04:08
go version: go1.11.6
Getting warnings:
First restart:
level=warn ts=2019-04-04T10:50:46.677339638Z caller=head.go:450 component=tsdb msg="unknown series references" count=1821739
Second restart:
level=warn ts=2019-04-04T10:54:32.894463993Z caller=head.go:450 component=tsdb msg="unknown series references" count=1858139
The issue is similar to this one:
prometheus/prometheus#2390