uschtwill/docker_monitoring_logging_alerting

No data in Grafana main overview

Closed this issue · 5 comments

This is amazing work! Thank you very much for sharing! Yet I see no graphs (no lines on the graphs) in Grafana "main overview".

I use
setup.sh unsecure

All images show as running with docker ps. I have pulled your image and started it as is with 0 changes to it. Is there something I am missing that I need to change? Only thing I see is "CPU usage on node".
"data exploration" does show some graphs.

cAdvisor shows graphs on 8080
Prometheus front is working
AlertManager front is working

I am running debian:jessie

Thanks

Hey @pir1981,

that is very weird, especially given that some graphs are displaying data and that cAdvisor is showing graphs. Have you tried fiddling with the queries (Graph > Edit > Metrics)? Although if some are working out of the box, all of them should... have you checked Kibana, is there anything peculiar in the logs for the monitoring containers?

Cheers
Will

Hi @uschtwill

where do I find queries (Graph > Edit > Metrics)?

Initially I had the following error today after a clean restart:

failed to collect filesystem stats - rootDiskErr: <nil>, rootInodeErr: cmd [find /rootfs/media/hd2/docker-service/docker/aufs/diff/fcbc7e7635dd035ef0ab267141d3038ab690722103882ca919c7e80f80925923 -xdev -printf .] failed. stderr: find: unrecognized: -printf

which I know from my own projects with cAdvisor. It has something to do with the latest version, 0.25.0, and Grafana then says "no data" across the board. When I used 0.24.1 in my own deployment (a different project), everything was fine.

So I specified this now also in your compose, which made that error go away, but now I got other stuff.

image

but cAdvisor is running just fine. So is Prometheus not reachable?
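One way to answer that question outside Grafana is to ask Prometheus for its built-in `up` series, which is 1 for every target it managed to scrape and 0 for every target it could not reach. The following is only a rough sketch: it assumes Prometheus is published on `http://localhost:9090` (the default port, which this thread does not confirm), and `query_up` and `down_targets` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request


def down_targets(payload):
    """Return (job, instance) pairs whose `up` sample is 0 in a query response."""
    down = []
    for series in payload["data"]["result"]:
        # Each instant-vector sample is [timestamp, "value as a string"].
        if float(series["value"][1]) == 0:
            labels = series["metric"]
            down.append((labels.get("job", "?"), labels.get("instance", "?")))
    return down


def query_up(base_url="http://localhost:9090"):
    """Fetch the current `up` vector from the Prometheus HTTP API.

    The base URL is an assumption; adjust it to wherever Prometheus is exposed.
    """
    url = base_url + "/api/v1/query?" + urllib.parse.urlencode({"query": "up"})
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)
```

Calling `down_targets(query_up())` lists the jobs and instances Prometheus cannot scrape. Any entry it returns, or an expected job that is missing from the `up` result altogether, is worth comparing against the targets in `prometheus.yml`.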

kibana is still new to me, so finding my way around this.
Discover > log level error
gives me the following, and all for filebeat container_name

time="2017-04-26T18:33:17.041587569+02:00" level=error msg="Failed to log msg \"[2017-04-26T16:33:17,040][WARN ][o.e.d.e.NodeEnvironment ] ES has detected the [path.data] folder using the cluster name as a folder [/usr/share/elasticsearch/data], **Elasticsearch 6.0** will not allow the cluster name as a folder within the data path\" for logger gelf: gelf: cannot send GELF message: write udp 172.16.0.1:52990->172.16.0.38:12201: write: connection refused"

strange as I use your yml with elastic 5.1.1??

time="2017-04-26T18:33:16.292622055+02:00" level=error msg="containerd: notify OOM events" error="cgroup path for memory not found"

time="2017-04-26T18:33:17.247320596+02:00" level=error msg="Failed to log msg \"[2017-04-26T16:33:17,246][INFO ][o.e.n.Node ] version[5.1.1], pid[1], build[5395e21/2016-12-06T12:36:15.409Z], OS[Linux/3.16.0-4-amd64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_111/25.111-b14]\" for logger gelf: gelf: cannot send GELF message: write udp 172.16.0.1:52990->172.16.0.38:12201: write: connection refused"
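The `connection refused` part of these messages means nothing was listening on UDP port 12201 at 172.16.0.38 (presumably the Logstash container) at the moment Docker tried to ship the log line. A small probe like the one below can show whether that is still the case once the stack has finished starting. It is a sketch under assumptions: the function names are made up for illustration, the payload is a minimal gzip-compressed GELF 1.1 message, and it relies on the operating system reporting the ICMP "port unreachable" reply on a connected UDP socket, as Linux does.

```python
import gzip
import json
import socket


def gelf_datagram(short_message, host="probe"):
    """Build a minimal gzip-compressed GELF 1.1 message (hypothetical helper)."""
    record = {"version": "1.1", "host": host, "short_message": short_message}
    return gzip.compress(json.dumps(record).encode("utf-8"))


def gelf_port_refused(host, port=12201, timeout=1.0):
    """Return True if a UDP datagram sent to host:port is actively refused."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.connect((host, port))
        try:
            sock.send(gelf_datagram("ping"))
            # The send itself succeeds; a refusal only surfaces on the next
            # call, once the ICMP error has come back.
            sock.recv(1)
        except ConnectionRefusedError:
            return True
        except socket.timeout:
            # Nothing came back in time: no refusal was observed.
            return False
    return False
```

Run from the Docker host, `gelf_port_refused("172.16.0.38")` returning `True` reproduces the error above. `False` only means no refusal was seen within the timeout, not that Logstash actually processed the message.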

Also should I see something on the dashboard? When I try

Dashboard > Add >

 Pie Chart: Log Count - Container Groups and Container Names
 Stacked Bar Chart: Log Count - by Container Names
 Stacked Bars: Error and Debug Level Log Count

I get "No results found".
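"No results found" can simply mean that no log documents have reached Elasticsearch yet, which would fit the GELF errors above. A quick way to check is to count the documents per index through the `_cat/indices` API. This sketch assumes Elasticsearch is reachable on `http://localhost:9200` and that Logstash writes to its default `logstash-*` indices. Neither is confirmed in this thread, and `fetch_indices` and `doc_counts` are hypothetical names.

```python
import json
import urllib.request


def doc_counts(indices, prefix="logstash-"):
    """Map index name to document count for indices starting with prefix."""
    counts = {}
    for row in indices:
        if row["index"].startswith(prefix):
            # The _cat APIs return numbers as strings; a missing count
            # (for example on a closed index) is treated as 0.
            counts[row["index"]] = int(row.get("docs.count") or 0)
    return counts


def fetch_indices(base_url="http://localhost:9200"):
    """Fetch the index listing as JSON (the base URL is an assumption)."""
    url = base_url + "/_cat/indices?format=json"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)
```

If `doc_counts(fetch_indices())` comes back empty or all zeros, the visualizations have nothing to show and the problem lies upstream of Kibana.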

I decided to run your setup.sh in foreground doing the following manually

docker network create --subnet=172.16.0.0/24 monitoring_logging

docker-compose -f monitoring/docker-compose.unsecure.yml up --force-recreate

master-nodeexporter_1  | WARNING: no logs are available with the 'gelf' log driver
grafana_1              | WARNING: no logs are available with the 'gelf' log driver
master-cadvisor_1      | WARNING: no logs are available with the 'gelf' log driver
alertmanager_1         | WARNING: no logs are available with the 'gelf' log driver
prometheus_1           | WARNING: no logs are available with the 'gelf' log driver

I read on the net that these warnings should be fine and can be ignored

docker-compose -f logging/docker-compose.unsecure.yml up --force-recreate

master-filebeat_1           | WARNING: no logs are available with the 'gelf' log driver
kibana_1                    | WARNING: no logs are available with the 'gelf' log driver
elasticsearch_1             | WARNING: no logs are available with the 'gelf' log driver
logstash_1                  | WARNING: no logs are available with the 'gelf' log driver
curator_1                   | WARNING: no logs are available with the 'gelf' log driver
logging_curator_1 exited with code 2
validate-logstash-config_1  | log4j:WARN No appenders could be found for logger (io.netty.util.internal.logging.InternalLoggerFactory).
validate-logstash-config_1  | log4j:WARN Please initialize the log4j system properly.
validate-logstash-config_1  | log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
logging_validate-logstash-config_1 exited with code 0

So it seems that it starts up fine. Any idea about the above error logs?

I found the error in the prometheus.yml target assignments. It is working now. I pushed a proposed change in the code.

Glad you got it going. Would be really happy to include your fix, but I didn't get a PR?!