Prometheus+Grafana
Usage
How to Write Rules for Prometheus
How To Monitor Linux Servers Using Prometheus Node Exporter
Monitoring your Linux Servers with Prometheus and Grafana in 7 Minutes
How to Monitor Linux Server Performance with Prometheus and Grafana in 5 minutes
Install Prometheus Server on CentOS 7 and Ubuntu 18.04
Monitoring your own Linux machine with Prometheus and Grafana
Reference links
https://yunlzheng.gitbook.io/prometheus-book/parti-prometheus-ji-chu/alert/prometheus-alert-rule
https://awesome-prometheus-alerts.grep.to/rules.html
https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
https://alex.dzyoba.com/blog/prometheus-alerts/
https://gist.github.com/devops-school/98d7eed1a9df6c372c45452730791f7a
https://www.metricfire.com/blog/top-5-prometheus-alertmanager-gotchas/
https://www.weave.works/blog/labels-in-prometheus-alerts-think-twice-before-using-them
https://softwareadept.xyz/2018/01/how-to-write-rules-for-prometheus/
https://blog.networktocode.com/post/prometheus_alerting/
https://www.devopsschool.com/blog/recording-rules-and-alerting-rules-exmplained-in-prometheus/
https://blog.csdn.net/shida_csdn/article/details/81980021
https://gitlab.cern.ch/paas-tools/monitoring/prometheus-webhook-receiver/-/tree/master
https://superuser.com/questions/443406/how-can-i-produce-high-cpu-load-on-a-linux-server
https://yunlzheng.gitbook.io/prometheus-book/parti-prometheus-ji-chu/alert/alert-manager-config
https://www.jianshu.com/p/fd0b018539cd
https://help.aliyun.com/document_detail/123117.html?utm_content=g_1000230851&spm=5176.20966629.toubu.3.f2991ddcpxxvD1#h2-alertmanagers88
https://github.com/prometheus/alertmanager/blob/master/api/v2/openapi.yaml
https://github.com/gin-gonic/gin#quick-start
https://www.programmersought.com/article/50413971111/
https://songjiayang.gitbooks.io/prometheus/content/configuration/rule_files.html
Reloading the configuration
[root@localhost prometheus-2.19.1.linux-amd64]# curl -v -X POST http://172.16.59.100:9090/-/reload
* About to connect() to 172.16.59.100 port 9090 (#0)
*   Trying 172.16.59.100...
* Connected to 172.16.59.100 (172.16.59.100) port 9090 (#0)
> POST /-/reload HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 172.16.59.100:9090
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 25 Mar 2021 12:43:34 GMT
< Content-Length: 0
<
* Connection #0 to host 172.16.59.100 left intact
[root@localhost prometheus-2.19.1.linux-amd64]#
100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[2m])) * 100) > 80
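To make the arithmetic behind this alert expression concrete, here is a minimal Python sketch. It assumes the expression is meant to read the idle-mode series of node_cpu_seconds_total. The sample values and the helper names (per_second_rate, cpu_usage_percent) are made up for illustration, and the two-point rate ignores the extrapolation and counter-reset handling that PromQL's rate() performs.

```python
def per_second_rate(first_value, last_value, window_seconds):
    """Average per-second increase of a counter over a window (no reset handling)."""
    return (last_value - first_value) / window_seconds


def cpu_usage_percent(idle_rates):
    """100 minus the average idle fraction across the CPUs of one instance."""
    avg_idle = sum(idle_rates) / len(idle_rates)
    return 100 - avg_idle * 100


# Hypothetical idle-counter samples (in seconds) for two CPUs, taken 120 s apart,
# mirroring the [2m] range in the expression.
idle_samples = {
    "cpu0": (1000.0, 1006.0),   # 6 idle seconds in the window
    "cpu1": (2000.0, 2018.0),   # 18 idle seconds in the window
}

idle_rates = [per_second_rate(first, last, 120) for first, last in idle_samples.values()]
usage = cpu_usage_percent(idle_rates)
alert_fires = usage > 80   # same threshold as the alert expression

print(f"usage={usage:.1f}% alert_fires={alert_fires}")
```

With these made-up samples the average idle fraction is about 0.1, so usage is roughly 90% and the condition holds.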
Understanding Prometheus relabel_configs
Service discovery under Kubernetes
Prometheus service discovery mechanisms
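By default, once Prometheus has finished loading its Target instances, each Target carries a set of default labels:
__address__: the address of the current Target instance, <host>:<port>
__scheme__: the HTTP scheme used to scrape the target, HTTP or HTTPS
__metrics_path__: the path used to scrape the target
__param_<name>: a request parameter sent to the target when scraping
These labels tell Prometheus how to fetch monitoring data from the Target. In general, Target labels prefixed with __ are for internal use, so they are not written to the sample data. There are exceptions, though: every sample scraped by Prometheus carries a label named instance, whose content corresponds to the Target's __address__. What has actually happened here is a label rewrite.
This mechanism of rewriting a Target's labels before its samples are scraped is called Relabeling in Prometheus.
When Relabeling takes effect
Prometheus lets users add custom Relabeling steps to a scrape job through relabel_configs.
Managing labels with replace/labelmap/labelkeep/labeldrop
The full relabel_config configuration is shown below: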
# The source labels select values from existing labels. Their content is concatenated
# using the configured separator and matched against the configured regular expression
# for the replace, keep, and drop actions.
[ source_labels: '[' <labelname> [, ...] ']' ]
# Separator placed between concatenated source label values.
[ separator: <string> | default = ; ]
# Label to which the resulting value is written in a replace action.
# It is mandatory for replace actions. Regex capture groups are available.
[ target_label: <labelname> ]
# Regular expression against which the extracted value is matched.
[ regex: <regex> | default = (.*) ]
# Modulus to take of the hash of the source label values.
[ modulus: <uint64> ]
# Replacement value against which a regex replace is performed if the
# regular expression matches. Regex capture groups are available.
[ replacement: <string> | default = $1 ]
# Action to perform based on regex matching.
[ action: <relabel_action> | default = replace ]
The action field defines how the current relabel_config handles the metadata labels. The default action is replace.
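As a rough illustration of the schema above, the following Python sketch models a relabel_config with the defaults listed there and shows how the values of source_labels are joined before the regex is applied. The class and function names (RelabelConfig, concat_source_values) and the sample labels are hypothetical. This is not Prometheus code, only a model of the documented behaviour, and it assumes a missing source label contributes an empty string.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RelabelConfig:
    """Fields of a relabel_config, with the defaults listed in the schema."""
    source_labels: List[str] = field(default_factory=list)
    separator: str = ";"
    target_label: str = ""
    regex: str = "(.*)"
    modulus: int = 0
    replacement: str = "$1"
    action: str = "replace"


def concat_source_values(cfg: RelabelConfig, labels: Dict[str, str]) -> str:
    """Join the source label values with the separator (missing labels are empty)."""
    return cfg.separator.join(labels.get(name, "") for name in cfg.source_labels)


# Hypothetical target labels, for illustration only.
labels = {"__address__": "10.0.0.5:9100", "__scheme__": "http"}
cfg = RelabelConfig(source_labels=["__scheme__", "__address__"], target_label="endpoint")

joined = concat_source_values(cfg, labels)
print(cfg.action, repr(joined))
```

The joined string is what the regex is matched against, which is why a rule with two source labels usually needs the separator to appear in its regex.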
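replace matches regex against the values of source_labels (the values of multiple source labels are joined with separator) and writes the result to target_label. If the regex has several capture groups, ${1}, ${2} and so on select what gets written. If nothing matches, target_label is not rewritten. For example: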
- job_name: 'kubernetes-kubelet'
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - target_label: __address__
    replacement: kubernetes.default.svc:443
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics
The target label __metrics_path__ is set to /api/v1/nodes/${1}/proxy/metrics, where ${1} is what the regex (.+) captured from the value of __meta_kubernetes_node_name.
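The two replace rules from the kubelet job can be traced with a small Python sketch. It is a simplified model, not the Prometheus implementation: apply_replace and expand are hypothetical helpers, only numeric references such as $1 and ${1} are expanded, Python's re module stands in for the RE2 syntax Prometheus uses, and the node name and address are made up.

```python
import re

_REF = re.compile(r"\$(?:\{(\d+)\}|(\d+))")


def expand(template, match):
    """Expand $1 / ${1} style references using the groups of a regex match."""
    def substitute(ref):
        index = int(ref.group(1) or ref.group(2))
        if index > match.re.groups:
            return ""
        return match.group(index) or ""
    return _REF.sub(substitute, template)


def apply_replace(labels, source_labels, target_label,
                  regex="(.*)", replacement="$1", separator=";"):
    """Model of the replace action: returns a new label dict."""
    value = separator.join(labels.get(name, "") for name in source_labels)
    match = re.fullmatch(regex, value)   # relabel regexes are anchored at both ends
    if match is None:
        return dict(labels)              # no match: target_label is left untouched
    result = dict(labels)
    result[target_label] = expand(replacement, match)
    return result


# Hypothetical labels of one discovered node.
labels = {"__address__": "10.0.0.5:10250", "__meta_kubernetes_node_name": "node-1"}

step1 = apply_replace(labels, [], "__address__",
                      replacement="kubernetes.default.svc:443")
step2 = apply_replace(step1, ["__meta_kubernetes_node_name"], "__metrics_path__",
                      regex="(.+)", replacement="/api/v1/nodes/${1}/proxy/metrics")

print(step2["__address__"])
print(step2["__metrics_path__"])
```

The first rule has no source_labels, so the default regex matches the empty string and the fixed replacement is written. The second rule substitutes the captured node name into the path.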
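labelmap, by contrast, matches regex against the names of all labels on the Target (the names, not the values). The captured content becomes the name of a new label, and the value of the matched label becomes the new label's value. For example: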
- job_name: 'kubernetes-nodes'
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
Original label: __meta_kubernetes_node_label_test=tttt
Resulting label: test=tttt
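A minimal Python sketch of the labelmap example follows. apply_labelmap and drop_internal_labels are hypothetical helpers, the sketch only handles the default replacement ($1, the first capture group), and it leaves out the instance and job labels that Prometheus adds on its own.

```python
import re


def apply_labelmap(labels, regex):
    """Copy every label whose name fully matches regex to a label named by group 1."""
    result = dict(labels)
    for name, value in labels.items():
        match = re.fullmatch(regex, name)
        if match is not None:
            result[match.group(1)] = value
    return result


def drop_internal_labels(labels):
    """Labels starting with __ are removed once target relabeling is complete."""
    return {name: value for name, value in labels.items() if not name.startswith("__")}


# Hypothetical labels of one discovered node.
discovered = {
    "__address__": "10.0.0.5:10250",
    "__meta_kubernetes_node_label_test": "tttt",
}

relabeled = apply_labelmap(discovered, r"__meta_kubernetes_node_label_(.+)")
final = drop_internal_labels(relabeled)
print(final)
```

labelmap copies rather than renames: the __meta_ label is still present in relabeled and only disappears when the internal labels are dropped.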
labelkeep and labeldrop filter the labels of a Target, keeping only the labels that satisfy the filter condition. For example:
relabel_configs:
  - regex: label_should_drop_(.+)
    action: labeldrop
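This configuration matches regex against the names of all labels on the current Target and removes the labels that match. labelkeep does the opposite: it removes every label whose name does not match regex.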
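The difference between the two actions can be sketched in a few lines of Python. The helper names and the sample labels are made up, and re.fullmatch is used on the assumption that the regex must match the whole label name.

```python
import re


def apply_labeldrop(labels, regex):
    """Remove every label whose name fully matches regex."""
    return {n: v for n, v in labels.items() if re.fullmatch(regex, n) is None}


def apply_labelkeep(labels, regex):
    """Remove every label whose name does NOT fully match regex."""
    return {n: v for n, v in labels.items() if re.fullmatch(regex, n) is not None}


# Hypothetical target labels.
labels = {
    "job": "node",
    "env": "prod",
    "label_should_drop_a": "1",
    "label_should_drop_b": "2",
}

after_drop = apply_labeldrop(labels, r"label_should_drop_(.+)")
after_keep = apply_labelkeep(labels, r"job|env")
print(after_drop)
print(after_keep)
```

With these sample labels both calls leave only job and env, which shows that labelkeep needs a regex covering every label that should survive.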
Filtering Target instances with keep/drop
scrape_configs:
  - job_name: node_exporter
    consul_sd_configs:
      - server: localhost:8500
        services:
          - node_exporter
    relabel_configs:
      - source_labels: ["__meta_consul_dc"]
        regex: "dc1"
        action: keep
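The configuration above keeps a Target only if the value of its __meta_consul_dc label matches dc1. Because Prometheus anchors the regex at both ends, the value has to be exactly dc1, not merely contain it.
When action is keep, Prometheus discards the Target instances whose source_labels value does not match regex. When action is drop, it discards the Target instances whose source_labels value does match regex.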
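A small Python sketch of keep and drop applied to a list of targets is given below. filter_targets and the three sample targets are hypothetical. The full-match semantics reflect the fact that Prometheus anchors relabel regexes at both ends, so dc10 does not count as a match for dc1.

```python
import re


def filter_targets(targets, source_labels, regex, action, separator=";"):
    """Return the targets that survive a keep or drop relabel rule."""
    if action not in ("keep", "drop"):
        raise ValueError("action must be 'keep' or 'drop'")
    survivors = []
    for labels in targets:
        value = separator.join(labels.get(name, "") for name in source_labels)
        matched = re.fullmatch(regex, value) is not None
        if matched == (action == "keep"):
            survivors.append(labels)
    return survivors


# Hypothetical targets as discovered from Consul.
targets = [
    {"__address__": "10.0.0.1:9100", "__meta_consul_dc": "dc1"},
    {"__address__": "10.0.0.2:9100", "__meta_consul_dc": "dc2"},
    {"__address__": "10.0.0.3:9100", "__meta_consul_dc": "dc10"},
]

kept = filter_targets(targets, ["__meta_consul_dc"], "dc1", "keep")
remaining = filter_targets(targets, ["__meta_consul_dc"], "dc1", "drop")
print([t["__address__"] for t in kept])
print([t["__address__"] for t in remaining])
```

keep and drop are exact complements for the same regex, so every target ends up in one of the two result lists.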
https://opensource.actionsky.com/20200622-prometheus/
https://my.oschina.net/u/4383725/blog/4314559
https://www.cnblogs.com/zhaojiedi1992/p/zhaojiedi_liunx_61_prometheus_relabel.html
https://yunlzheng.gitbook.io/prometheus-book/part-ii-prometheus-jin-jie/sd/service-discovery-with-relabel
https://www.jianshu.com/p/cef0e145d3e0
https://www.iloxp.com/archive/11/