Issue with Canary Deployment: Metric Not Reporting
I'm implementing a canary deployment using Flagger to monitor my application. The goal is to monitor the success rate of HTTP requests to a health endpoint (/ping). However, despite configuring the request-success-rate metric, Flagger isn't collecting any metrics or sending any requests to the endpoint. I am using the Traefik provider.
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: test-service
  namespace: test
spec:
  provider: traefik
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-service
  progressDeadlineSeconds: 300
  service:
    port: 3000
    targetPort: 3000
  analysis:
    interval: 10s
    threshold: 10
    maxWeight: 50
    stepWeight: 5
    metrics:
      - name: request-success-rate
        interval: 30s
        thresholdRange:
          min: 99
        failureThreshold: 5
        query: "http://test-service:3000/ping"
    webhooks:
      - name: acceptance-test
        type: pre-rollout
        url: http://flagger-loadtester.test/
        timeout: 10s
        metadata:
          type: bash
          cmd: "curl -X GET http://test-service:3000/ping"
      - name: load-test
        type: rollout
        url: http://flagger-loadtester.test/
        timeout: 5s
        metadata:
          type: cmd
          cmd: "hey -z 10s -q 10 -c 2 http://test-service:3000/ping"
          logCmdOutput: "true"
{{- end }}
I tested the curl and hey commands from inside the load tester pod and they work fine. But when I check my canary, it goes into a failed status after it is initialized:
Events:
Type Reason Age From Message
Warning Synced 4m19s flagger test-service-primary.test not ready: waiting for rollout to finish: observed deployment generation less than desired generation
Warning Synced 3m29s (x5 over 4m9s) flagger test-service-primary.test not ready: waiting for rollout to finish: 0 of 1 (readyThreshold 100%) updated replicas are available
Normal Synced 3m19s (x7 over 4m19s) flagger all the metrics providers are available!
Normal Synced 3m19s flagger Initialization done! test-service.test
Normal Synced 2m49s flagger New revision detected! Scaling up test-service.test
Warning Synced 119s (x5 over 2m39s) flagger canary deployment test-service.test not ready: waiting for rollout to finish: 0 of 1 (readyThreshold 100%) updated replicas are available
Normal Synced 109s flagger Starting canary analysis for test-service.test
Normal Synced 109s flagger Pre-rollout check acceptance-test passed
Normal Synced 109s flagger Advance test-service.test canary weight 5
Warning Synced 89s (x2 over 99s) flagger Halt advancement no values found for traefik metric request-success-rate probably test-service.test is not receiving traffic: running query failed: no values found
I am not sure if I am missing something.
Could you check whether the required metrics are showing up in your Prometheus server?
@aryan9600 I do not have a Prometheus server; I am using metrics-server. After reading more about canary analysis, I think Prometheus is a requirement for this setup. However, I am also running the podinfo canary (https://github.com/stefanprodan/podinfo) there, and it works fine even without Prometheus. I am not sure why that works and my custom service does not.
The goal is to monitor the success rate of HTTP requests to a health endpoint (/ping).
The query field is for specifying a PromQL query, see the docs here: https://docs.flagger.app/usage/metrics#prometheus
If you don't use Prometheus, then delete the metrics field; the webhooks are enough to test the ping endpoint.
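A minimal sketch of what that suggestion could look like, assuming nothing else in the Canary changes: the analysis section from the original post with only the metrics block dropped. All values are copied from the question; this fragment sits under spec in the Canary resource.

```yaml
# Sketch only: the analysis section from the question with the metrics field removed.
# The webhooks are unchanged and exercise /ping through the load tester.
analysis:
  interval: 10s
  threshold: 10
  maxWeight: 50
  stepWeight: 5
  webhooks:
    - name: acceptance-test
      type: pre-rollout
      url: http://flagger-loadtester.test/
      timeout: 10s
      metadata:
        type: bash
        cmd: "curl -X GET http://test-service:3000/ping"
    - name: load-test
      type: rollout
      url: http://flagger-loadtester.test/
      timeout: 5s
      metadata:
        type: cmd
        cmd: "hey -z 10s -q 10 -c 2 http://test-service:3000/ping"
        logCmdOutput: "true"
```

One thing to keep in mind: curl exits with code 0 on HTTP error statuses unless --fail is passed, so the acceptance-test command as written mostly verifies that the endpoint is reachable.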
@stefanprodan the podinfo canary that you created works fine with my setup (without Prometheus). I am just wondering how it works with the metrics field. And just to confirm, you are saying that I should remove the entire block below?
metrics:
  - name: request-success-rate
    interval: 30s
    thresholdRange:
      min: 99
    failureThreshold: 5
    query: "http://test-service:3000/ping"
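For comparison, a hedged sketch of how the built-in check is usually declared when a Prometheus server is available. It assumes Flagger has been pointed at a Prometheus instance that scrapes Traefik, which is not the case in the setup described in this thread. The built-in request-success-rate check takes no query field, because Flagger supplies the PromQL for the configured provider itself.

```yaml
# Sketch only: assumes Flagger is configured with a Prometheus server that scrapes Traefik.
analysis:
  metrics:
    - name: request-success-rate   # built-in check; no query field is needed
      interval: 30s                # time window evaluated by the check
      thresholdRange:
        min: 99                    # minimum success rate, as a percentage
```

A URL such as http://test-service:3000/ping is not valid PromQL, so it does not belong in query in either case.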