fluxcd/flagger

Webhooks are not triggered when installing on a local Kubernetes cluster

saninsignify opened this issue · 0 comments

Describe the bug

I am running a local Kubernetes cluster via Docker Desktop. I installed Flagger following https://docs.flagger.app/install/flagger-install-on-kubernetes and set up Linkerd locally following https://linkerd.io/2.16/getting-started/ . When I set up a test canary deployment with podinfo, the webhooks to flagger-loadtester are never called, and Flagger always removes the canary and promotes the new version to primary.

To Reproduce

  1. Set up a local Kubernetes cluster using Docker Desktop (I did this on a Mac).
  2. Install Linkerd (https://linkerd.io/2.16/getting-started/) and the Linkerd Viz dashboard.
  3. Install Flagger (https://docs.flagger.app/install/flagger-install-on-kubernetes) and the flagger-loadtester.
  4. Deploy podinfo image 6.6.1 via Kustomize or a deployment.yaml.
  5. Apply the Canary resource defined below.
  6. Change the podinfo deployment to image 6.7.0. The canary status is stuck in "Initializing", the webhook for flagger-loadtester is never called, and the deployment is promoted automatically.

Canary Deployment file

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  # deployment reference
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  progressDeadlineSeconds: 60
  service:
    # ClusterIP port number
    port: 80
    # container port number or name (optional)
    targetPort: 9898
  skipAnalysis: false
  analysis:
    # schedule interval (default 60s)
    interval: 10s
    # max number of failed metric checks before rollback
    threshold: 3
    # A/B test interactions
    # iterations: 1
    maxWeight: 15
    stepWeight: 5
    stepWeightPromotion: 50
    # Linkerd Prometheus checks
    metrics:
    - name: request-success-rate
      thresholdRange:
        min: 50
      interval: 1m
    - name: request-duration
      thresholdRange:
        max: 500
      interval: 30s
    webhooks:
    - name: "confirmation gate"
      type: confirm-promotion
      url: http://flagger-loadtester.test/gate/halt
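
For reference, here is a rough sketch of what the analysis settings above should imply if the analysis actually ran. It assumes the timing formulas from the Flagger documentation as I understand them (minimum validation time = interval * (maxWeight / stepWeight), rollback time = interval * threshold). The function name `canary_schedule` is made up for illustration and is not part of Flagger.

```python
from datetime import timedelta


def canary_schedule(interval_s: int, step_weight: int, max_weight: int, threshold: int):
    """Return the canary traffic weights and the timing bounds they imply."""
    # Weights the canary passes through while each analysis run succeeds.
    weights = list(range(step_weight, max_weight + 1, step_weight))
    # Assumed formula: interval * (maxWeight / stepWeight).
    min_promote = timedelta(seconds=interval_s * (max_weight // step_weight))
    # Assumed formula: interval * threshold.
    rollback = timedelta(seconds=interval_s * threshold)
    return weights, min_promote, rollback


# Values taken from the Canary spec in this issue.
weights, min_promote, rollback = canary_schedule(
    interval_s=10, step_weight=5, max_weight=15, threshold=3
)
print(weights)                      # [5, 10, 15]
print(min_promote.total_seconds())  # 30.0
print(rollback.total_seconds())     # 30.0
```

So with these values the canary would be expected to spend at least about 30 seconds in analysis before it ever reaches the confirm-promotion webhook, rather than being promoted immediately.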

Expected behavior

Version 6.7.0 should stay in the canary phase, because the "confirm-promotion" gate receives a 403 from http://flagger-loadtester.test/gate/halt
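
To make the expectation concrete, here is a minimal local simulation of the gate behaviour I am relying on. It is not Flagger's code: `HaltGate` is a stand-in for the load tester's /gate/halt endpoint, `gate_allows` is a hypothetical helper, and the payload fields are illustrative only. The assumption is that a confirm-promotion webhook only lets the promotion proceed on a successful (2xx) response.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HaltGate(BaseHTTPRequestHandler):
    """Local stand-in for /gate/halt: answers every POST with 403."""

    def do_POST(self):
        # Drain the request body, then refuse.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        self.send_response(403)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the output quiet


def gate_allows(url: str, payload: dict) -> bool:
    """Return True only if the webhook answers with a 2xx status."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Bypass any proxy settings so the request goes straight to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(req, timeout=5) as resp:
            return 200 <= resp.status < 300
    except urllib.error.HTTPError:
        return False  # 4xx/5xx: the gate stays closed


server = HTTPServer(("127.0.0.1", 0), HaltGate)  # port 0 picks any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/gate/halt"
# Hypothetical payload; the field names are for illustration only.
allowed = gate_allows(url, {"name": "podinfo", "namespace": "test"})
server.shutdown()
server.server_close()
print("promotion allowed:", allowed)
```

Under that assumption the gate never opens, so the rollout should wait at the promotion step instead of completing.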

Additional context

  • Flagger version: 1.38
  • Kubernetes version: 1.29.5
  • Service Mesh provider: Linkerd edge-24.8.2
  • Ingress provider: none