fluxcd/flagger

Flagger not honoring destination rules for locality-based routing


Describe the bug

We are working on a use case that uses EKS, Istio, and Flagger to manage canary deployments. We have a requirement to restrict cross-AZ traffic, so we have implemented locality-based routing in Istio.

However, when creating a canary deployment with Flagger, it does not appear to respect the destination rule configured for locality-based routing. This results in traffic routing across availability zones, which we aim to avoid.

Is this behavior expected, or is this a feature that is not currently supported? Any guidance or suggested workarounds would be appreciated.

Below are the config files.
gateway.yaml

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: gateway
  namespace: default
spec:
  selector:
    istio: gateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - test-app.example.com

virtual-service.yaml

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: test-app-vs
  namespace: default
spec:
  hosts:
  - "test-app.example.com"
  gateways:
  - gateway
  http:
  - route:
    - destination:
        host: test-app-svc.default.svc.cluster.local
        port:
          number: 8090

destination-rule.yaml

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: svc-test-app-dr
  namespace: default
spec:
  host: test-app-svc.default.svc.cluster.local
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
        distribute:
        - from: us-east-1/us-east-1a/*
          to:
            "us-east-1/us-east-1a/*": 100
        - from: us-east-1/us-east-1b/*
          to:
            "us-east-1/us-east-1b/*": 100
    outlierDetection:
      consecutiveGatewayErrors: 1
      interval: 30s
      baseEjectionTime: 30s

canary-deploy.yaml

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: test-app-canary
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app-deploy
  autoscalerRef:
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    name: test-app-hpa
  service:
    port: 8090
    targetPort: 8090
    gateways:
    - gateway
    hosts:
    - test-app.example.com
    retries:
      attempts: 3
      perTryTimeout: 1s
      retryOn: "gateway-error,connect-failure,refused-stream"
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 20
    stepWeight: 10
    metrics:
    - name: request-success-rate
      thresholdRange:
        min: 99
      interval: 1m
    - name: request-duration
      thresholdRange:
        max: 500
      interval: 1m

To Reproduce

Deploying the above Canary object causes Flagger to create the following VirtualService:

apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  annotations:
    helm.toolkit.fluxcd.io/driftDetection: disabled
    kustomize.toolkit.fluxcd.io/reconcile: disabled
  generation: 1
  name: test-app-deploy
  namespace: default
  ownerReferences:
  - apiVersion: flagger.app/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Canary
    name: test-app-canary
    uid: fb70ba8f-63ff-4c98-9080-4bf100057f84
  resourceVersion: "3285188"
  uid: 07682b7b-9acd-4c43-8259-28d90d04c802
spec:
  gateways:
  - gateway
  hosts:
  - test-app.example.com
  - test-app-deploy
  http:
  - retries:
      attempts: 3
      perTryTimeout: 1s
      retryOn: gateway-error,connect-failure,refused-stream
    route:
    - destination:
        host: test-app-deploy-primary
      weight: 100
    - destination:
        host: test-app-deploy-canary
      weight: 0

It also creates two destination rules, one for the primary and one for the canary:

apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  generation: 1
  name: test-app-deploy-primary
  namespace: default
  ownerReferences:
  - apiVersion: flagger.app/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Canary
    name: test-app-canary
spec:
  host: test-app-deploy-primary
---
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  generation: 1
  name: test-app-deploy-canary
  namespace: default
  ownerReferences:
  - apiVersion: flagger.app/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Canary
    name: test-app-canary
spec:
  host: test-app-deploy-canary

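This shows that Flagger is not respecting the existing destination rule: the generated rules for test-app-deploy-primary and test-app-deploy-canary contain only a host and no trafficPolicy, so the locality-based load balancing settings are not applied to either host.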

Expected behavior

The canary deployment should respect the destination rule for locality-based routing, keeping traffic within the specified availability zones.
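To make the expectation concrete, here is a rough sketch of what we would hope the generated primary destination rule to look like, with the trafficPolicy from destination-rule.yaml carried over. This is hand-written for illustration and is not actual Flagger output. Only the first distribute entry is shown, and the same policy would be expected on test-app-deploy-canary.

```yaml
# Hypothetical, hand-written illustration of the expected result.
# Not generated by Flagger; resource names are taken from the manifests above.
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: test-app-deploy-primary
  namespace: default
spec:
  host: test-app-deploy-primary
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
        distribute:
        # Keep traffic that originates in us-east-1a inside us-east-1a.
        - from: us-east-1/us-east-1a/*
          to:
            "us-east-1/us-east-1a/*": 100
        # Remaining zones as listed in destination-rule.yaml.
    # Istio requires outlier detection for locality load balancing to take effect.
    outlierDetection:
      consecutiveGatewayErrors: 1
      interval: 30s
      baseEjectionTime: 30s
```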

Additional context

  • Flagger version: 1.38.0
  • Kubernetes version: AWS EKS - 1.30
  • Service Mesh provider: Istio
  • Ingress provider: ALB controller