hashicorp/consul

Issue with HAProxy as Kubernetes Ingress Controller: consul annotation transparent-proxy-exclude-inbound-ports not working for port 1042 (/healthz)


Overview of the Issue

This is my flow:
Browser → [AWS NLB] → [haproxy-ingress service] → [Pod (with connect-inject)]

I set the following annotations on the haproxy deployment pods:

consul.hashicorp.com/connect-inject: "true"
consul.hashicorp.com/transparent-proxy-exclude-inbound-ports: "1024,1042,8080,8443"

The haproxy pods (container "kubernetes-ingress-controller") never start and become ready, and I see these events:

Readiness probe failed: dial tcp x.x.x.x:20000: connect: connection refused
Started container kubernetes-ingress-controller
Startup probe failed: Get "http://x.x.x.x:20500/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Startup probe failed: HTTP probe failed with statuscode: 503
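
For context on port 20500 in the events above: with defaultOverwriteProbes enabled, the injector rewrites the pod's HTTP probes to Envoy exposed-path listeners. The rewritten startup probe presumably ends up looking roughly like this (a sketch reconstructed from the events above, not the exact mutated spec):

```yaml
# Sketch of the rewritten startup probe after connect-inject mutation.
# Port 20500 is an Envoy exposed-path listener that forwards to the
# original container probe port (1042); numbers taken from the events above.
startupProbe:
  httpGet:
    path: /healthz
    port: 20500
```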

If I set the annotation consul.hashicorp.com/transparent-proxy-overwrite-probes: "false", I can see the issue on the real haproxy probes:

Readiness probe failed: dial tcp x.x.x.x:20000: connect: connection refused
Startup probe failed: Get "http://x.x.x.x:1042/healthz": read tcp y.y.y.y:55412->x.x.x.x:1042: read: connection reset by peer
Startup probe failed: Get "http://x.x.x.x:1042/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Startup probe failed: Get "http://x.x.x.x:1042/healthz": dial tcp x.x.x.x:1042: connect: connection refused

The pods can start only if I set transparent-proxy to false, but then the haproxy ingress service cannot authenticate (ACLs + Consul intentions) through the transparent proxy and returns "502 Bad Gateway":

consul.hashicorp.com/connect-inject: "true"
consul.hashicorp.com/transparent-proxy: "false"
consul.hashicorp.com/transparent-proxy-exclude-inbound-ports: "1024,1042,6060,8080,8443"
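
For completeness, the intention that is expected to authorize the ingress toward an upstream service looks roughly like this (a sketch from the consul-k8s CRDs; service names are hypothetical):

```yaml
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceIntentions
metadata:
  name: ingress-to-backend
spec:
  destination:
    name: backend                       # hypothetical upstream service
  sources:
    - name: haproxy-kubernetes-ingress  # hypothetical ingress service name
      action: allow
```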

Reproduction Steps

  1. Install the haproxy ingress controller 1.42.0 with these annotations on its LoadBalancer service:
annotations:
  'service.beta.kubernetes.io/aws-load-balancer-type': 'external'
  'service.beta.kubernetes.io/aws-load-balancer-scheme': 'internet-facing'
  'service.beta.kubernetes.io/aws-load-balancer-nlb-target-type': 'ip'
  'service.beta.kubernetes.io/aws-load-balancer-target-group-attributes': 'preserve_client_ip.enabled=true'                                                                                            
  'service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled': 'true'
  2. Enable Consul Connect by adding the annotations shown above to the haproxy ingress controller chart stanza:
     controller:
       podAnnotations:
         consul.hashicorp.com/connect-inject: "true"
         consul.hashicorp.com/transparent-proxy-exclude-inbound-ports: "1024,1042,8080,8443"
  3. Observe the pod failures (kubectl -n ... describe pod/..., kubectl -n ... logs pod/... -c ...)
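
The inspection commands from step 3, spelled out (namespace, pod, and sidecar container names are placeholders):

```shell
# Inspect probe events on a failing ingress pod (names are placeholders).
kubectl -n haproxy-ingress describe pod/haproxy-kubernetes-ingress-abc12

# Logs of the controller container and the injected sidecar.
kubectl -n haproxy-ingress logs pod/haproxy-kubernetes-ingress-abc12 -c kubernetes-ingress-controller
kubectl -n haproxy-ingress logs pod/haproxy-kubernetes-ingress-abc12 -c consul-dataplane
```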

Consul info for both Client and Server

Server info
agent:
        check_monitors = 0
        check_ttls = 0
        checks = 0
        services = 0
build:
        prerelease =
        revision = 920cc7c6
        version = 1.20.1
        version_metadata =
consul:
        acl = enabled
        bootstrap = false
        known_datacenters = 5
        leader = false
        leader_addr = x.x.x.x:8300
        server = true
raft:
        applied_index = 435869
        commit_index = 435869
        fsm_pending = 0
        last_contact = 23.476537ms
        last_log_index = 435869
        last_log_term = 51
        last_snapshot_index = 426059
        last_snapshot_term = 51
        latest_configuration = [{Suffrage:Voter ID:0f0e40cd-33ab-ea1e-f3f8-6b5f50f1ddfe Address:x.x.x.x:8300} {Suffrage:Voter ID:d44171e8-e0e2-6abb-95c3-01f2fc99a918 Address:y.y.y.y:8300} {Suffrage:Voter ID:d7d3f7d8-a3b5-12ac-8f09-ba8413757bcb Address:z.z.z.z:8300}]
        latest_configuration_index = 0
        num_peers = 2
        protocol_version = 3
        protocol_version_max = 3
        protocol_version_min = 0
        snapshot_version_max = 1
        snapshot_version_min = 0
        state = Follower
        term = 51
runtime:
        arch = amd64
        cpu_count = 2
        goroutines = 449
        max_procs = 2
        os = linux
        version = go1.22.7
serf_lan:
        coordinate_resets = 0
        encrypted = true
        event_queue = 0
        event_time = 20
        failed = 0
        health_score = 0
        intent_queue = 0
        left = 0
        member_time = 1027
        members = 3
        query_queue = 0
        query_time = 1
serf_wan:
        coordinate_resets = 0
        encrypted = true
        event_queue = 0
        event_time = 1
        failed = 0
        health_score = 0
        intent_queue = 0
        left = 0
        member_time = 8448
        members = 15
        query_queue = 0
        query_time = 1

Consul version

helm chart "consul" from repoUrl "https://helm.releases.hashicorp.com"
targetRevision: 1.6.1 (5 Nov, 2024) => consul v1.20.1 (https://github.com/hashicorp/consul/releases/tag/v1.20.1)

Chart configuration

enabled: true
global:
  enabled: true
  datacenter: sbox0milavpc1dc1
  federation:
    enabled: true
    createFederationSecret: true
  gossipEncryption:
    autoGenerate: true
  acls:
    enabled: true
    manageSystemACLs: true
    createReplicationToken: true
  enablePodSecurityPolicies: false
  tls:
    enabled: true
    verify: true
    httpsOnly: true
server:
  enabled: true
  replicas: 3
  storageClass: ebs-csi-gp3-encrypt-retain
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain
    whenScaled: Delete
  resources: |
    requests:
      memory: "200Mi"
      cpu: "100m"
    limits:
      memory: "500Mi"
      cpu: "500m"
  storage: 10Gi
  disruptionBudget:
    enabled: false
ui:
  enabled: true
  service:
    type: ClusterIP
connectInject:
  enabled: true
  default: false
  transparentProxy:
    defaultEnabled: true
    defaultOverwriteProbes: true
  cni:
    enabled: true
    logLevel: info
    cniBinDir: "/opt/cni/bin"
    cniNetDir: "/etc/cni/net.d"
  disruptionBudget:
    enabled: false
meshGateway:
  enabled: true
  replicas: 2
  service:
    type: LoadBalancer
    annotations: |
      'service.beta.kubernetes.io/aws-load-balancer-name': "consul-mgw-sbox0milavpc1dc1-pri"
      'service.beta.kubernetes.io/aws-load-balancer-type': "external"
      'service.beta.kubernetes.io/aws-load-balancer-scheme': "internal"
      'service.beta.kubernetes.io/aws-load-balancer-nlb-target-type': "ip"
      'service.beta.kubernetes.io/aws-load-balancer-backend-protocol': "tcp"
      'service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled': "true"

Operating system and Environment details

Kubernetes on AWS EKS

Log Fragments