docker-library/haproxy

HAProxy pods in Kubernetes rapidly consume all available memory

victor-sudakov opened this issue · 7 comments

I'm trying to run haproxy in a local Kubernetes cluster (kind v0.11.1 go1.17.6 linux/amd64).

Kubernetes version: Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-21T23:01:33Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/amd64"}

I've tried different images: haproxy:2.0.27-buster, haproxy:2.0.27-alpine, haproxy:2.5.1-alpine, haproxy:2.5.1-bullseye and some others, and the result is always the same: the haproxy pod rapidly (!) consumes as much memory as limits permit (and if there is no limit configured, it easily consumes gigabytes of memory). Below please find my test configurations:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: haproxy
  name: haproxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: haproxy
  template:
    metadata:
      labels:
        app: haproxy
    spec:
      containers:
        - name: haproxy
          image: haproxy:2.0.27-buster
          resources:
            requests:
              memory: "256Mi"
            limits:
              memory: "512Mi"
          volumeMounts:
          - name: config
            mountPath: /usr/local/etc/haproxy/haproxy.cfg
            subPath: haproxy.cfg
      volumes:
        - name: config
          configMap:
            name: haproxy

haproxy.cfg
global
  log stdout format raw local0 info

defaults
  timeout client  60s
  timeout connect 3s
  timeout server  60s
  timeout check   3s
  maxconn 100

frontend acl
  bind            *:10001
  mode            tcp
  default_backend acl

frontend rm
  bind            *:10002
  mode            tcp
  default_backend rm

backend acl
  mode tcp
  option pgsql-check user acl_app
  server pgsql-test-acl pgsql-test-acl:5432 weight 100 check inter 10000

backend rm
  mode tcp
  option pgsql-check user rm_app
  server pgsql-test-rm pgsql-test-rm:5432 weight 100 check inter 10000

Probably something with your config, I'm not able to reproduce (although the server being forwarded to isn't there)

haproxy.cfg
global
  log stdout format raw local0 info

defaults
  timeout client  60s
  timeout connect 3s
  timeout server  60s
  timeout check   3s
  maxconn 100

frontend acl
  bind            *:10001
  mode            tcp
  default_backend acl

frontend rm
  bind            *:10002
  mode            tcp
  default_backend rm

backend acl
  mode tcp
  option pgsql-check user acl_app
  server pgsql-test-acl 127.0.0.1:80 weight 100 check inter 10000

backend rm
  mode tcp
  option pgsql-check user rm_app
  server pgsql-test-rm 127.0.0.1:80 weight 100 check inter 10000

$ kubectl create configmap haproxy --from-file=./haproxy.cfg
configmap/haproxy created

$ kubectl apply -f haproxy.yaml
deployment.apps/haproxy created

$ kubectl get po -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP           NODE   NOMINATED NODE   READINESS GATES
haproxy-868b79979b-7wfzv   1/1     Running   0          28s   10.42.0.13   demo   <none>           <none>

$ kubectl top po haproxy-868b79979b-7wfzv
NAME                       CPU(cores)   MEMORY(bytes)
haproxy-868b79979b-7wfzv   1m           67Mi

$ kubectl logs haproxy-868b79979b-7wfzv
[NOTICE] 034/181541 (1) : New worker #1 (8) forked
[WARNING] 034/181542 (8) : Server acl/pgsql-test-acl is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 034/181542 (8) : backend 'acl' has no server available!
[WARNING] 034/181546 (8) : Server rm/pgsql-test-rm is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 034/181546 (8) : backend 'rm' has no server available!

You could also try asking over at the Docker Community Forums, Docker Community Slack, or Stack Overflow, since these repos aren't really a user-help forum.

> Probably something with your config, I'm not able to reproduce (although the server being forwarded to isn't there)

If you cannot reproduce it with my config, obviously there is nothing wrong with my config :-) What flavour/version of Kubernetes were you trying to reproduce in?

On Kubernetes v1.23.1 (kind v0.11.1) the problem persists. And no, my issue is not a question, it's a bug report. HAProxy outside K8s does not display this behavior.

On minikube v1.24.0 (Kubernetes v1.22.3) all is fine, 143Mi memory consumption. Looks like the problem is Kubernetes-implementation-specific.

The point is that we can't reproduce it, even with your exact configuration file, so the chances of it being an issue with the image (and thus a relevant "bug" for this repository) are low.

What we mean by "issue with your config" is larger than just the exact configuration file and includes environmental details like the Kubernetes setup differences you've discovered since.

@tianon what K8s implementation were you trying to reproduce in? Can you please try kind v0.11.1? If you cannot reproduce it in kind with a basic 3-node setup either, we can close the ticket.
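For reference, by a "basic 3-node setup" I mean something like the following kind config (a minimal sketch; the file name is arbitrary):

kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker

$ kind create cluster --config kind-config.yaml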

It's related to kubernetes-sigs/kind#760

The applications seem to do big up-front memory allocations based on the number of file descriptors they have available.
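A quick way to check this (just a sketch, adjust the names to your cluster) is to look at the open-files limit the haproxy container actually inherits:

$ kubectl exec deployment/haproxy -- cat /proc/1/limits | grep 'open files'

If I read kubernetes-sigs/kind#760 correctly, inside kind nodes this limit ends up far higher than on a typical host, which is what drives the large up-front allocations.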

It seems that haproxy is one such application that hits this problem in kind (other better-known examples being NFS and MySQL).
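If that is indeed the cause, one workaround on the haproxy side (a sketch, not an official recommendation) is to set an explicit maxconn in the global section, so haproxy sizes its connection structures from that value instead of deriving it from the huge file-descriptor rlimit it inherits in the container:

haproxy.cfg (global section only)
global
  log stdout format raw local0 info
  # cap the number of connections haproxy sizes itself for;
  # without this it derives its limits from the fd rlimit (ulimit -n)
  maxconn 100

The kind issue linked above also seems to discuss lowering the runtime's default nofile limit, which addresses the same problem from the other side.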