HAProxy pods in Kubernetes rapidly consume all available memory
victor-sudakov opened this issue · 7 comments
I'm trying to run haproxy in a local Kubernetes cluster (kind v0.11.1 go1.17.6 linux/amd64).
Kubernetes version: Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-21T23:01:33Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/amd64"}
I've tried different images: haproxy:2.0.27-buster, haproxy:2.0.27-alpine, haproxy:2.5.1-alpine, haproxy:2.5.1-bullseye and some others, and the result is always the same: the haproxy pod rapidly (!) consumes as much memory as its limits permit (and if there is no limit configured, it easily consumes gigabytes of memory). Below please find my test configurations:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: haproxy
  name: haproxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: haproxy
  template:
    metadata:
      labels:
        app: haproxy
    spec:
      containers:
      - name: haproxy
        image: haproxy:2.0.27-buster
        resources:
          requests:
            memory: "256Mi"
          limits:
            memory: "512Mi"
        volumeMounts:
        - name: config
          mountPath: /usr/local/etc/haproxy/haproxy.cfg
          subPath: haproxy.cfg
      volumes:
      - name: config
        configMap:
          name: haproxy
global
    log stdout format raw local0 info

defaults
    timeout client 60s
    timeout connect 3s
    timeout server 60s
    timeout check 3s
    maxconn 100

frontend acl
    bind *:10001
    mode tcp
    default_backend acl

frontend rm
    bind *:10002
    mode tcp
    default_backend rm

backend acl
    mode tcp
    option pgsql-check user acl_app
    server pgsql-test-acl pgsql-test-acl:5432 weight 100 check inter 10000

backend rm
    mode tcp
    option pgsql-check user rm_app
    server pgsql-test-rm pgsql-test-rm:5432 weight 100 check inter 10000
Probably something with your config, I'm not able to reproduce (although the server being forwarded to isn't there)
haproxy.cfg
global
    log stdout format raw local0 info

defaults
    timeout client 60s
    timeout connect 3s
    timeout server 60s
    timeout check 3s
    maxconn 100

frontend acl
    bind *:10001
    mode tcp
    default_backend acl

frontend rm
    bind *:10002
    mode tcp
    default_backend rm

backend acl
    mode tcp
    option pgsql-check user acl_app
    server pgsql-test-acl 127.0.0.1:80 weight 100 check inter 10000

backend rm
    mode tcp
    option pgsql-check user rm_app
    server pgsql-test-rm 127.0.0.1:80 weight 100 check inter 10000
$ kubectl create configmap haproxy --from-file=./haproxy.cfg
configmap/haproxy created
$ kubectl apply -f haproxy.yaml
deployment.apps/haproxy created
$ kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
haproxy-868b79979b-7wfzv 1/1 Running 0 28s 10.42.0.13 demo <none> <none>
$ kubectl top po haproxy-868b79979b-7wfzv
NAME CPU(cores) MEMORY(bytes)
haproxy-868b79979b-7wfzv 1m 67Mi
$ kubectl logs haproxy-868b79979b-7wfzv
[NOTICE] 034/181541 (1) : New worker #1 (8) forked
[WARNING] 034/181542 (8) : Server acl/pgsql-test-acl is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 034/181542 (8) : backend 'acl' has no server available!
[WARNING] 034/181546 (8) : Server rm/pgsql-test-rm is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 034/181546 (8) : backend 'rm' has no server available!
Since these repos aren't really a user-help forum, you could also try asking over at the Docker Community Forums, Docker Community Slack, or Stack Overflow.
> Probably something with your config, I'm not able to reproduce (although the server being forwarded to isn't there)
If you cannot reproduce it with my config, obviously there is nothing wrong with my config :-) What flavour/version of Kubernetes were you trying to reproduce in?
On Kubernetes v1.23.1 (kind v0.11.1) the problem persists. And no, my issue is not a question; it's a bug report. HAProxy outside K8s does not display this behavior.
On minikube v1.24.0 (Kubernetes v1.22.3) all is fine: 143Mi memory consumption. Looks like the problem is Kubernetes-implementation-specific.
The point is that we can't reproduce it, even with your exact configuration file, so the chances of it being an issue with the image (and thus a relevant "bug" for this repository) are low.
What we mean by "issue with your config" is larger than just the exact configuration file and includes environmental details like the Kubernetes setup differences you've discovered since.
@tianon what K8s implementation were you trying to reproduce in? Can you please kindly try kind v0.11.1? If you cannot reproduce in kind with a basic 3-node setup either, we can close the ticket.
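For reference, a basic 3-node kind cluster like the one mentioned above can be created from a config file along these lines; the exact node layout here is an assumption, not something stated in the thread:

# kind-3node.yaml (hypothetical filename): one control-plane node and two workers
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker

$ kind create cluster --config kind-3node.yaml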
It's related to kubernetes-sigs/kind#760.
These applications seem to make big up-front memory allocations based on the number of file descriptors they have available.
It seems haproxy is one such application that hits this problem in kind (other better-known examples being NFS and MySQL).
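If that file-descriptor explanation is what is happening here, one possible workaround (a sketch based on HAProxy's documented behaviour, not something confirmed in this thread) is to stop HAProxy from sizing its connection/fd tables off the enormous nofile limit inside a kind node: check the limit the container actually sees, then pin maxconn in the global section so the allocation is based on a fixed connection count instead of being derived from ulimit -n. Note that the maxconn 100 in the defaults section above only caps per-proxy connections; it is the process-wide global maxconn that HAProxy otherwise computes from the file-descriptor limit.

# Check the file-descriptor limit inside the running pod
$ kubectl exec deploy/haproxy -- sh -c 'ulimit -n'

# haproxy.cfg -- set an explicit process-wide maxconn so HAProxy does not
# derive one from the (huge) container nofile limit
global
    log stdout format raw local0 info
    maxconn 1024

Lowering the nofile ulimit that the container runtime passes to pods, as discussed in the linked kind issue, addresses the same root cause from the other side.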