varnish-exporter resource consumption issues
timkante opened this issue · 19 comments
Describe the bug
We are running your latest image as a StatefulSet (without Helm), and manually rebuilt an image from your master branch that includes the varnish-exporter.
We did the following steps to get there:
- Regular deployment of kube-httpcache (see config below)
- We pushed a Docker image that bundles the varnish exporter, which is a 1:1 copy of your version (also see the Dockerfile below)
Everything is seemingly working fine, but the memory and CPU consumption of the exporter container is rising steadily.
We increased its resource limits several times without success; after a couple of days, it always reaches its limits and stops working.
It also does not produce any suspicious logs.
CPU Consumption (we had a short metrics outage on the system, and the drop at the end is a restart of the STS):
Expected behavior
A reasonably constant usage of resources.
Configuration
Our StatefulSet config:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: varnish
  namespace: gatekeeper
  labels:
    app: varnish
spec:
  serviceName: varnish
  # replicas: <KUSTOMIZE>
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: varnish
  template:
    metadata:
      labels:
        app: varnish
    spec:
      imagePullSecrets:
        - name: staffbase-registry
      containers:
        - name: varnish
          image: quay.io/mittwald/kube-httpcache:v0.4.2
          imagePullPolicy: IfNotPresent
          args:
            - -admin-addr=0.0.0.0
            - -admin-port=6083
            - -signaller-enable
            - -signaller-port=8090
            - -frontend-watch
            - -frontend-namespace=$(NAMESPACE)
            - -frontend-service=varnish
            - -frontend-port=8080
            - -backend-watch
            - -backend-namespace=$(NAMESPACE)
            - -backend-service=frontend-cache-service
            - -varnish-secret-file=/etc/varnish/k8s-secret/secret
            - -varnish-vcl-template=/etc/varnish/tmpl/default.vcl.tmpl
            - -varnish-storage=malloc,256M
          ports:
            - containerPort: 8080
              name: http
            - containerPort: 8090
              name: signaller
          # resources:
          #   requests:
          #     memory: <KUSTOMIZE>
          #     cpu: <KUSTOMIZE>
          #   limits:
          #     memory: <KUSTOMIZE>
          #     cpu: <KUSTOMIZE>
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: CACHE_INFO_HEADERS
              # value: <KUSTOMIZE>
            - name: VSM_NOPID
              value: "true"
          readinessProbe:
            httpGet:
              path: /healthz
              port: http
            initialDelaySeconds: 30
          volumeMounts:
            - name: template
              mountPath: /etc/varnish/tmpl
            - name: secret
              mountPath: /etc/varnish/k8s-secret
            - name: var
              mountPath: /var/lib/varnish
          securityContext:
            runAsUser: 100
            runAsGroup: 101
        - name: exporter
          securityContext:
            runAsUser: 100
            runAsGroup: 101
          image: registry.staffbase.com/private/prometheus_varnish_exporter:1.0.0
          imagePullPolicy: IfNotPresent
          args:
            - -web.health-path=/healthz
            - -no-exit
            - -verbose
          env:
            - name: VSM_NOPID
              value: "true"
          ports:
            - name: metrics
              containerPort: 9131
          resources:
            requests:
              memory: 128Mi
              cpu: 200m
            limits:
              memory: 256Mi
              cpu: 400m
          volumeMounts:
            - name: var
              mountPath: /var/lib/varnish
          readinessProbe:
            httpGet:
              path: /healthz
              port: metrics
            initialDelaySeconds: 30
      serviceAccountName: varnish
      restartPolicy: Always
      volumes:
        - name: template
          configMap:
            name: vcl-template
        - name: secret
          secret:
            secretName: varnish-secret
        - name: var
          emptyDir: { }
The exact Docker image we built for the exporter:
FROM debian:stretch-slim

ENV EXPORTER_VERSION=1.6

WORKDIR /

RUN set -x \
    && \
    apt-get -qq update && apt-get -qq upgrade \
    && \
    apt-get -qq install \
        debian-archive-keyring \
        curl \
        gnupg \
        apt-transport-https \
    && \
    curl -Ss -L https://packagecloud.io/varnishcache/varnish60lts/gpgkey | apt-key add - \
    && \
    printf "%s\n%s" \
        "deb https://packagecloud.io/varnishcache/varnish60lts/debian/ stretch main" \
        "deb-src https://packagecloud.io/varnishcache/varnish60lts/debian/ stretch main" \
        > "/etc/apt/sources.list.d/varnishcache_varnish60lts.list" \
    && \
    apt-get -qq update && apt-get -qq install varnish \
    && \
    apt-get -qq purge curl gnupg apt-transport-https && \
    apt-get -qq autoremove && apt-get -qq autoclean && \
    rm -rf /var/cache/*

RUN mkdir /exporter && \
    chown varnish /exporter

ADD --chown=varnish https://github.com/jonnenauha/prometheus_varnish_exporter/releases/download/${EXPORTER_VERSION}/prometheus_varnish_exporter-${EXPORTER_VERSION}.linux-amd64.tar.gz /tmp

RUN cd /exporter && \
    tar -xzf /tmp/prometheus_varnish_exporter-${EXPORTER_VERSION}.linux-amd64.tar.gz && \
    ln -sf /exporter/prometheus_varnish_exporter-${EXPORTER_VERSION}.linux-amd64/prometheus_varnish_exporter prometheus_varnish_exporter

ENTRYPOINT [ "/exporter/prometheus_varnish_exporter" ]
Kubernetes Version: 1.18.15
Thanks for looking into this!
Sorry for the late response. The varnish exporter is a third-party binary that TBH I'm not particularly familiar with.
From other prometheus exporters, I know that metrics churn (i.e. lots of new time series added over time) is often a problem and suspect a similar issue in this case (especially since backend names are updated/rotated frequently with most VCL configurations). Could you have a look at the metrics exported by the exporter (especially after it's been running a while and memory consumption has risen by a significant degree)?
@martin-helmich I did a quick check and noticed the following two things:
Two consecutive scrapes show an update from (one specific entry picked here):
varnish_backend_bereq_bodybytes{backend="k8s-upstreamcfg-1096.be-frontend-cache-service-68598f57cd-ksfl2",server="unknown"} 0
to:
varnish_backend_bereq_bodybytes{backend="k8s-upstreamcfg-1100.be-frontend-cache-service-68598f57cd-ksfl2",server="unknown"} 0
This already indicates that many labels get created by the exporter, as the backend names constantly update due to the config-reloader.
I also noticed that varnish_main_n_backend 6606 is way higher than it should be, so I backed that with a quick Grafana query, which shows the rise of "known backends":
So I guess the amount of labels could indeed be a problem here (not only for the exporter, but also for Prometheus). What do you think?
Can we as a solution maybe change the names that the config-reloader puts in the config to not be unique on every config reload?
> So I guess the amount of labels could indeed be a problem here (not only for the exporter, but also for Prometheus). What do you think?
Yes -- memory consumption of the exporter will increase (approximately) linearly with the number of metrics. AFAIK, Prometheus itself is nowadays a bit more optimized for high metric churn than it used to be.
> Can we as a solution maybe change the names that the config-reloader puts in the config to not be unique on every config reload?
IIRC, the k8s-upstreamcfg-1096 part of k8s-upstreamcfg-1096.be-frontend-cache-service-68598f57cd-ksfl2 (from your example above) is the VCL config name, not the backend name (which is just the Pod name). When reloading the config, the name simply increments, since this is the easiest way to assert that a "new" config is defined and loaded.
I'm just going to dump some thoughts that come into my mind:
A possible "solution" might be to re-use previous VCL config names (however, off the top of my head, I don't know if the vcl.inline command supports overwriting an existing VCL config); in theory, it might be enough to have one active and one inactive config that are updated and loaded alternately. Alternatively, one might be able to convince the maintainer of the exporter itself to add a configuration option for not including the VCL config name in the backend label at all.
However, Pod names will also change frequently and still cause metrics churn, even if the VCL names stay constant. This might (!) be solved by naming VCL backends deterministically (maybe just using indices, instead of their names):
{{ range $i, $b := .Backends }}
backend be-{{ $i }} {
  .host = "{{ .Host }}";
  .port = "{{ .Port }}";
}
{{- end }}
I'll give this issue some more thought, but cannot make any promises as to when I'll find the time to actually put some work into this. In the meantime, any help is welcome.
Is it normal that the config gets reloaded every other minute?
From the looks of your logs, the VCL template is updated quite frequently, which seems to trigger a configuration reload. IIRC, template changes are determined by watching the respective file using inotify.
Can you check if there are actual changes being made to said file? Alternatively, it might be worth adding a check to the controller to only trigger a config reload if the actual contents of the VCL template changed.
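Such a content check could be implemented by remembering a hash of the last template contents and dropping events that don't change it. This is a hedged sketch with hypothetical names, not the actual kube-httpcache code:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// reloadFilter suppresses spurious reloads by remembering the hash of
// the last template contents that actually triggered a reload.
type reloadFilter struct {
	lastHash [32]byte
	seen     bool
}

// shouldReload reports whether the new contents differ from the last
// contents that were forwarded to the reload logic.
func (f *reloadFilter) shouldReload(contents []byte) bool {
	h := sha256.Sum256(contents)
	if f.seen && h == f.lastHash {
		// e.g. an atime-only change or a duplicate fsnotify event
		return false
	}
	f.lastHash = h
	f.seen = true
	return true
}

func main() {
	f := &reloadFilter{}
	fmt.Println(f.shouldReload([]byte("vcl 4.0;"))) // true: first load
	fmt.Println(f.shouldReload([]byte("vcl 4.0;"))) // false: unchanged
	fmt.Println(f.shouldReload([]byte("vcl 4.1;"))) // true: changed
}
```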
@martin-helmich I assume that in the code, there should be a check for the kind of event fsnotify is streaming.
The current implementation is (https://github.com/mittwald/kube-httpcache/blob/master/pkg/watcher/template_watch.go#L17):
func (t *fsnotifyTemplateWatcher) watch(updates chan []byte, errors chan error) {
	for ev := range t.watcher.Events {
		glog.V(6).Infof("observed %s event on %s", ev.String(), ev.Name)

		content, err := ioutil.ReadFile(t.filename)
		if err != nil {
			glog.Warningf("error while reading file %s: %s", t.filename, err.Error())
			errors <- err
			continue
		}

		updates <- content
	}
}
So the file gets reloaded no matter what the event is. This can cause issues if multiple processes access the file and, for example, its access time gets updated.
Maybe it would make sense to add a check like this (from the fsnotify docs at https://github.com/fsnotify/fsnotify#usage):
if ev.Op&fsnotify.Write == fsnotify.Write {
	log.Println("modified file:", ev.Name)
}
For our case, I switched to your poll implementation (by adding -varnish-vcl-template-poll as a program argument), as it explicitly calls os.Stat, checks only the file's ModTime, and reloads if it changed. This results in the expected behaviour: only reloading the file if it actually changed.
And regarding the metrics churn, here is what I found out so far:
The prometheus_varnish_exporter has special handling for the behaviour varnishreload has when reloading VCLs: it names the VCL reload_<TIMESTAMP>. Here is the function of the exporter that discards the VCL name if it is a reloaded version: https://github.com/jonnenauha/prometheus_varnish_exporter/blob/86fc1b025dd41eaf43a583a262daec6cad69b561/prometheus.go#L219
However, to get this running, we cloned your repo and changed https://github.com/mittwald/kube-httpcache/blob/master/pkg/controller/watch.go#L80 to
configname := fmt.Sprintf("reload_%d", time.Now().UnixNano())
This will then correctly be discarded by the exporter:
In addition to that, defining a lot of inline VCLs can be memory-consuming, as varnish keeps the compiled VCLs and never discards them. You can inspect this by running varnishadm vcl.list; for us, it was:
So I changed the run command of the STS container to:
containers:
  - name: varnish
    image: registry.staffbase.com/private/kube-httpcache:0.0.3
    imagePullPolicy: IfNotPresent
    command:
      - /bin/bash
      - -cxe
      - |
        set -eo pipefail
        while true; do
          varnishadm vcl.list \
            | tr -s ' ' \
            | cut -d ' ' -f 4 \
            | head -n -6 \
            | while read in; do varnishadm vcl.discard "$in" &>/dev/null; done
          sleep 60;
        done &
        ./kube-httpcache \
          -admin-addr=0.0.0.0 \
          -admin-port=6083 \
          -signaller-enable \
          -signaller-port=8090 \
          -frontend-watch \
          -frontend-namespace=$(NAMESPACE) \
          -frontend-service=varnish-headless \
          -frontend-port=8080 \
          -backend-watch \
          -backend-namespace=$(NAMESPACE) \
          -backend-service=frontend-cache-service \
          -varnish-secret-file=/etc/varnish/k8s-secret/secret \
          -varnish-vcl-template=/etc/varnish/tmpl/default.vcl.tmpl \
          -varnish-vcl-template-poll \
          -varnish-storage=malloc,256M
    ports:
      - containerPort: 8080
        name: http
      - containerPort: 8090
        name: signaller
This launches a background loop that discards every VCL older than the five most recent ones.
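If this cleanup ever moves into the controller itself, the selection logic is straightforward. The vclsToDiscard helper below is illustrative (it is not a kube-httpcache API) and assumes vcl.list output is ordered oldest-first:

```go
package main

import "fmt"

// vclsToDiscard returns the VCL config names that should be discarded,
// keeping only the `keep` most recent entries. It assumes `names` is
// ordered oldest-first, which matches how `varnishadm vcl.list`
// accumulates reloaded configs over time.
func vclsToDiscard(names []string, keep int) []string {
	if len(names) <= keep {
		return nil
	}
	return names[:len(names)-keep]
}

func main() {
	names := []string{
		"reload_1617000000000000001",
		"reload_1617000000000000002",
		"reload_1617000000000000003",
		"reload_1617000000000000004",
	}
	// Keep the two newest configs; each returned name would then be
	// passed to `varnishadm vcl.discard <name>`.
	fmt.Println(vclsToDiscard(names, 2))
}
```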
Let's see if those steps already resolve our resource issue.
Hey Tim,
thanks for your research in this matter -- that's tremendously helpful and much appreciated!
I think renaming the VCL configs to match the exporter's automatic cleanup mechanism should be a possible change to make without any nasty side effects. And maybe we can also adjust the controller to regularly discard old configurations...
Awesome! I'm in to help you with the changes if you want.
Just some last concerns before we can start:
In case you still want to bundle the metrics-exporter in the docker image, we have to consider an issue that I came across: jonnenauha/prometheus_varnish_exporter#57
It has been fixed on their master branch (which is why I had to fork the repo and build an image for us by hand). So if we make the change so that the exporter auto-discards the reload_ prefix, the last published release of the exporter will produce the errors described in the issue.
What is your suggested way to handle this? Extend the documentation of your service with a section on how to run it with the exporter as a sidecar (and require users to build their own image from its master branch), or build it from source in the Dockerfile?
Or do you even have an alternative option in mind?
Hey everyone!
Since the Helm option to include the exporter is publicly available, do you believe, @martin-helmich, that it's possible to publish a new Docker image version that contains the exporter referenced in the Dockerfile?
For now, enabling the exporter option results in the following error:
Warning Failed 0s (x3 over 23s) kubelet Error: failed to create containerd task: OCI runtime create failed: container_linux.go:370: starting container process caused: exec: "/exporter/prometheus_varnish_exporter": stat /exporter/prometheus_varnish_exporter: no such file or directory: unknown
Indeed it's not in the image:
docker run --entrypoint=bash -it quay.io/mittwald/kube-httpcache:stable
root@c14a649e7891:/# ls /
bin boot dev etc home kube-httpcache lib lib64 media mnt opt proc root run sbin srv sys tmp usr var
Neither the stable nor the v0.5.0 tag contains it (despite the release notes mentioning the addition).
I wouldn't recommend publishing the current state of the Docker image due to the issues I mentioned above. Those should be addressed before releasing the cache + exporter combo into the wild, as I don't consider the current implementation production-ready.
While I share your concerns about introducing a not-yet-ready feature in the stable version of this project, I tend to believe that if the content of the public Docker image is not as advertised in the release and on the master branch, that's even more confusing to newcomers to this project.
Following up on your earlier message, I wonder what the best way to move forward on this could be?
I'm willing to put effort into this as well if necessary.
@thylong I think finding a solution to the question I had above would enable us to move forward here:
> In case you still want to bundle the metrics-exporter in the docker image, we have to consider an issue that I came across: jonnenauha/prometheus_varnish_exporter#57
> It has been fixed on their master (which is why I had to fork the repo and build an image for us by hand), so in case we make the change that the exporter will auto-discard the reload_-prefix, we will run into this case, that the last published release of the exporter will produce the errors described in the issue.
> What is your suggested way for this? Extend the documentation of your service with a section on how to run it with the exporter as sidecar (and require a user to get their own image of its master), or build it from source in the dockerfile?
Do you have some thoughts on this?
VCL reload naming fix
IMHO, we should use your suggested fix of [watch.go#L80](https://github.com/mittwald/kube-httpcache/blob/master/pkg/controller/watch.go#L80):
configname := fmt.Sprintf("reload_%d", time.Now().UnixNano())
It keeps things simple, doesn't alter the current behavior of kube-httpcache at all, and just works like a charm.
VCL list increasing over time
I tend to agree with @martin-helmich on the discard.
Having the watcher discard previous configurations progressively is preferable to running scripts in the background for that duty, since we're in a containerized environment.
Here is a PR suggesting a fix: #95
Prometheus_varnish_exporter outdated
> Do you even have an alternative option in mind?
I was about to suggest reaching out to the maintainer of prometheus_varnish_exporter to request a release that includes jonnenauha/prometheus_varnish_exporter#57, but it seems that has already been done a while ago with no success here...
A second alternative would be to fork the original repo ourselves and publish a release until the maintainer does so. While I'm not usually a fan of forks, this would avoid degrading Docker build times and kube-httpcache maintainability.
Preferred solutions
IMHO, a prometheus_varnish_exporter fork under the mittwald account plus a GitHub release with the fix would keep the governance of this project coherent and help keep the project simple.
If we don't pick this option, I tend to prefer building the exporter from source, as it will ensure the exporter works out of the box for people wanting to play with this project. It's a must-have feature for production environments.
WDYT @timkante @martin-helmich ?
Hey @thylong, thanks for sharing your thoughts on this!
I agree with both fixes which target this project's source (the config renaming and wiring the VCL removal into the application code).
Regarding the exporter release issue, I think we should make another attempt to reach out to the owners and get them to create a release. Maybe we can comment on the existing, open issue, or create a new issue from scratch with a bit more content.
I think we should give this another shot, as the owner of the exporter still seems to be active (judging by his GitHub history).
If there is no reaction, I'd agree with you and prefer the mittwald exporter fork, backed by a from-source build if the owners don't like this idea.
Hey all! Thanks for all your thoughts on this issue -- it took me a while to catch up on this one.
Regarding the VCL naming fix (using fmt.Sprintf("reload_%d", time.Now().UnixNano()) as the VCL name):
Full ACK from me. Happy to accept PRs on this in case anyone wants full attribution in the commit log; otherwise, I might also pick up the proposed change from the comments myself.
Regarding the VCL discarding
Thanks for the PR, @thylong! I'll get on reviewing it just after submitting this comment.
Regarding the excessive VCL reloading mentioned in #81 (comment)
I was happy to note that this was already fixed by another contributor in #90.
Regarding the exporter release
If at all possible, I'd prefer not maintaining our own fork of a third-party repo. Maybe we can take another shot at pinging the varnish-exporter maintainer. As a last resort, I'll look into forking the respective repo.
Regarding the missing exporter binary
I noticed that the "regular" Dockerfile contains the necessary steps for obtaining the exporter binary (see kube-httpcache/build/package/docker/Dockerfile, lines 48 to 52 in 70b91d1).
It's only the Dockerfile used for Goreleaser that's missing the exporter binary (I'm going to have to refresh my memory on why we're using two different Dockerfiles in the first place). I have created #96 as a follow-up issue for adjusting the Dockerfile accordingly (or better, to see if we can get away with using a single Dockerfile instead of two separate ones).
We've got a new release: https://github.com/jonnenauha/prometheus_varnish_exporter/releases/tag/1.6.1