At work, I noticed some throttling in some of our key data services we run in Kubernetes. Although I think the root cause was having too low of a cpu request set on those, I came to learn that there was a kernel level issue making this worse.
I first learned about it here: CFS quotas can lead to unnecessary throttling
I believe this was eventually patched in later kernel versions, but may still affect older systems.
I found a simple test case to check for this, thanks to Ivan Babrou.
This repo contains a Dockerized version of this very helpful post: Overly aggressive CFS
The high level summary here is, we run a test case which has a well defined work period of 5ms and should not experience throttling.
If your kernel is affected by this problem, you'll see:
- When no cpu quotas are used, it runs as expected and the main work loop completes consistently in about 5ms.
- When cpu quotas are used, the main work loop randomly takes around 100ms to complete instead. (And I've seen it in occur in long bursts.)
That's something like a random 20x drop in performance for unclear reasons and this is for a pretty clean example. In more complicated, multiprocess applications like RabbitMQ, I'm not sure how bad this gets.
docker run --rm -it seanshahkarami/cfs-quota-bug-test -iterations 100 -sleep 100ms
docker run --rm -it seanshahkarami/cfs-quota-bug-test -iterations 100 -sleep 1000ms
docker run --rm -it --cpu-quota 20000 --cpu-period 100000 seanshahkarami/cfs-quota-bug-test -iterations 100 -sleep 100ms
docker run --rm -it --cpu-quota 20000 --cpu-period 100000 seanshahkarami/cfs-quota-bug-test -iterations 100 -sleep 1000ms
docker run --rm -it --cpu-quota 20000 --cpu-period 100000 seanshahkarami/cfs-quota-bug-test -iterations 10 -sleep 5000ms
The single job can be run using:
# test without quotas
kubectl apply -f job-without-quota.yaml
# test with quotas
kubectl apply -f job-with-quota.yaml
In a multinode cluster, you can also run the test on every node using:
# test without quotas
kubectl apply -f ds-without-quota.yaml
# test with quotas
kubectl apply -f ds-with-quota.yaml