Huge CPU usage on deleting/updating decisions
ne20002 opened this issue · 20 comments
I gave 0.0.27 a new try on my BPi-R3.
It seems as if updating the ipset or deleting IPs from it uses a relatively large amount of CPU.
root@BPiR3:/etc/config# cat /var/log/crowdsec-firewall-bouncer.log
time="16-08-2023 18:05:21" level=info msg="backend type : nftables"
time="16-08-2023 18:05:21" level=info msg="nftables initiated"
time="16-08-2023 18:05:21" level=info msg="Using API key auth"
time="16-08-2023 18:05:21" level=info msg="Processing new and deleted decisions . . ."
time="16-08-2023 18:05:30" level=info msg="1250 decisions deleted"
time="16-08-2023 18:05:30" level=info msg="36551 decisions added"
time="16-08-2023 18:11:17" level=info msg="2 decisions deleted"
time="16-08-2023 18:15:22" level=info msg="2 decisions deleted"
time="16-08-2023 18:15:46" level=info msg="2 decisions deleted"
time="16-08-2023 18:20:30" level=info msg="1 decision added"
time="16-08-2023 18:26:31" level=info msg="1 decision deleted"
time="16-08-2023 18:27:29" level=info msg="2 decisions deleted"
The first spikes are at 20:05 (18:05 in the log); the second spikes, at 20:10, relate to deleting just two IPs from the set. The rise in temperature confirms the CPU usage spikes.
System: BananaPi R3 with OpenWrt 23.05.0-rc2 and the OpenWrt crowdsec-firewall-bouncer package 0.0.27.
The system handles 20 Mbit/s of traffic (IPTV) with just 1% CPU over the displayed period. The increase in CPU load and temperature started when the bouncer was started.
Maybe there is some room for optimizations?
Adding bigger batches of IPs (like another 16033 decisions) creates a smaller spike. So I wonder whether two different code paths are used for adding small versus large batches of IPs.
Maybe there is some room for optimizations?
Do you have any recommendations? We already split into smaller batches and only fetch IP-based decisions. Any feedback you can provide we can explore.
EDIT: are you subscribed to any third-party blocklists via the console?
EDIT 2: I realised that the chunking optimisation is in version 0.0.28, and I saw your issue about ARMv8, so I'm guessing you can't test it without that version?
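For context, the splitting into smaller batches mentioned above (and the "failed to flush chunk ..., will retry each one" messages in the logs further down) follows a chunk-then-fallback pattern. Below is a minimal sketch of what such a pattern could look like, assuming the google/nftables Go API; the chunk size, the helper name deleteInChunks and the error handling are illustrative, not the bouncer's actual code.

package sketch

import (
	"log"

	"github.com/google/nftables"
)

// deleteInChunks queues deletions in fixed-size chunks. If a whole chunk
// fails to flush (for example because some elements already expired and
// no longer exist in the set), it retries each element individually.
func deleteInChunks(conn *nftables.Conn, set *nftables.Set, elems []nftables.SetElement, chunkSize int) {
	for start := 0; start < len(elems); start += chunkSize {
		end := start + chunkSize
		if end > len(elems) {
			end = len(elems)
		}
		chunk := elems[start:end]
		if err := conn.SetDeleteElements(set, chunk); err != nil {
			log.Printf("failed to queue chunk: %v", err)
			continue
		}
		if err := conn.Flush(); err != nil {
			// One missing element fails the whole batch, so fall back to
			// deleting one element at a time, best effort.
			log.Printf("failed to flush chunk of %d elements, will retry each one: %v", len(chunk), err)
			for _, e := range chunk {
				if err := conn.SetDeleteElements(set, []nftables.SetElement{e}); err != nil {
					continue
				}
				_ = conn.Flush() // a missing element now only fails its own delete
			}
		}
	}
}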
So I have been doing some testing for a potential bug in crowdsec, and at the same time thought I'd test nftables. I see that most of the CPU usage originates from the nft binary that is called when inserting/updating/deleting, so at this moment it might be the way nftables handles the update that causes the spike.
Update: turning off Prometheus seems to solve the CPU usage for me.
Update 2: it seems the metrics code inspects the chains, which can produce a massive JSON payload, and doing this every 10 seconds was the problem for me... hmmm
Maybe we should do #314 for sanity
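To illustrate why per-interval metrics can be so costly on a small router: dumping a table that holds tens of thousands of set elements as JSON every update interval is a lot of work by itself. A rough sketch of that pattern, assuming the nft CLI is shelled out to; the exact command, the table name and the 10-second interval are taken from this discussion and the config below, and the code is illustrative, not the bouncer's implementation.

package main

import (
	"log"
	"os/exec"
	"time"
)

func main() {
	ticker := time.NewTicker(10 * time.Second)
	defer ticker.Stop()
	for range ticker.C {
		// "nft -j list table ip crowdsec" returns the whole table,
		// set elements included, as one large JSON document.
		out, err := exec.Command("nft", "-j", "list", "table", "ip", "crowdsec").Output()
		if err != nil {
			log.Printf("nft list failed: %v", err)
			continue
		}
		// With tens of thousands of banned IPs this payload can reach
		// several megabytes; generating and parsing it every 10 seconds
		// costs real CPU on hardware like a BPi-R3.
		log.Printf("metrics scrape: %d bytes of JSON", len(out))
	}
}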
Hi
yes, I wanted the ARMv8 build so I can test with my BananaPi R3. I currently have 0.0.27 running. If there are optimizations in 0.0.28, I'd be happy to test it. :)
And I have one IP list subscribed: Tor exit nodes.
Regarding the metrics, this is my config.yaml:
root@BPiR3:~# cat /var/etc/crowdsec/bouncers/crowdsec-firewall-bouncer.yaml
mode: nftables
pid_dir: /var/run/
update_frequency: 10s
daemonize: true
log_mode: file
log_dir: /var/log/
log_level: info
log_compression: true
log_max_size: 100
log_max_backups: 3
log_max_age: 30
api_url: http://cs-lapi:8014/
api_key: *****************************
insecure_skip_verify: true
disable_ipv6: false
deny_action: drop
deny_log: false
supported_decisions_type:
- ban
#to change log prefix
deny_log_prefix: "crowdsec: "
#to change the blacklists name
blacklists_ipv4: crowdsec-blacklists
blacklists_ipv6: crowdsec6-blacklists
#type of ipset to use
ipset_type: nethash
#if present, insert rule in those chains
iptables_chains:
- INPUT
# - FORWARD
# - DOCKER-USER
## nftables
nftables:
  ipv4:
    enabled: true
    set-only: true
    table: crowdsec
    chain: input
  ipv6:
    enabled: true
    set-only: true
    table: crowdsec6
    chain: input
# packet filter
pf:
  # an empty value disables the anchor
  anchor_name: ""
prometheus:
  enabled: false
  listen_addr: 127.0.0.1
  listen_port: 60601
Hi @ne20002,
as I said in the other ticket, the arm64 version should work for v8. I don't recall specific performance changes in the nftables handling between 0.0.27 and 0.0.28, but there was a refactoring and dependency updates, so I'd be interested to know how it works for you.
Never mind, I see now the note about Prometheus. The next -rc4 will disable the metrics by default as well.
Their configuration already had it disabled, so it wouldn't be a driving factor; however, I am also interested in the results from 0.0.27 -> 0.0.28.
I'll give it a try tomorrow.
Also, OpenWrt 23.05.0-rc3 has been announced with an nftables update.
For those interested in performance info: this is a BananaPi R3 with the crowdsec-firewall-bouncer 0.0.27 OpenWrt package on OpenWrt 23.05.0-rc2, showing the last two hours:
time="22-08-2023 06:09:14" level=info msg="1 decision added"
time="22-08-2023 06:12:19" level=info msg="1 decision deleted"
time="22-08-2023 06:12:47" level=info msg="1 decision deleted"
time="22-08-2023 06:13:17" level=info msg="1 decision deleted"
time="22-08-2023 06:34:33" level=info msg="26 decisions deleted"
time="22-08-2023 06:34:57" level=info msg="16105 decisions added"
time="22-08-2023 06:37:28" level=info msg="1 decision deleted"
time="22-08-2023 06:38:28" level=info msg="1 decision deleted"
time="22-08-2023 07:08:05" level=info msg="1 decision added"
time="22-08-2023 07:08:52" level=info msg="1 decision deleted"
time="22-08-2023 07:34:37" level=info msg="failed to flush chunk of 25 elements, will retry each one: conn.Receive: netlink receive: no such file or directory"
time="22-08-2023 07:34:47" level=info msg="25 decisions deleted"
time="22-08-2023 07:54:20" level=info msg="1 decision added"
time="22-08-2023 07:57:13" level=info msg="1 decision added"
time="22-08-2023 08:01:01" level=info msg="1 decision added"
Times are offset by 2h due to the timezone.
And this is now with 0.0.28-rc3:
time="22-08-2023 08:41:20" level=info msg="Starting crowdsec-firewall-bouncer v0.0.28-rc3-5afbe527ad361e578fb1bdf523a2c34c31258149"
time="22-08-2023 08:41:20" level=info msg="backend type : nftables"
time="22-08-2023 08:41:20" level=info msg="nftables initiated"
time="22-08-2023 08:41:20" level=info msg="Using API key auth"
time="22-08-2023 08:41:20" level=info msg="Processing new and deleted decisions . . ."
time="22-08-2023 08:41:35" level=info msg="1110 decisions deleted"
time="22-08-2023 08:41:42" level=info msg="38361 decisions added"
time="22-08-2023 08:50:57" level=info msg="1 decision added"
time="22-08-2023 08:57:08" level=info msg="1 decision deleted"
time="22-08-2023 08:58:23" level=info msg="1 decision added"
time="22-08-2023 09:07:45" level=info msg="1 decision added"
time="22-08-2023 09:18:01" level=info msg="1 decision deleted"
time="22-08-2023 09:29:28" level=info msg="1 decision deleted"
time="22-08-2023 09:34:35" level=info msg="failed to flush chunk of 15 elements, will retry each one: conn.Receive: netlink receive: no such file or directory"
time="22-08-2023 09:34:55" level=info msg="15 decisions deleted"
time="22-08-2023 09:40:11" level=info msg="1 decision deleted"
time="22-08-2023 09:51:04" level=info msg="1 decision deleted"
time="22-08-2023 09:57:34" level=info msg="1 decision deleted"
time="22-08-2023 10:00:14" level=info msg="1 decision added"
time="22-08-2023 10:17:28" level=info msg="1 decision added"
time="22-08-2023 10:23:07" level=info msg="1 decision added"
Hi @ne20002
thanks for the testing. It looks like there is a regression, but I can't explain it from the changes between the two versions. The biggest change was from rc1 to rc2, where we updated the google/nftables library. We should see if there's a difference in behavior between those.
We are preparing a test bench, but we don't have machines with 2 GB; the smallest is a Pi4 with 4 GB.
From what I see, adding a huge number of IPs to an (initially empty) set does not seem to consume more processing than just adding or deleting single IPs.
I also see the bigger spike at 10:57 (and 10:58). According to the log there are two insertions/deletions of a single IP close together. Is it possible that update operations on the set are done in parallel, causing extra processing overhead?
Maybe this one shows it more clearly:
time="22-08-2023 15:34:47" level=info msg="13 decisions deleted"
time="22-08-2023 16:17:07" level=info msg="1 decision added"
time="22-08-2023 16:34:33" level=info msg="20 decisions deleted"
time="22-08-2023 16:34:40" level=info msg="14994 decisions added"
time="22-08-2023 17:17:07" level=info msg="1 decision deleted"
Updating more decisions causes less processor usage. It's kind of interesting...
Hi @ne20002
I have a crowdsec binary that fixes the issue for me by raising the batch size for decision operations. I can send you the link to test. ARMv8, right?
Also, if I understand correctly you are the maintainer of the OpenWrt port; there are a couple of changes required for building v1.5.3 that we should discuss, but not in this repository.
As written on discourse.crowdsec.net, I don't run the local API on ARM but in a podman pod on amd64, with an image from hub.docker.com.
Shall I use a dev image to test this?
Hi!
No need to test anymore; the changes have been merged for 1.5.3, which is out next week. So if it improves performance, good; otherwise we'll see for 1.5.4.
Hi @mmetc
I'm now running 1.5.4 and 0.0.28-rc5, but the CPU spikes have only slightly reduced. Anyway, it is not really a problem (processing does consume resources, after all).
I also still have a few entries in the log like this:
time="22-09-2023 07:53:41" level=info msg="failed to flush chunk of 20 elements, will retry each one: conn.Receive: netlink receive: no such file or directory"
As I understand it, this happens when you try to delete entries that have already been purged by nftables, since all those entries are added with a TTL. I just had the thought of maybe switching from removing an IP entry to re-adding the entry with a really small TTL (like one second), as a way to get rid of the warnings (and of the need for a second pass that then processes them one by one).
It relies on the idea that if an entry is in the set with a higher TTL, the TTL will be reduced when you re-add it with a smaller one.
Maybe it's worth a try?
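A minimal sketch of the re-add-with-short-TTL idea, assuming the google/nftables Go API and the table/set names from the config above; the helper name expireSoon is made up for illustration, and the assumption that re-adding an existing element shortens its remaining TTL would need to be verified against the kernel and set configuration.

package main

import (
	"log"
	"net"
	"time"

	"github.com/google/nftables"
)

// expireSoon re-adds an already-banned IP with a 1-second timeout instead
// of deleting it, so nftables expires it on its own shortly afterwards.
func expireSoon(ip net.IP) error {
	conn := &nftables.Conn{}
	table := &nftables.Table{Family: nftables.TableFamilyIPv4, Name: "crowdsec"}
	set, err := conn.GetSetByName(table, "crowdsec-blacklists")
	if err != nil {
		return err
	}
	elem := []nftables.SetElement{{
		Key:     ip.To4(),
		Timeout: time.Second, // assumption: this replaces the element's remaining TTL
	}}
	if err := conn.SetAddElements(set, elem); err != nil {
		return err
	}
	return conn.Flush()
}

func main() {
	// 192.0.2.1 is a documentation address used purely as an example.
	if err := expireSoon(net.ParseIP("192.0.2.1")); err != nil {
		log.Fatal(err)
	}
}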
That's exactly what happens.
Really? You are actually not removing entries from the set but instead re-adding them with a small TTL? Why would that sometimes fail and require a one-by-one retry?