Add ability to show created/evicted conntrack entries
Opened this issue · 1 comments
I'm not sure if this is even possible as I didn't look into implementation of the CT reporting, but would be nice to see evicted CT entries when passing a packet through conntrack is causing it. It's a little bit of a niche use case, but I was debugging it the other day and it wasn't clear from the retis alone what is happening. So, I had two options: bisect the kernel or crowd-source the answer in hope that other people can give me a clue.
Context: https://mail.openvswitch.org/pipermail/ovs-dev/2024-March/412183.html
Example:
28421906307619 [python3] 48409 [tp] unknow:ovs_do_execute_action #19d97da0d38effff98078a2b3800 (skb 18446629757626762976)
if 178 (ovs-p1) rxif 178 10.1.1.2.80 > 10.1.1.1.41034 ttl 64 tos 0x0 id 64862 off 0 [DF] len 1339 proto TCP (6) flags [P.] seq 417102736:417104023 ack 1397946176 win 509
[recirc_id 0x2] exec oport 2
ct_state REPLY tcp (ESTABLISHED) orig [10.1.1.1.41034 > 10.1.1.2.80] reply [10.1.1.2.80 > 10.1.1.240.34568] zone 1
28421906330968 [python3] 48409 [tp] unknow:ovs_do_execute_action #19d97da15899ffff98078a2b4800 (skb 18446629757068805888)
if 176 (ovs-p0) rxif 176 10.1.1.1.41034 > 10.1.1.2.80 ttl 64 tos 0x0 id 25861 off 0 [DF] len 52 proto TCP (6) flags [.] seq 1397946176 ack 417104023 win 501
exec ct zone 1 nat
28421906334541 [python3] 48409 [tp] unknow:ovs_do_execute_action #19d97da15899ffff98078a2b4800 (skb 18446629757068805888)
if 176 (ovs-p0) rxif 176 10.1.1.240.34568 > 10.1.1.2.80 ttl 64 tos 0x0 id 25861 off 0 [DF] len 52 proto TCP (6) flags [.] seq 1397946176 ack 417104023 win 501
exec recirc 0x1
ct_state ESTABLISHED tcp (ESTABLISHED) orig [10.1.1.1.41034 > 10.1.1.2.80] reply [10.1.1.2.80 > 10.1.1.240.34568] zone 1
28421906339108 [python3] 48409 [tp] unknow:ovs_do_execute_action #19d97da15899ffff98078a2b4800 (skb 18446629757068805888)
if 176 (ovs-p0) rxif 176 10.1.1.240.34568 > 10.1.1.2.80 ttl 64 tos 0x0 id 25861 off 0 [DF] len 52 proto TCP (6) flags [.] seq 1397946176 ack 417104023 win 501
[recirc_id 0x1] exec oport 3
ct_state ESTABLISHED tcp (ESTABLISHED) orig [10.1.1.1.41034 > 10.1.1.2.80] reply [10.1.1.2.80 > 10.1.1.240.34568] zone 1
28421906411939 [python3] 48409 [tp] unknow:ovs_do_execute_action #19d97da28900ffff98078a2b4800 (skb 18446629757626762976)
if 178 (ovs-p1) rxif 178 10.1.1.2.80 > 10.1.1.240.34568 ttl 64 tos 0x0 id 64863 off 0 [DF] len 52 proto TCP (6) flags [F.] seq 417104023 ack 1397946176 win 509
exec ct zone 1 nat
28421906419909 [python3] 48409 [tp] unknow:ovs_do_execute_action #19d97da28900ffff98078a2b4800 (skb 18446629757626762976)
if 178 (ovs-p1) rxif 178 10.1.1.2.80 > 10.1.1.1.41034 ttl 64 tos 0x0 id 64863 off 0 [DF] len 52 proto TCP (6) flags [F.] seq 417104023 ack 1397946176 win 509
exec recirc 0x2
ct_state REPLY tcp (FIN_WAIT) orig [10.1.1.1.41034 > 10.1.1.2.80] reply [10.1.1.2.80 > 10.1.1.240.34568] zone 1
28421906424777 [python3] 48409 [tp] unknow:ovs_do_execute_action #19d97da28900ffff98078a2b4800 (skb 18446629757626762976)
if 178 (ovs-p1) rxif 178 10.1.1.2.80 > 10.1.1.1.41034 ttl 64 tos 0x0 id 64863 off 0 [DF] len 52 proto TCP (6) flags [F.] seq 417104023 ack 1397946176 win 509
[recirc_id 0x2] exec oport 2
ct_state REPLY tcp (FIN_WAIT) orig [10.1.1.1.41034 > 10.1.1.2.80] reply [10.1.1.2.80 > 10.1.1.240.34568] zone 1
28421906715556 [wget] 48440 [tp] unknow:ovs_do_execute_action #19d97da728bcffff980787e7d800 (skb 18446629756927422688)
if 176 (ovs-p0) rxif 176 10.1.1.1.41034 > 10.1.1.2.80 ttl 64 tos 0x0 id 25862 off 0 [DF] len 52 proto TCP (6) flags [F.] seq 1397946176 ack 417104024 win 501
exec ct zone 1 nat
28421906719877 [wget] 48440 [tp] unknow:ovs_do_execute_action #19d97da728bcffff980787e7d800 (skb 18446629756927422688)
if 176 (ovs-p0) rxif 176 10.1.1.240.34568 > 10.1.1.2.80 ttl 64 tos 0x0 id 25862 off 0 [DF] len 52 proto TCP (6) flags [F.] seq 1397946176 ack 417104024 win 501
exec recirc 0x1
ct_state ESTABLISHED tcp (LAST_ACK) orig [10.1.1.1.41034 > 10.1.1.2.80] reply [10.1.1.2.80 > 10.1.1.240.34568] zone 1
28421906725459 [wget] 48440 [tp] unknow:ovs_do_execute_action #19d97da728bcffff980787e7d800 (skb 18446629756927422688)
if 176 (ovs-p0) rxif 176 10.1.1.240.34568 > 10.1.1.2.80 ttl 64 tos 0x0 id 25862 off 0 [DF] len 52 proto TCP (6) flags [F.] seq 1397946176 ack 417104024 win 501
[recirc_id 0x1] exec oport 3
ct_state ESTABLISHED tcp (LAST_ACK) orig [10.1.1.1.41034 > 10.1.1.2.80] reply [10.1.1.2.80 > 10.1.1.240.34568] zone 1
28421906751251 [wget] 48440 [tp] unknow:kfree_skb #19d97da728bcffff980787e7d800 (skb 18446629756927422688) drop (NOT_SPECIFIED)
10.1.1.240.34568 > 10.1.1.2.80 ttl 64 tos 0x0 id 25862 off 0 [DF] len 52 proto TCP (6) flags [F.] seq 1397946176 ack 417104024 win 501
28421906763185 [wget] 48440 [tp] unknow:ovs_do_execute_action #19d97da8071cffff980787e7a400 (skb 18446629757021277952)
if 178 (ovs-p1) rxif 178 10.1.1.2.80 > 10.1.1.240.34568 ttl 64 tos 0x0 id 64864 off 0 [DF] len 52 proto TCP (6) flags [.] seq 417104024 ack 1397946177 win 509
exec ct zone 1 nat
28421906765247 [wget] 48440 [tp] unknow:ovs_do_execute_action #19d97da8071cffff980787e7a400 (skb 18446629757021277952)
if 178 (ovs-p1) rxif 178 10.1.1.2.80 > 10.1.1.1.41034 ttl 64 tos 0x0 id 64864 off 0 [DF] len 52 proto TCP (6) flags [.] seq 417104024 ack 1397946177 win 509
exec recirc 0x2
ct_state REPLY tcp (TIME_WAIT) orig [10.1.1.1.41034 > 10.1.1.2.80] reply [10.1.1.2.80 > 10.1.1.240.34568] zone 1
28421906769285 [wget] 48440 [tp] unknow:ovs_do_execute_action #19d97da8071cffff980787e7a400 (skb 18446629757021277952)
if 178 (ovs-p1) rxif 178 10.1.1.2.80 > 10.1.1.1.41034 ttl 64 tos 0x0 id 64864 off 0 [DF] len 52 proto TCP (6) flags [.] seq 417104024 ack 1397946177 win 509
[recirc_id 0x2] exec oport 2
ct_state REPLY tcp (TIME_WAIT) orig [10.1.1.1.41034 > 10.1.1.2.80] reply [10.1.1.2.80 > 10.1.1.240.34568] zone 1
==============================================
>>>>>> A new connection is started here <<<<<<
==============================================
28422716682286 [swapper/9] 0 [tp] unknow:kfree_skb #19d9adee702effff9807af5d4800 (skb 18446629756970724864) drop (UNHANDLED_PROTO)
if 2 (eth0) rxif 2
28422942112924 [wget] 48455 [tp] unknow:ovs_do_execute_action #19d9bb5e12f1ffff9807b6320400 (skb 18446629756985994464)
if 176 (ovs-p0) rxif 176 10.1.1.1.58242 > 10.1.1.2.80 ttl 64 tos 0x0 id 46668 off 0 [DF] len 60 proto TCP (6) flags [S] seq 2268670383 win 64240
exec ct zone 1 nat
28422942119642 [wget] 48455 [tp] unknow:ovs_do_execute_action #19d9bb5e12f1ffff9807b6320400 (skb 18446629756985994464)
if 176 (ovs-p0) rxif 176 10.1.1.1.58242 > 10.1.1.2.80 ttl 64 tos 0x0 id 46668 off 0 [DF] len 60 proto TCP (6) flags [S] seq 2268670383 win 64240
exec recirc 0x1
ct_state NEW tcp (SYN_SENT) orig [10.1.1.1.58242 > 10.1.1.2.80] reply [10.1.1.2.80 > 10.1.1.1.58242] zone 1
28422942125733 [wget] 48455 [tp] unknow:ovs_do_execute_action #19d9bb5e12f1ffff9807b6320400 (skb 18446629756985994464)
if 176 (ovs-p0) rxif 176 10.1.1.1.58242 > 10.1.1.2.80 ttl 64 tos 0x0 id 46668 off 0 [DF] len 60 proto TCP (6) flags [S] seq 2268670383 win 64240
[recirc_id 0x1] exec ct zone 1 nat(src=10.1.1.240:34568) commit
ct_state NEW tcp (SYN_SENT) orig [10.1.1.1.58242 > 10.1.1.2.80] reply [10.1.1.2.80 > 10.1.1.1.58242] zone 1
===================================================================================
>>>>>> And it SNATed to exaclty the same ip:port (10.1.1.240.34568) evicting <<<<<<
>>>>>> the previous TIME_WAIT entry and adding a new SYN_SENT entry. <<<<<<
===================================================================================
28422942132420 [wget] 48455 [tp] unknow:ovs_do_execute_action #19d9bb5e12f1ffff9807b6320400 (skb 18446629756985994464)
if 176 (ovs-p0) rxif 176 10.1.1.240.34568 > 10.1.1.2.80 ttl 64 tos 0x0 id 46668 off 0 [DF] len 60 proto TCP (6) flags [S] seq 2268670383 win 64240
[recirc_id 0x1] exec oport 3
ct_state NEW tcp (SYN_SENT) orig [10.1.1.1.58242 > 10.1.1.2.80] reply [10.1.1.2.80 > 10.1.1.240.34568] zone 1
28422942167792 [wget] 48455 [tp] unknow:ovs_do_execute_action #19d9bb5efd79ffff9807b6327000 (skb 18446629756921123328)
if 178 (ovs-p1) rxif 178 10.1.1.2.80 > 10.1.1.240.34568 ttl 64 tos 0x0 id 0 off 0 [DF] len 60 proto TCP (6) flags [S.] seq 433319049 ack 2268670384 win 65160
exec ct zone 1 nat
28422942170382 [wget] 48455 [tp] unknow:ovs_do_execute_action #19d9bb5efd79ffff9807b6327000 (skb 18446629756921123328)
if 178 (ovs-p1) rxif 178 10.1.1.2.80 > 10.1.1.1.58242 ttl 64 tos 0x0 id 0 off 0 [DF] len 60 proto TCP (6) flags [S.] seq 433319049 ack 2268670384 win 65160
exec recirc 0x2
ct_state REPLY tcp (SYN_RECV) orig [10.1.1.1.58242 > 10.1.1.2.80] reply [10.1.1.2.80 > 10.1.1.240.34568] zone 1
28422942174719 [wget] 48455 [tp] unknow:ovs_do_execute_action #19d9bb5efd79ffff9807b6327000 (skb 18446629756921123328)
if 178 (ovs-p1) rxif 178 10.1.1.2.80 > 10.1.1.1.58242 ttl 64 tos 0x0 id 0 off 0 [DF] len 60 proto TCP (6) flags [S.] seq 433319049 ack 2268670384 win 65160
[recirc_id 0x2] exec oport 2
ct_state REPLY tcp (SYN_RECV) orig [10.1.1.1.58242 > 10.1.1.2.80] reply [10.1.1.2.80 > 10.1.1.240.34568] zone 1
28422942197714 [wget] 48455 [tp] unknow:ovs_do_execute_action #19d9bb5f7439ffff9807b6320400 (skb 18446629756921120512)
if 176 (ovs-p0) rxif 176 10.1.1.1.58242 > 10.1.1.2.80 ttl 64 tos 0x0 id 46669 off 0 [DF] len 52 proto TCP (6) flags [.] seq 2268670384 ack 433319050 win 502
exec ct zone 1 nat
28422942199922 [wget] 48455 [tp] unknow:ovs_do_execute_action #19d9bb5f7439ffff9807b6320400 (skb 18446629756921120512)
if 176 (ovs-p0) rxif 176 10.1.1.240.34568 > 10.1.1.2.80 ttl 64 tos 0x0 id 46669 off 0 [DF] len 52 proto TCP (6) flags [.] seq 2268670384 ack 433319050 win 502
exec recirc 0x1
ct_state ESTABLISHED tcp (ESTABLISHED) orig [10.1.1.1.58242 > 10.1.1.2.80] reply [10.1.1.2.80 > 10.1.1.240.34568] zone 1
28422942203845 [wget] 48455 [tp] unknow:ovs_do_execute_action #19d9bb5f7439ffff9807b6320400 (skb 18446629756921120512)
if 176 (ovs-p0) rxif 176 10.1.1.240.34568 > 10.1.1.2.80 ttl 64 tos 0x0 id 46669 off 0 [DF] len 52 proto TCP (6) flags [.] seq 2268670384 ack 433319050 win 502
[recirc_id 0x1] exec oport 3
ct_state ESTABLISHED tcp (ESTABLISHED) orig [10.1.1.1.58242 > 10.1.1.2.80] reply [10.1.1.2.80 > 10.1.1.240.34568] zone 1
This is a new kernel behavior introduced in https://lore.kernel.org/netdev/20230626064749.75525-7-pablo@netfilter.org/ .
AFAIU, retis just prints out the ct entry attached to the skb. But it would be nice to capture the moment the old CT entry is evicted because of the packet we're tracing right now.
Maybe just having CT events in general in the trace would be helpful, even if they are not related to the currently tracked packet, as long as they do not obscure the output too much, e.g. by subscribing to events via a netlink group or some kernel probes if there are any.
Thanks for the suggestion Ilya, I think we discussed something similar when we were working on the ct
module.
IMHO, what we should avoid is another filtering syntax for ct events or unfiltered events.
At the time, I looked and I don't remember finding any nf_conntrack tracepoint so we could:
- Use
kprobe:ctnetlink_conntrack_event
and hope it doesn't change too much. - Do the filtering in userspace: Keep skb_tracking -> nf_conn data map as we receive nf_conn events, read conntrack events from netlink and filter only on connections that we have already seen. Doing this will surely be less accurate.