Conversion of `InnerFlowId` to SDT args appears expensive
Closed this issue · 1 comments
We seem to be spending around 13% of Port::process
by calling flow_id_sdt_arg::from
for the benefit of its many dtrace probes -- many times per packet, as we have pre/post-processing and per-layer probes. We should either cache this, or find an adequate #[repr(C)]
for InnerFlowId so that we don't need to perform a conversion.
![image](https://private-user-images.githubusercontent.com/6815381/306712362-9a41e3b0-f5cd-43e1-9c20-75c844f0c033.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjAyMjAwMTEsIm5iZiI6MTcyMDIxOTcxMSwicGF0aCI6Ii82ODE1MzgxLzMwNjcxMjM2Mi05YTQxZTNiMC1mNWNkLTQzZTEtOWMyMC03NWM4NDRmMGMwMzMucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDcwNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDA3MDVUMjI0ODMxWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9ZmJhZDhiYWE2YTQzMDFkNzFjNWVlZjBlZDNlMDA4MmQ3ZDE4N2IwNjFlNWEyMGJmYzI4M2JkMDk5ZmI4MjI5YSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.cNdw4yKufMIRkwnFLBui2QEfkZC28L91Xrx87nBW58Q)
I spent some time noodling on this before now, and returned to it today. Removing all calls to flow_id_sdt_arg
and port_process_return_probe
(#460) gives us ballpark xde_rx processing times like the below client-to-server iperf (using raw output to save time):
Before | After |
---|---|
|
|
So these are very worthwhile (in combination with #460), but I'm seeing some bizarreness in two locations when using either a valid #[repr(C)]
flow ID or e.g. passing a null ptr for the flow_id_arg
to the SDT. Calling either __dtrace_probe_uft__tcp__closed
or __dtrace_probe_flow__expired
causes a kernel panic due to a null-ptr (or other bad address) dereference several functions up (e.g., in update_tcp_entry
when trying to match on &TcpDirection
), but only when these probes are called -- and not on all codepaths. Register values for supposed pointers also contain values like 2
, which smells like an enum discriminant to me. I'll need to look into whether black_box
ing the probe calls will prevent whatever weirdness is happening.