zeek/spicy

Avoid copy of data for UDP parsers

Closed · 1 comment

While chatting with @rsmmr, one idea came up to avoid copying data for UDP/block analyzers:

    auto input = hilti::rt::reference::make_value<hilti::rt::Stream>(data, size);
    input->freeze();
    if ( ! _parser->parse1 )
        throw InvalidUnitType(
            fmt("unit type '%s' cannot be used as external entry point because it requires arguments",
                _parser->name));
    if ( _parser->context_new ) {
        if ( _context )
            DRIVER_DEBUG("context was provided");
        else
            DRIVER_DEBUG("no context provided");
    }
    hilti::rt::profiler::stop(profiler);
    _resumable = _parser->parse1(input, {}, _context);
    if ( ! *_resumable )
        hilti::rt::internalError("block-based parsing yielded");
    return Done;

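If any iterators into the stream are invalidated once parse1 returns, it seems the stream would not necessarily need to own the data: nothing could still refer to the bytes after that point, so the caller's buffer would only have to stay alive for the duration of the call.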

This might improve performance for the spicy-quic analyzer when crunching through large transfers.

Relates to #1644

My takeaway from #1644 was that introducing a non-owning Chunk adds overhead even in code that does not use it, since it partially undoes (the spirit of) the optimizations done in #1607, so we probably wouldn't want to use that approach here.

Since the Stream is always fully consumed here, we could instead introduce a non-owning Stream for this problem. The naive zeroth implementation could just be a non-owning class derived from Stream which stores a string_view into the data; that would still incur the overhead of creating the base Stream, but might already bring sufficient perf improvements.
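
To make the idea concrete, here is a rough, self-contained sketch of the "derived class holding a string_view" shape. It does not use the real runtime: `ToyStream`, `NonOwningToyStream`, `view()` and `sumBytes()` are hypothetical names invented for illustration, and the virtual `view()` accessor is an assumption of this toy, not something `hilti::rt::Stream` is known to provide.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for an owning stream: copies the payload on construction.
class ToyStream {
public:
    ToyStream() = default;
    ToyStream(const char* data, std::size_t size) : _storage(data, size) {}
    virtual ~ToyStream() = default;

    // Assumed accessor for this sketch; returns the bytes the stream exposes.
    virtual std::string_view view() const { return _storage; }

private:
    std::string _storage; // owned copy of the input
};

// Hypothetical non-owning variant: the (empty) base is still constructed,
// but the payload itself is only borrowed, never copied.
class NonOwningToyStream : public ToyStream {
public:
    NonOwningToyStream(const char* data, std::size_t size) : _borrowed(data, size) {
        assert(data != nullptr || size == 0);
    }

    std::string_view view() const override { return _borrowed; }

private:
    std::string_view _borrowed; // caller must keep the bytes alive
};

// Toy "parser" that fully consumes its input in a single call.
unsigned sumBytes(const ToyStream& stream) {
    unsigned total = 0;
    for ( unsigned char c : stream.view() )
        total += c;
    return total;
}
```

With this shape the caller has to keep `data` alive until the parse call returns, which matches the block-mode case above where the input is frozen and parsed in one go.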