enclave-networks/Enclave.FastPacket

Strategy to buffer or queue packets

Closed this issue · 6 comments

First off, this library looks great, thank you for releasing it!

For layer 7 protocol inspectors, retransmission queues, and other use cases, you often need the functionality to queue packets for later reference. As far as I understand due to the restrictions of ref struct, I won't be able to do this directly with FastPacket packets.

What would be the recommended way to store/buffer/queue FastPacket packets for later use?
I assume store a copy of the underlying raw byte[] data buffer of the packet and new up a FastPacket packet each time you want to access the stored packet?

Do you think there is a use case for a Memory<T> backed generator that is not based on ref struct? Perhaps exposed via something like a XYZPacketSpan.ToMemoryPacket()?

I definitely do see a "non-ref struct" way to wrap the decoder somehow.
Without it, you would have to store a reference to the underlying buffer, and then new up the packet each time, as you say.

The only real reason I haven't yet implemented it is that I wasn't sure what the "best" way to create that would be.

There's a few options. First off, there's a very simple approach I've used in the past which is basically just to create a holding type which gives access to a Span property to return the ref struct:

struct EthernetPacket 
{
  private readonly Memory<byte> _buffer;
  
  public EthernetPacket(Memory<byte> buffer)
  {
    _buffer = buffer;
  }

  public EthernetPacketSpan Span => new EthernetPacketSpan(buffer.Span);
}

Simple, and achieves the goal, but does require an extra call to Span when you use it. It does have the (positive?) side-effect of effectively forcing you into accessing the packet as a ref struct early, for more efficient code.

I believe this is why the Memory<byte> type is often just a holding type for accessing an instance of Span<byte>.

However if you're reading the packet in an async method, you cannot declare a local variable for the span, so you end up either adding a local function to interrogate, or calling the Span property repeatedly. It's generally a good idea (I believe) to avoid calling the Memory<byte>.Span property too often, because there's a chunk of code in the Memory type to ensure safe conversion to the Span.


Next option is to expand the properties of the packet directly into the struct, and read each as you access them.

However, we need to consider that most of the BinaryPrimitives.Read* methods we use in the generated decoder want a Span type, so we'd need to convert to that at some point.

One way of doing this might be:

public readonly partial struct EthernetPacket
{
    private readonly Memory<byte> _buffer;

    public EthernetPacket(Memory<byte> buffer)
    {
        _buffer = buffer;
    }

    public HardwareAddress Destination { get => new EthernetPacketSpan(_buffer.Span).Destination; }

    // ...snip
}

I'm not sure I like this though, because it requires repeated access to the .Span property of the Memory<byte> on each property access, so this type becomes less performant.

That said, if you want performance, maybe accessing via a non-reg struct isn't right anyway, and we just state that this way of representing it is slower?


One other option, especially if you want to repeatedly access the same field a lot on a single packet, is to read all the properties at construction time:

public readonly partial struct EthernetPacket
{
    public EthernetPacket(Memory<byte> buffer)
    {
        var span = buffer.Span;

        Destination = new HardwareAddress(span.Slice(0, HardwareAddress.Size));
        Source = new HardwareAddress(span.Slice(0 + HardwareAddress.Size, HardwareAddress.Size));
    }

    public HardwareAddress Destination { get; }

    public HardwareAddress Source { get; }
}

However, that means that:

a. You end up reading all the properties of the structure at the beginning, even if you're only interested in one of them.

b. The size of the struct on the stack will grow quite a bit.


I think it's also possible that we could end up with more than one of these options by default, or a blend of them.

I will say that if you want to store a reference to a packet, and later read a subset of the fields in that packet, with the highest performance possible, the first approach is technically best, but perhaps alternatives would also be useful.

Thoughts on the above @bcronje?

*Disclaimer - I haven't used Span much apart from a library consumer point of view, so my knowledge on the finer details and best practice of Span might not be 100% accurate.

Option 1 is pretty much what I had in mind, what you term a holding type. I like that this forces you to still use the Span packet for any operations on the packet, and if you access it early on there will be limited casts between Memory and Span.
Is there a way to convert from Span<byte> back to Memory<byte> that does not involve a copy? In other words, would it be possible to start off with a Span packet and use that for the fast path, and only if you need to store the packet convert the Span to Memory without involving a copy? Or would you need to start with the holding type?

Out of interest, for Enclave's use case, where does the memory buffer come from? Is it a pointer to unmanage memory that the NIC/driver owns, or do you provide managed buffers to the NIC and wrap span around that?

There isn't a good way to go from Span<byte> to Memory<byte> without a copy that I'm aware of. There are ways through the use of the low-level/unsafe MemoryMarshal class to get back from Span<byte> to the underlying buffer, but those are messy at best. I believe it's sort of intentional on the runtime's part to make it a largely unidirectional flow from Memory->Span.

I suspect you'd start with the holding type, then go from there. So I suspect we end up with EthernetPacket and EthernetPacketReadOnly holding types, with Memory<byte> and ReadOnlyMemory<byte> respectively.

Re:Enclave fabric, basically the latter. In Enclave, the buffers that hold ethernet data are typically managed pooled buffers allocated on the pinned object heap (POH), with unsafe pointers to those buffers provided to the adapter. Real-time analysis of those packets happens directly against those buffers, with no copies. Once you go beyond real-time analysis as the packet is going to or coming from the device, a copy has to happen.

Thank you for the info and insights @enclave-alistair

@bcronje, I'm going to re-open this because I think we should add the described holding types to the library, and this is a handy issue to track that change.

Added Ethernet and IP Packet holding types in 7db3503.