zeek/spicy

Allow to access original field value before `&convert`

Closed this issue · 5 comments

In some situations it could be handy for field hooks to have access to the original value before a &convert takes place.

One example I encountered would be parsing a container:

type Message = unit {
	a: uint8;
	b: uint8;
};

public type PDUs = unit {
	m: (Message &convert=$$.a)[] &until=(orig-$$.b == 0);
};

Another one could be &requires:

public type PDUs = unit {
	m: Message &convert=$$.a &requires=(orig-$$.b == 23);
};

If we really were to introduce something like this, it would definitely require extra syntax, since we wouldn't want the overhead of storing and keeping the old value for every parse involving conversions. It would also require a way to unambiguously refer to the original value if multiple &convert attributes are present (e.g., xs: (uint8 &convert=$$+1)[] &convert=1). My gut feeling is that this is a feature not needed often and not too hard to work around.

As discussed on Slack, for your first use case with &until you could build logic around the container's foreach hook and stop, e.g.:

module foo;

type Message = unit {
    a: uint8;
    b: uint8;
};

public type PDUs = unit {
    var m_done: bool = False; # Flag whether we should stop iterating.
    m: (Message &convert=$$.a)[] foreach {
        if (self.m_done)
            stop;
    }
};

function get_a(m: Message, inout pdus: PDUs): uint8 {
    if (m.b - m.a == 0)
        pdus.m_done = True;
    return m.a;
}

The second use case can be addressed by attaching the &requires to the parsed Message instead of the converted value:

public type PDUs = unit {
    m: (Message &requires=($$.a - $$.b == 23)) &convert=$$.a;
};

Just noticed that my original examples are confusing. By orig-$$ I meant the original $$ value.

Regarding the workaround, I assume get_a should be used in the &convert, right?

m: (Message &convert=get_a($$, self))[] foreach {
  ...

To be honest, instead of using this code, I would rather use a function that loops the container again and does the conversion. That's way easier to follow.

> Regarding the workaround, I assume get_a should be used in the &convert, right?

Yep, correct; I forgot to change that part of the example.
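
With that correction applied, the full workaround might read as follows. This is an untested sketch: it assumes the stop condition is meant to test the original `$$.b` against zero (per the clarification of `orig-$$` above), and it keeps the `inout` parameter style of the earlier example, which may need adjusting depending on the Spicy version.

```spicy
module foo;

type Message = unit {
    a: uint8;
    b: uint8;
};

public type PDUs = unit {
    var m_done: bool = False; # Flag whether we should stop iterating.

    # get_a() sees the original Message before it is replaced by its `a` field.
    m: (Message &convert=get_a($$, self))[] foreach {
        if (self.m_done)
            stop;
    }
};

# Records the stop condition on the unit, then returns the converted value.
function get_a(m: Message, inout pdus: PDUs): uint8 {
    if (m.b == 0)
        pdus.m_done = True;

    return m.a;
}
```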

> To be honest, instead of using this code, I would rather use a function that loops the container again and does the conversion. That's way easier to follow.

If you can store the full vector, that's not a bad approach; what I tried here was something as close to &until as possible, i.e., not introducing another container or looping over the full data. Re: maintainability, the same argument could be made for a dedicated syntax to refer to the original.
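
For comparison, the store-then-convert alternative could look roughly like the sketch below. It is untested, and the field name `msgs` is hypothetical: the unconverted messages are kept in `msgs`, so `&until` can inspect `$$.b` directly, and the converted values are collected into `m` once parsing is done.

```spicy
module foo;

type Message = unit {
    a: uint8;
    b: uint8;
};

public type PDUs = unit {
    # Hypothetical name: holds the original, unconverted messages.
    msgs: Message[] &until=($$.b == 0);

    # Converted values, filled in after parsing has finished.
    var m: vector<uint8>;

    on %done {
        for (msg in self.msgs)
            self.m.push_back(msg.a);
    }
};
```

The trade-off is the one mentioned above: this keeps the full vector of messages around and iterates over it a second time.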

Yeah, I'm also not very gung-ho about this; it would add quite a bit of complexity both conceptually (another reserved/magic identifier of some kind) and internally (scope and lifetime management). Given that there are other ways to approach this, I suggest we wait and see if more use cases come up where this would help considerably.

I'll go ahead and close this as it doesn't seem worth the overhead.