DataDog/libddwaf

WAF API for gRPC

Julio-Guerra opened this issue · 3 comments

I think it's worth recording here that the current integration of the WAF for gRPC is pretty hacky: in order to trigger the same rules multiple times, we instantiate one WAF context per WAF run. This is because it is the first time we have encountered addresses that can occur multiple times during the lifetime of a WAF context.

A new API should somehow allow running the WAF with new values for addresses that were already sent before.

Note that the current approach also keeps the memory used by the values alive for the lifetime of the WAF context, so this new API should handle that as well.

This hack is currently described in our gRPC RFC at https://datadoghq.atlassian.net/wiki/spaces/APS/pages/2278064284/gRPC+Protocol+Support#Implementation-Milestones

Yeah, we need a way to say "run with the currently stored values, but don't store these new ones". The hack I have for this in Java is that, when running with gRPC addresses, a new context is created with the previously submitted addresses as well, and it is discarded immediately afterwards.

This sounds reasonable enough; however, let's explore the behaviour of the WAF with the different implementations. The context has a set of addresses {a, b, c}, of which one, let's say a, needs to be provided to the WAF repeatedly. For the sake of the example, let's also assume the values don't change.

If you don't recreate the context every time you pass a:

  • The first run would contain {a, b, c}.
    • Assuming there was a match on a, the second time you pass a, there would be no match.

If you recreate the context:

  • The first run would contain {a, b, c}.
    • Assuming there was a match on a, the second time you pass a (in a new context), there would be a match.
    • Assuming you're creating the new context and also passing b and c, if there was a match on either of them, there would be another match even though there was no new data.

With the new API to clear a specific address:

  • The first run would contain {a, b, c}.
    • Assuming there was a match on a, the second time you pass a, there would be a match.
    • There would be no matches on b and c on further runs, regardless of there being a match on the first run. This would be the case unless the new value on a causes further conditions to be processed on b and c as well.
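The three behaviours above can be compared with a small toy model. This is only a sketch under simplifying assumptions: it does not use the libddwaf API, each address has exactly one rule, and `ToyContext`, `clear`, `RULES` and `DATA` are hypothetical names made up for illustration. Here a and b hold matching values and c does not.

```python
# Toy model only: one predicate per address, not the libddwaf API.
class ToyContext:
    def __init__(self, rules):
        self.rules = rules        # {address: predicate}
        self.store = {}           # values kept for the context lifetime
        self.matched = set()      # addresses whose rule already fired

    def run(self, data):
        """Store the new values and return the addresses that newly match."""
        self.store.update(data)
        events = []
        for address in sorted(self.rules):
            if address in self.matched or address not in self.store:
                continue          # cached match, or no value to evaluate
            if self.rules[address](self.store[address]):
                self.matched.add(address)
                events.append(address)
        return events

    def clear(self, addresses):
        # Hypothetical: forget the value and the cached match of each address.
        for address in addresses:
            self.store.pop(address, None)
            self.matched.discard(address)


RULES = {name: (lambda value: value == "attack") for name in "abc"}
DATA = {"a": "attack", "b": "attack", "c": "benign"}


def reuse_context():
    """Same context for both runs: the second pass of a is silent."""
    ctx = ToyContext(RULES)
    first = ctx.run(DATA)
    second = ctx.run({"a": "attack"})
    return first, second


def recreate_context():
    """New context per run, resubmitting b and c: b matches again."""
    first = ToyContext(RULES).run(DATA)
    second = ToyContext(RULES).run(DATA)
    return first, second


def clear_address():
    """Clear only a between runs: a matches again, b and c stay silent."""
    ctx = ToyContext(RULES)
    first = ctx.run(DATA)
    ctx.clear(["a"])
    second = ctx.run({"a": "attack"})
    return first, second
```

In this model the first run always reports a and b. The second run reports nothing when the context is reused, a and b when it is recreated, and only a when a is cleared first.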

To accomplish this last scenario we need to create a new API function, ddwaf_context_clear(<list of addresses>), which will:

  • Clear the relevant addresses from the object store.
  • Clear all cache matches.

However, we need to enforce executing all conditions on new parameters only, except for those which were never processed. This can't be accomplished yet, but some of the performance optimisations I am planning on adding would actually cover this scenario.
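As a rough illustration of that constraint, the sketch below clears the whole match cache but only re-evaluates a condition when its address received a new value or was never processed. It is a toy model, not libddwaf code: `ToyContext`, `processed` and `max_evaluations` are hypothetical, and `max_evaluations` is just a stand-in for anything that leaves some conditions unevaluated during a run.

```python
# Toy model only: one predicate per address, not the libddwaf API.
class ToyContext:
    def __init__(self, rules):
        self.rules = rules        # {address: predicate}
        self.store = {}           # values kept for the context lifetime
        self.matched = set()      # cached matches
        self.processed = set()    # addresses whose condition was evaluated

    def run(self, data, max_evaluations=None):
        """Evaluate only new values and never-processed conditions."""
        self.store.update(data)
        events = []
        evaluations = 0
        for address in sorted(self.rules):
            if address not in self.store or address in self.matched:
                continue
            if address not in data and address in self.processed:
                continue          # old value that was already evaluated
            if max_evaluations is not None and evaluations >= max_evaluations:
                break             # budget spent: the rest stays unprocessed
            evaluations += 1
            self.processed.add(address)
            if self.rules[address](self.store[address]):
                self.matched.add(address)
                events.append(address)
        return events

    def clear(self, addresses):
        """Drop the given addresses from the store and all cached matches."""
        for address in addresses:
            self.store.pop(address, None)
        self.matched.clear()


RULES = {name: (lambda value: value == "attack") for name in "abc"}


def demo():
    ctx = ToyContext(RULES)
    # The budget of 2 leaves the condition on c unprocessed.
    first = ctx.run({"a": "attack", "b": "attack", "c": "attack"},
                    max_evaluations=2)
    ctx.clear(["a"])
    # a is new and c was never processed; b is old and already evaluated.
    second = ctx.run({"a": "attack"})
    return first, second
```

Without the `processed` check, clearing every cached match would make b fire again on the second run even though it carried no new data, which is the duplicate-match problem of the recreate-the-context approach.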

Do ephemeral addresses cover everything you wanted here, @Julio-Guerra?