hypertrace/javaagent

✨ Add support for filtering requests to APIS defined with JAX-RS

Opened this issue · 6 comments

Use Case

We should support filtering requests made to APIs defined with the JAX-RS API, just as we do for the Java EE Servlet API.

Proposal

I think it will likely make sense to leverage similar strategies for instrumenting JAX-RS resources as is done by OpenTelemetry (in these modules). We'll likely be able to do the bulk of the work according to the JAX-RS API with minor tweaks here and there for each JAX-RS implementation we want to support.

Questions to address (if any)

The specific use case I'm trying to help solve is a Jersey JAX-RS web application running on Jetty leveraging Dropwizard. It's possible it could make sense to add support for this use case by targeting Jetty internals like we do for Netty, which is another solution worth considering

@pavolloffay I'm taking a look at the dropwizard Jersey/Jetty use case! I hear you were looking at this a while back, so if you've got any opinions on where I should start out let me know!

What exactly do you mean by fitlering? Do you mean disabling tracing for predefined endpoints/URL patterns?

Sorry, I should have been more clear! I mean supporting the reporting of the request body/headers and blocking a request using an implementation of org.hypertrace.agent.filter.api.Filter to an API endpoint defined with JAX-RS.

I did some exploratory testing of this and found some surprising results. Using a Dropwizard sample application, we report the request/response bodies, headers, and can block requests using a org.hypertrace.agent.filter.api.Filter implementation. I was really surprised by this, but it turns out the dropwizard-jersey module registers a javax.servlet.Filter implementation as part of the request handling chain called AllowedMethodsFilter. The Hypertrace Servlet30AndFilterInstrumentation transforms the AllowedMethodsFilter and allows us to capture the data we need and allow for a org.hypertrace.agent.filter.api.Filter implementation to consider blocking the request.

However, it is not necessary to run Jersey JAX-RS application on Jetty with a Java EE Servlet filter. If you build a simple Jersey JAX-RS application running on Jetty but without using Dropwizard, we won't report the request/response bodies and headers and the org.hypertrace.agent.filter.api.Filter implementation is never asked to evaluate the request headers. We'd need some additional instrumentation to make these things work for this use case

Usually, JAX-RS runs on top of the servlet implementation (or netty for reactive ones).

However, it is not necessary to run Jersey JAX-RS application on Jetty with a Java EE Servlet filter. If you build a simple Jersey JAX-RS application running on Jetty but without using Dropwizard, we won't report the request/response bodies and headers and the org.hypertrace.agent.filter.api.Filter implementation is never asked to evaluate the request headers.

This is surprising, I guess it also depends on how the application is bootstrapped because Jersey for sure offers servlet integration. If what you are describing is true then Jersey implements directly Jetty hander/API.

There might be different solutions to this:

  • instrument servlet container (Jetty)
  • instrument server-side JAX-RS

in both of these scenarios you have to be sure that instrumentations on multiple levels do not interfere (e.g. either servlet or JAX-RS instrumentation runs). I would go with the second option for now as it provides better coverage.

Note that OTEL agent instruments servlet containers/APP server (e.g. to create events for unmatched requests) and in the past there was a bug with Undertow - the undertow instrumentation finished early span causing our instrumentation to be noop. A finished span is not recording events and in this case our instrumentation is noop. Before proceeding with implementing additional instrumentation I would run Java.lang.Thread.dumpStack() to see the stacktrace and make sure that no servlet is executed.

This is surprising, I guess it also depends on how the application is bootstrapped because Jersey for sure offers servlet integration. If what you are describing is true then Jersey implements directly Jetty hander/API.

Yes! There are two ways, as I understand it, to run Jersey on Jetty

I was referring to the jersey-container-jetty-http artifact just because that's how I personally have run Jersey JAX-RS applications using the JettyHttpContainerFactory to bootstrap the application.

I'm leaning towards instrumenting the HTTP servers/containers because I think we should be able to leverage some existing instrumentation from OTEL and the alternative would probably mean writing instrumentation for each JAX-RS implementation we come across.

I'm leaning towards instrumenting the HTTP servers/containers because I think we should be able to leverage some existing instrumentation from OTEL and the alternative would probably mean writing instrumentation for each JAX-RS implementation we come across.

This is not how JAX-RS works. You should be able to write single JAX-RS instrumentation and use it for all implementations. The span lifecycle (creation, finish, error capture) is already handled by OTEL server instrumentations (e.g. jetty in this case).

As I mentioned before the direct server instrumentation might have only a single advantage - capturing unmapped requests (e.g. requests that target URL outside of the application context path). However, there is pre-matching discriminator in JAX-RS that might solve this https://www.logicbig.com/tutorials/java-ee-tutorial/jax-rs/filters-pre-matching.html.

Also, have a look at Smallrye JWT how handles blocking in JAX-RS filter: https://github.com/smallrye/smallrye-jwt/blob/main/implementation/jwt-jaxrs/src/main/java/io/smallrye/jwt/auth/jaxrs/JWTAuthenticationFilter.java#L82