reactor/reactor-netty

Tracing Context not logged when replacing ReactorNettyHttpTracing with standard HttpClient/HttpServer integration

grassehh opened this issue ยท 28 comments

Describe the Bug

Spring Boot 2 with Spring Sleuth -> Spring Boot 3 with Micrometer migration
Hello, I'm aware that as explained in the Javadoc, ReactorNettyHttpTracing will be deprecated in next major version.
As a consequence, I am trying to find an alternative.
As explained here and on the official documentation, it is said that one need to use the built-in metrics of Netty Reactor.

However in a Spring Boot 3 environment with Micrometer, I am not sure of how to do this, but it seems that it is not working correctly.

Steps to Reproduce

  • Clone this project
  • Checkout issues/reactor-netty branch
  • Start the app and call the server as explained in the README

You will see that in the webClient log, there is no baggage logged in the MDC:

-MDC Context: [] - -2023-07-12T19:52:24.119+02:00 -TRACE 42957 --- [ctor-http-nio-4] org.zalando.logbook.Logbook              : {"origin":"local","type":"request","correlation":"896af375872f07c2","protocol":"HTTP/1.1","remote":"google.com/216.58.214.174:443","method":"GET","uri":"https://google.com/","host":"google.com","path":"/","scheme":"https","port":"443","headers":{"accept":["*/*"],"accept-encoding":["gzip"],"b3":["64aee8571dbceb4af15d6f787e96756f-4bba3d9ce759afef-1"],"host":["google.com"],"mybaggagecontroller":["myBaggageControllerValue"],"mybaggagefilter":["myBaggageFilterValue"],"user-agent":["ReactorNetty/1.1.8"]}}

However if you edit ClientConfiguration.httpClientCustomizer method by replacing this code:

it.metrics(true) { uriTagValue: String ->
        uriTagValue //Not sure what to do here ?
}

with this code:

reactorNettyHttpTracing.decorateHttpClient(it)

and do the same test, you will see that all the baggages are now correctly logged (traceId, spanId, myBaggageFitler, myBaggageController):

-MDC Context: [traceId=64aee916ac416098ec3c881a8c596d93, spanId=ba1148154721de18, myBaggageController=myBaggageControllerValue, myBaggageFilter=myBaggageFilterValue] - -2023-07-12T19:55:34.836+02:00 -TRACE 43190 --- [ctor-http-nio-4] org.zalando.logbook.Logbook              : {"origin":"local","type":"request","correlation":"c16054ff4fc614e6","protocol":"HTTP/1.1","remote":"google.com/216.58.214.174:443","method":"GET","uri":"https://google.com/","host":"google.com","path":"/","scheme":"https","port":"443","headers":{"accept":["*/*"],"accept-encoding":["gzip"],"b3":["64aee916ac416098ec3c881a8c596d93-7ce6c146db5e8b18-1"],"host":["google.com"],"mybaggagecontroller":["myBaggageControllerValue"],"mybaggagefilter":["myBaggageFilterValue"],"user-agent":["ReactorNetty/1.1.8"],"X-B3-ParentSpanId":["7ce6c146db5e8b18"],"X-B3-Sampled":["1"],"X-B3-SpanId":["ba1148154721de18"],"X-B3-TraceId":["64aee916ac416098ec3c881a8c596d93"]}}

Expected result

The baggages in tracing context (traceId, spanId and baggages in scope) should be logged in both Server/Client request/responses when using the built-in HttpClient.metricsand HttpService.metricsmethods instead of the deprecated ReactorNettyHttpTracing class.

Thanks for the help !

@grassehh Please provide reproducible example.

@grassehh Please provide reproducible example.

Hello, I updated the whole description and provided a sample. Thanks.

@grassehh Can you try this?

@Configuration
class ClientConfiguration {
    @Bean
    fun httpClientCustomizer() = ReactorNettyHttpClientMapper {
       it.doOnRequest { request, connection ->
           val handler: LogbookClientHandler? = connection.channel().pipeline().get(LogbookClientHandler::class.java)
           if (handler != null) {
               connection.replaceHandler("LogbookClientHandler", TracingChannelDuplexHandler(handler))
           }
       }
    }

    @Bean
    fun webClientCustomizer(clientHttpConnector: ClientHttpConnector) = WebClientCustomizer {
        it.clientConnector(clientHttpConnector)
    }

    @Bean
    fun webClient(webClientBuilder: WebClient.Builder) = webClientBuilder.build()

    class TracingChannelDuplexHandler(private val delegate: ChannelDuplexHandler): ChannelDuplexHandler() {

        override fun channelRead(ctx: ChannelHandlerContext, msg: Any) {
            ContextSnapshot.setAllThreadLocalsFrom(ctx.channel()).use { scope ->
                delegate.channelRead(ctx, msg)
            }
        }

        override fun write(ctx: ChannelHandlerContext?, msg: Any?, promise: ChannelPromise?) {
            ContextSnapshot.setAllThreadLocalsFrom(ctx!!.channel()).use { scope ->
                delegate.write(ctx, msg, promise)
            }
        }
    }
}

This is described in the reference documentation in the section for custom ChannelHandlers
https://projectreactor.io/docs/netty/release/reference/index.html#_access_current_observation_4

Thanks for the answer. I confirm the solution works. I have updated the branch consequently. I have slightly simplified the client customizer by naming the method logbookNettyClientCustomizerso that it overrides Logbook's default customizer, without having to replace the handler.

However I have few more questions/remarks about it:

  • setAllThreadLocalsFrom method is deprecated so I used ContextSnapshoFactory.setThreadLocalsFrom instead
  • Do you think TracingChannelDuplexHandler could be something worth adding into Spring Boot 3, and the decoration of the logbook handlers into Logbook project ?
  • I tried using the same mecanism for the HttpServer, but it does not seem to work. Could it be an issue with the LogbookServerHandler?

The server customizer looks like this :

@Bean
fun logbookNettyServerCustomizer(logbook: Logbook, contextSnapshotFactory: ContextSnapshotFactory) =
    NettyServerCustomizer { server ->
        server.doOnConnection { connection: Connection ->
            connection.addHandlerLast(
                TracingChannelDuplexHandler(
                    LogbookServerHandler(logbook),
                    contextSnapshotFactory
                )
            )
        }
    }

and when calling the app, the request/response logs have empty MDC. Example of request log:

-MDC Context: [] - -2023-07-13T12:59:31.181+02:00 -TRACE 66724 --- [ctor-http-nio-3] org.zalando.logbook.Logbook              : {"origin":"remote","type":"request","correlation":"ad43f64b644c7839","protocol":"HTTP/1.1","remote":"/127.0.0.1:60065","method":"GET","uri":"http://localhost:8080/test","host":"localhost","path":"/test","scheme":"http","port":"8080","headers":{"Accept":["*/*"],"Authorization":["XXX"],"Host":["localhost:8080"],"User-Agent":["curl/7.88.1"]}}

Again, the sample has been updated you can check it out.

setAllThreadLocalsFrom method is deprecated so I used ContextSnapshoFactory.setThreadLocalsFrom instead

The ContextSnapshoFactory.setThreadLocalsFrom is newer functionality, if you use context propagation version that supports it then use what is recommended as replacement for setAllThreadLocalsFrom

Do you think TracingChannelDuplexHandler could be something worth adding into Spring Boot 3

Decorating Netty events with TracingChannelDuplexHandler may affect the performance, because of that in the example I implemented ONLY those that are needed by Logbook. Other solutions might need to implement other events, so IMO there isn't something by default that can be added to Spring Boot.

and the decoration of the logbook handlers into Logbook project ?

Context Propagation library is generic one, it does not depend on any specific project so it is up to Logbook project whether they want to use it.

I tried using the same mecanism for the HttpServer, but it does not seem to work. Could it be an issue with the LogbookServerHandler?

As you can see for the client we don't need anything specific from Reactor Netty, just a custom Netty ChannelHandler.
This is not the case for the server. The issue is the following:

request -> Reactor Netty -> Spring framework creates Observation

response -> Spring framework closes Observation -> Reactor Netty

As you can see everything that is NOT in the scope of the framework cannot see the Observation. This is the case with Logbook.

In this case you will need Reactor Netty <-> Micrometer integration.

I need to check for a fix in Spring and I'll try to come up with a customiser for the server.

So if I understand correctly and sum up, this means that it requires Logbook to implement Reactor Netty <-> Micrometer integration in order to:

  • Decorate both LogbookClientHandler and LogbookServerHandler with a handler similar to you TracingChannelDuplexHandler
  • Be part of the observation scope

So if I understand correctly and sum up, this means that it requires Logbook to implement Reactor Netty <-> Micrometer integration in order to:

  • Decorate both LogbookClientHandler and LogbookServerHandler with a handler similar to you TracingChannelDuplexHandler
  • Be part of the observation scope

mmm not exactly ...
May be if Logbook wants to take part of the Spring Framework server instrumentation then it needs something else than Netty ChannelHandler ... I don't have deep knowledge in Logbook though ...

@grassehh This should be working for the server:

@Configuration
class ServerConfiguration {

    @Bean
    fun logbookNettyServerCustomizer(logbook: Logbook, contextSnapshotFactory: ContextSnapshotFactory) =
        NettyServerCustomizer { server ->
            server.doOnConnection { connection: Connection ->
                connection.addHandlerLast(
                    TracingChannelDuplexHandler(
                        LogbookServerHandler(logbook),
                        contextSnapshotFactory
                    )
                )
            }
                .metrics(true) { uriTagValue: String ->
                    uriTagValue
                }
        }


    @Configuration
    internal class Config {
        @Autowired
        var observationRegistry: ObservationRegistry? = null

        @PostConstruct
        fun setup() {
            ObservationThreadLocalAccessor.getInstance().observationRegistry = observationRegistry!!
            Metrics.observationRegistry(observationRegistry!!)
        }
    }
}

For the uriTagValue

.metrics(true) { uriTagValue: String ->
    uriTagValue
}

The idea is to provide a templating if possible e.g. if the uri is /user/id/123456 you can decide to template it like this /user/id/{n}. See for more information the reference documentation https://projectreactor.io/docs/netty/release/reference/index.html#_metrics_4

In order to avoid a memory and CPU overhead of the enabled metrics, it is important to convert the real URIs to templated
 URIs when possible. Without a conversion to a template-like form, each distinct URI leads to the creation of a distinct tag,
 which takes a lot of memory for the metrics.

I updated the branch with your suggested server implementation.
The traceId and spanId are now correctly logged in both server and client calls.
However the custom baggages myBaggageFilter and myBaggageController are not logged.
Interested fact is that commenting the reactor hook call Hooks.enableAutomaticContextPropagation() makes the custom baggages appear in the server logs, but disappear from the client logs.

I updated the branch with your suggested server implementation.
The traceId and spanId are now correctly logged in both server and client calls.
However the custom baggages myBaggageFilter and myBaggageController are not logged.
Interested fact is than commenting the reactor hook call Hooks.enableAutomaticContextPropagation() makes the custom baggages appear in the server logs, but disappear from the client logs.

@chemicL @marcingrzejszczak PTAL

@grassehh can you please change the project to Java? Let's try fixing one problem at a time and adding Kotlin makes things more complicated. Thank you

@grassehh Closing this issue. If you can provide the requested example in Java, showing the issue, we can reopen it.

Hello @violetagg @marcingrzejszczak
I have pushed a reproducible Java sample on the issues/reactor-netty-java branch.

As you can see, the custom baggages are still not logged in the server request/response logs:

-MDC Context: [traceId=64d76b4342b618d976088eeb28e25736, spanId=76088eeb28e25736] - -2023-08-12T13:21:40.021+02:00 -TRACE 30009 --- [ctor-http-nio-3] org.zalando.logbook.Logbook              : {"origin":"remote","type":"request","correlation":"c1cd060e88cd982a","protocol":"HTTP/1.1","remote":"/127.0.0.1:55665","method":"GET","uri":"http://localhost:8080/test","host":"localhost","path":"/test","scheme":"http","port":"8080","headers":{"Accept":["*/*"],"Authorization":["XXX"],"Host":["localhost:8080"],"User-Agent":["curl/8.1.2"]}}
-MDC Context: [traceId=64d76b4342b618d976088eeb28e25736, spanId=a5644ccf373a5063, myBaggageController=6df0fe48-d37d-4ff0-ade9-92cf2bd1e187, myBaggageFilter=b61464b0-3c63-4932-9b74-3aa0f461a2f7] - -2023-08-12T13:21:40.099+02:00 -DEBUG 30009 --- [     parallel-1] c.g.app.controller.SampleController      : Log from inside the controller. This should contain the baggage called myBaggageController
-MDC Context: [traceId=64d76b4342b618d976088eeb28e25736, spanId=9fac80fd257c7ce6, myBaggageController=6df0fe48-d37d-4ff0-ade9-92cf2bd1e187, myBaggageFilter=b61464b0-3c63-4932-9b74-3aa0f461a2f7] - -2023-08-12T13:21:40.281+02:00 -TRACE 30009 --- [ctor-http-nio-4] org.zalando.logbook.Logbook              : {"origin":"local","type":"request","correlation":"849bf9fd8dc5f3c8","protocol":"HTTP/1.1","remote":"google.com/142.250.179.110:443","method":"GET","uri":"https://google.com/","host":"google.com","path":"/","scheme":"https","port":"443","headers":{"accept":["*/*"],"accept-encoding":["gzip"],"b3":["64d76b4342b618d976088eeb28e25736-9fac80fd257c7ce6-1"],"host":["google.com"],"mybaggagecontroller":["6df0fe48-d37d-4ff0-ade9-92cf2bd1e187"],"mybaggagefilter":["b61464b0-3c63-4932-9b74-3aa0f461a2f7"],"user-agent":["ReactorNetty/1.1.8"]}}
-MDC Context: [traceId=64d76b4342b618d976088eeb28e25736, spanId=9fac80fd257c7ce6, myBaggageController=6df0fe48-d37d-4ff0-ade9-92cf2bd1e187, myBaggageFilter=b61464b0-3c63-4932-9b74-3aa0f461a2f7] - -2023-08-12T13:21:40.312+02:00 -TRACE 30009 --- [ctor-http-nio-4] org.zalando.logbook.Logbook              : {"origin":"remote","type":"response","correlation":"849bf9fd8dc5f3c8","duration":31,"protocol":"HTTP/1.1","status":301,"headers":{"Alt-Svc":["h3=\":443\"; ma=2592000,h3-29=\":443\"; ma=2592000"],"Cache-Control":["private, max-age=2592000"],"Content-Length":["220"],"Content-Security-Policy-Report-Only":["object-src 'none';base-uri 'self';script-src 'nonce-qINUne2hFIUMqhU2jOciHw' 'strict-dynamic' 'report-sample' 'unsafe-eval' 'unsafe-inline' https: http:;report-uri https://csp.withgoogle.com/csp/gws/other-hp"],"Content-Type":["text/html; charset=UTF-8"],"Date":["Sat, 12 Aug 2023 11:21:40 GMT"],"Expires":["Sat, 12 Aug 2023 11:21:40 GMT"],"Location":["https://www.google.com/"],"P3P":["CP=\"This is not a P3P policy! See g.co/p3phelp for more info.\""],"Server":["gws"],"Set-Cookie":["CONSENT=PENDING+224; expires=Mon, 11-Aug-2025 11:21:40 GMT; path=/; domain=.google.com; Secure"],"X-Frame-Options":["SAMEORIGIN"],"X-XSS-Protection":["0"]},"body":"<HTML><HEAD><meta http-equiv=\"content-type\" content=\"text/html;charset=utf-8\">\n<TITLE>301 Moved</TITLE></HEAD><BODY>\n<H1>301 Moved</H1>\nThe document has moved\n<A HREF=\"https://www.google.com/\">here</A>.\r\n</BODY></HTML>\r\n"}
-MDC Context: [traceId=64d76b4342b618d976088eeb28e25736, spanId=76088eeb28e25736] - -2023-08-12T13:21:40.317+02:00 -TRACE 30009 --- [ctor-http-nio-3] org.zalando.logbook.Logbook              : {"origin":"local","type":"response","correlation":"c1cd060e88cd982a","duration":336,"protocol":"HTTP/1.1","status":200,"headers":{"Cache-Control":["no-cache, no-store, max-age=0, must-revalidate"],"content-length":["0"],"Expires":["0"],"Pragma":["no-cache"],"Referrer-Policy":["no-referrer"],"X-Content-Type-Options":["nosniff"],"X-Frame-Options":["DENY"],"X-XSS-Protection":["0"]}}

@grassehh I would suggest to open an issue either in Spring or Micrometer. I don't know how Reactor Netty can help with custom baggages. However if the issue appears to be in Reactor Netty, we can always reopen the issue.

Hey, I've updated Tracing version 1.1.x and 1.2.x with fixes related to baggage recently. Since this week is my Spring Cloud week I can look into checking this next week

Thanks. Should I still open an issue on Spring or somewhere else ? Or can I wait for your feedback next week here ?

I mean you can bump tracing to the latest 1.1.x snapshot and check it for yourself or wait for a week for me to check it

I mean you can bump tracing to the latest 1.1.x snapshot and check it for yourself or wait for a week for me to check it

Hello, I have upgraded the dependencies on the issues/reactor-netty-java branch:
spring-boot 3.1.3
logbook 3.4.0
micrometer-tracing-bom 1.1.5

Baggage still are not logged in the Server request/response logs (only traceId and spanId)

Have you used the baggage API properly? Here you can find an example https://github.com/micrometer-metrics/tracing/blob/main/micrometer-tracing-bridges/micrometer-tracing-bridge-brave/src/test/java/io/micrometer/tracing/brave/bridge/BaggageTests.java#L180-L210

The way I do it (you can check in the issues/reactor-netty-java branch of my sample project) is using the Tracer.createBaggageInScope()method as the Tracer.createBaggage() is deprectated.
I have tried to configure the tracing like in your example with these two lines without success:

ContextRegistry.getInstance().registerThreadLocalAccessor(new ObservationAwareSpanThreadLocalAccessor(tracer));
Hooks.enableAutomaticContextPropagation();

So far like I said, the issue is that the custom bagage (created through Tracer.createBaggageInScope() method), do not show up in the server request/response logs (logged by the LogbookServerHandlerclass). They do show in the controller code and in the WebClientcalls though (logged by LogbookClientHandler).

This used to work well with Spring Sleuth, and it no longer works in Spring Boot 3 w/ Micrometer Tracing, since I replaced the deprecated ReactorNettyHttpTracing class with the standard HttpServer handler mecanism as suggested by violetagg.

Hi, @marcingrzejszczak @violetagg.
After some debugging, I noticed that SimpleObservationRegistry.getCurrentObservation() returns null inside the TracingChannelDuplexHandler.write() call. Not sure if this is normal ? Maybe the server handler is not correctly configured ?

The call stack is as follow:

write:41, TracingChannelDuplexHandler (com.grassehh.app.configuration)
setThreadLocalsFrom:109, DefaultContextSnapshotFactory (io.micrometer.context)
setAllThreadLocalsFrom:125, DefaultContextSnapshotFactory (io.micrometer.context)
setThreadLocal:101, DefaultContextSnapshot (io.micrometer.context)
getValue:32, ObservationThreadLocalAccessor (io.micrometer.observation.contextpropagation)
getValue:109, ObservationThreadLocalAccessor (io.micrometer.observation.contextpropagation)
getCurrentObservation:38, SimpleObservationRegistry (io.micrometer.observation)

the solution with TracingChannelDuplexHandler worked fine in spring boot 3.1* but upgrading it to 3.2.0 broke the solution, now the traceid in webclient is different from others any solution ?

@Ricard-Kollcaku Please provide reproducible example.

the solution with TracingChannelDuplexHandler worked fine in spring boot 3.1* but upgrading it to 3.2.0 broke the solution, now the traceid in webclient is different from others any solution ?

@violetagg I've updated my issue with the same regression observed (reproducer here on branch issues/micrometer-tracing)

@grassehh @Ricard-Kollcaku Can you please test with Spring Framework 6.1.2-SNAPSHOT?

@grassehh @Ricard-Kollcaku Can you please test with Spring Framework 6.1.2-SNAPSHOT?

Confirm this works for me with 6.1.2-SNAPSHOT ! Thanks you !
Any ETA for the 6.1.2 version in Spring Boot ?

@Ricard-Kollcaku I'm closing this one. Reactor Netty version for Spring Boot 3.x versions is one and the same - 1.1.x.
Please try the Spring Framework 6.1.2-SNAPSHOT.