ethercrow/opentelemetry-haskell

Eventlog?

Closed this issue · 8 comments

This seems like a great project, but why are you not using the eventlog to record telemetry information?

It will have much less overhead than instrumenting in Haskell, will work correctly with multiple threads, and will enable correlation with other RTS statistics such as residency.

In combination with ghc-eventlog-socket you can flexibly redirect the eventlog wherever you see fit.

Hi Matthew!

I know basically nothing about the GHC eventlog, so the idea never crossed my mind. I just did 10 minutes of googling and am not really sure what you mean, specifically which part of the library should be replaced with the eventlog.

Do I understand correctly that the entire eventlog API for me as a library author is traceEvent and traceEventIO from Debug.Trace?

If so, is the idea to do roughly this?

  • OpenTelemetry API functions like withSpan and setTag don't cause any bookkeeping in the instrumented process and just call traceEventIO with payloads like beginSpan id=123 operation="foo", setTag k=v on span 123
  • The eventlog is streamed to another process that transforms the data into whatever format the OpenTelemetry backend service expects and sends it there
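
To make the first bullet concrete, here is a minimal sketch using only traceEventIO from Debug.Trace. The SpanId type, the global counter and the exact payload strings are assumptions made up for illustration, not the library's actual API or wire format. The counter is the only bookkeeping left in the instrumented process; everything else would be reconstructed by whatever reads the eventlog.

```haskell
import Control.Exception (bracket)
import Data.IORef (IORef, atomicModifyIORef', newIORef)
import Debug.Trace (traceEventIO)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical span identifier: a plain counter value for this sketch.
newtype SpanId = SpanId Int

-- Global counter that hands out span ids (illustration only).
nextSpanId :: IORef Int
nextSpanId = unsafePerformIO (newIORef 0)
{-# NOINLINE nextSpanId #-}

-- Payload formatting is kept pure so it can be checked without the RTS.
-- The textual format is an assumption modelled on the examples above.
beginSpanMsg :: SpanId -> String -> String
beginSpanMsg (SpanId i) op = "beginSpan id=" ++ show i ++ " operation=" ++ show op

endSpanMsg :: SpanId -> String
endSpanMsg (SpanId i) = "endSpan id=" ++ show i

setTagMsg :: SpanId -> String -> String -> String
setTagMsg (SpanId i) k v = "setTag " ++ k ++ "=" ++ v ++ " on span " ++ show i

-- Emit a begin event, run the action, and emit an end event even if
-- the action throws.
withSpan :: String -> (SpanId -> IO a) -> IO a
withSpan op = bracket begin end
  where
    begin = do
      i <- atomicModifyIORef' nextSpanId (\n -> (n + 1, n))
      let sp = SpanId i
      traceEventIO (beginSpanMsg sp op)
      pure sp
    end sp = traceEventIO (endSpanMsg sp)

-- Attach a key/value pair to a span by emitting one more event.
setTag :: SpanId -> String -> String -> IO ()
setTag sp k v = traceEventIO (setTagMsg sp k v)

main :: IO ()
main = withSpan "foo" $ \sp -> do
  setTag sp "k" "v"
  putStrLn "work happens here"
```

The events only reach the eventlog when the program is built with eventlog support and run with user events enabled (for example `+RTS -l`); otherwise traceEventIO does nothing.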

Or do I misunderstand entirely?

On a very slightly related note: what do you think about using a hardcore gamedev profiler like https://github.com/wolfpld/tracy instead of ThreadScope for realtime profiling?

Tracy has a C API, so it should be possible both to instrument the RTS and to wrap it as a Haskell API.

@ethercrow That is exactly the idea I had in mind. From GHC 8.8 you can also emit ByteStrings directly (see here), so you can put a more structured format into the eventlog.
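
A rough sketch of what a binary payload could look like, assuming GHC 8.8 or later and the traceBinaryEvent# primop from GHC.Exts. The wrapper name traceBinaryEventIO and the message layout (one tag byte, a big-endian 64-bit span id, then the operation name) are invented for illustration and are not a format the project has defined.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import qualified Data.ByteString as B
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Unsafe as BU
import Data.Word (Word64)
import GHC.Exts (Int (I#), Ptr (Ptr), traceBinaryEvent#)
import GHC.IO (IO (IO))

-- Thin wrapper around the primop: hands the ByteString's buffer and
-- length to the RTS, which copies it into the eventlog.
traceBinaryEventIO :: B.ByteString -> IO ()
traceBinaryEventIO bs =
  BU.unsafeUseAsCStringLen bs $ \(Ptr addr, I# len) ->
    IO $ \s -> case traceBinaryEvent# addr len s of
      s' -> (# s', () #)

-- Hypothetical layout for a "begin span" message:
-- tag byte 1, span id as big-endian Word64, then the operation name.
encodeBeginSpan :: Word64 -> B.ByteString -> B.ByteString
encodeBeginSpan spanId op =
  BL.toStrict . BB.toLazyByteString $
    BB.word8 1 <> BB.word64BE spanId <> BB.byteString op

main :: IO ()
main = traceBinaryEventIO (encodeBeginSpan 123 (BC.pack "foo"))
```

A fixed binary layout like this avoids formatting and parsing strings on both sides, at the cost of having to version the format once the reader lives in a separate process.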

I think using existing tooling is a great idea, and why I was interested in looking at the project in the first place.

Cool, I'll try implementing this streaming scheme then.

Maybe I'll rename the implementation I currently have to OpenTelemetry.InProcess or something. I imagine it could still be useful for operational simplicity.

If you need some help then feel free to ping me in the #ghc channel on freenode.

Span overhead in single digit microseconds looks nice.

[Two screenshots, 2020-04-06]

The master branch now uses the eventlog.