google/flogger

Refactoring SimpleMessageFormatter

cslee00 opened this issue · 4 comments

Hey folks,

Starting to contemplate untangling the various responsibilities in com.google.common.flogger.backend.SimpleMessageFormatter, will fork/branch here (no code created there yet): https://github.com/cslee00/flogger

From an initial review SimpleMessageFormatter has the below responsibilities; expecting that additional responsibilities will be identified as refactoring progresses.

  1. Static utility methods to assist with message formatting;
  2. Message "building" responsibilities inherited from MessageBuilder
  3. Message argument handling responsibilities from ParameterVisitor interface implementation
  4. Creating/formatting contextual information (tags & additional keys)
  5. Dispatch created message to SimpleLogHandler interface
  6. Orchestration of all the above

The tight coupling of these responsibilities prohibits composing solutions that use parts of these capabilities; for example, formatting the message template (w/ arguments) as text but passing all the other data (throwable, tags, etc) along to a back-end to handle.

Goal is to break apart this monolithic class into a few pieces with specific responsibilities such that they can be composed/orchestrated in different manners.

Thoughts / considerations welcome!

Sounds like a plan. Note that SimpleMessageFormatter was never intended to be more than it is. It was always known not to be a great general purpose solution (hence the name) so rather than refactor it, I'm more inclined to just make a better replacement for it. It's a shame all the backends decided to use it (I assume for convenience) rather than considering what the best option for them is.

I'd rather have semi-duplicated code in different backends to handle backend formatting expectations than try and create the perfect all singing & dancing formatter for everyone.

Originally formatting was a "frontend" concern and that was a mistake, so I worked hard to move it out to the backend (since a backend might not even do formatting at all and just ship the data out in a protocol buffer or similar). So whatever improvements we come up with need to not assume that a backend must format things (e.g. by adding format specific code to common backend APIs).

A few well placed static helpers might be fine to share/support long term, but each backend could just live with it's own completely separate formatter as far as I'm concerned.

The parsers were written to handle existing formatting in Google.

Printf format was written with the idea that other people would extend it, but now it seems that that's probably a bad idea. Even if some of the legacy Java format placeholders could have better replacements (e.g. date/time stuff) I'm now convinced that the issues caused by being not 100% compatible with String.format() are not worth it.

I'll leave that there though if someone ever did want to extend it for their own fluent loggers.

The "Brace Style" parser is simplistic and not longer used in Google (it was for awkward cases where a helper class was using {n} style placeholders with a wrapper). I should raise an issue to remove that really since it isn't full MessageFormat syntax and I don't want to maintain it forever.

The parser is far faster than java.util.Formatter for most cases and provides a template which could (in theory) be cached for reuse. However in practice it's not been needed.

The parser API with its callbacks is something I'm expecting to be shared (i.e. take printf message + args and append to some Appendable). I'd want callbacks to be available to support structured logging too. That's about the extent of what I think is worth codifying into an API for this.

Logging tags is painful right now and the code that's there is a hack. Ideally there'd be a Log4J style formatter that can pull out specific tag values and put them in different places. E.g. having a format specifier like

"${timestamp:iso} [${level}]:${tag:status>' '}${message}${tags<' [context='>']'}"

To produce:

2020-03-24:16:52:23.3456 [INFO]:STATUS_TAG This is the formatted message [context=...]
  • Common timestamp formats known
  • Prefix '<' and postfix '>' strings for optional data
  • Clever enough to know that "${tags}" is the remaining tags after any named tags are formatted

Obviously format is very TBD.

Can't really comment on how it's done inside Google I'm afraid.

And unless it wasn't clear, reusing the Log4J2 stuff would also be fine by me if there are no "show stoppers".

https://logging.apache.org/log4j/2.x/manual/layouts.html

Especially the bits about PatternLayout and MDC stuff.