Structured logging (JSON)
Is your feature request related to a problem? Please describe.
It's pretty common these days to emit structured logs instead of plain-text logs. They're easier to parse, and some cloud environments already have support for parsing and indexing them. My problem is mainly with GKE: plain-text logs written to stderr/stdout are logged with a severity of ERROR by default, which is not great as it adds a lot of noise.
Describe the solution you'd like
The ability to switch to JSON logging, where a severity field decides the level of the logged message. Example logged message:
{
  "severity": "INFO",
  "message": "Hello world.",
  "time": "2020-10-12T07:20:50.52Z"
}
@levkk do we need to keep using env_logger? This is a good use case for tracing-subscriber. As the examples below show, we could display logs using its existing formatters. Both the text and pretty formatters have color support.
Let me know your thoughts so I can send the PR.
text => the current output, using the full formatter
2023-07-19T05:06:43.430838Z INFO pgcat: Welcome to PgCat! Meow. (Version 1.0.2-alpha3)
2023-07-19T05:06:43.447918Z INFO pgcat: Running on 0.0.0.0:6432
2023-07-19T05:06:43.447940Z INFO pgcat::config: Ban time: 60s
2023-07-19T05:06:43.447945Z INFO pgcat::config: Idle client in transaction timeout: 0ms
2023-07-19T05:06:43.447950Z INFO pgcat::config: Worker threads: 5
We could use a custom format to implement the current format if needed.
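A minimal sketch of how the switch could look, assuming the tracing and tracing-subscriber crates with the "json" and "env-filter" features enabled; the json_logging flag name is just an illustration, not an actual pgcat config key:

```rust
// Sketch: pick the JSON or the default (full) formatter at startup.
// Assumes tracing-subscriber with the "json" and "env-filter" features;
// `json_logging` is a hypothetical config option, not pgcat's real one.
use tracing_subscriber::{fmt, EnvFilter};

fn init_logging(json_logging: bool) {
    // Respect RUST_LOG if set, otherwise default to info.
    let filter = EnvFilter::try_from_default_env()
        .unwrap_or_else(|_| EnvFilter::new("info"));

    if json_logging {
        // One JSON object per line, like the structured samples below.
        fmt().with_env_filter(filter).json().init();
    } else {
        // The `full` formatter, close to the text output above.
        fmt().with_env_filter(filter).init();
    }
}
```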
structured => using the json formatter
{"timestamp":"2023-07-19T05:08:43.827969Z","level":"INFO","fields":{"message":"Welcome to PgCat! Meow. (Version 1.0.2-alpha3)","log.target":"pgcat","log.module_path":"pgcat","log.file":"src/main.rs","log.line":88},"target":"pgcat"}
{"timestamp":"2023-07-19T05:08:43.842202Z","level":"INFO","fields":{"message":"Running on 0.0.0.0:6432","log.target":"pgcat","log.module_path":"pgcat","log.file":"src/main.rs","log.line":149},"target":"pgcat"}
{"timestamp":"2023-07-19T05:08:43.842238Z","level":"INFO","fields":{"message":"Ban time: 60s","log.target":"pgcat::config","log.module_path":"pgcat::config","log.file":"src/config.rs","log.line":987},"target":"pgcat::config"}
{"timestamp":"2023-07-19T05:08:43.842252Z","level":"INFO","fields":{"message":"Idle client in transaction timeout: 0ms","log.target":"pgcat::config","log.module_path":"pgcat::config","log.file":"src/config.rs","log.line":988},"target":"pgcat::config"}
{"timestamp":"2023-07-19T05:08:43.842266Z","level":"INFO","fields":{"message":"Worker threads: 5","log.target":"pgcat::config","log.module_path":"pgcat::config","log.file":"src/config.rs","log.line":992},"target":"pgcat::config"}
debug => using the pretty formatter
2023-07-19T05:09:59.427259Z INFO pgcat: Welcome to PgCat! Meow. (Version 1.0.2-alpha3)
at src/main.rs:88
2023-07-19T05:09:59.441273Z INFO pgcat: Running on 0.0.0.0:6432
at src/main.rs:149
2023-07-19T05:09:59.441295Z INFO pgcat::config: Ban time: 60s
at src/config.rs:987
2023-07-19T05:09:59.441302Z INFO pgcat::config: Idle client in transaction timeout: 0ms
at src/config.rs:988
2023-07-19T05:09:59.441308Z INFO pgcat::config: Worker threads: 5
at src/config.rs:992
our current log format (for comparison)
[2023-07-19T05:18:51.976008Z INFO pgcat] Welcome to PgCat! Meow. (Version 1.0.2-alpha3)
[2023-07-19T05:18:51.990326Z INFO pgcat] Running on 0.0.0.0:6432
[2023-07-19T05:18:51.990348Z INFO pgcat::config] Ban time: 60s
[2023-07-19T05:18:51.990353Z INFO pgcat::config] Idle client in transaction timeout: 0ms
[2023-07-19T05:18:51.990357Z INFO pgcat::config] Worker threads: 5
Initial PR is here: #517
This is pretty awesome, although I'm also quite a bit annoyed by it. Fun story: as this change rolled out, it broke my production infra!
So. The culprit here is the warning-level message pgcat emits when it discards a session back to the pool because "SET" was called, along with a few other warning-level logs that happen... a lot.
So. If it's, say, a PHP-based backend that always calls "SET" to make sure, e.g., the BYTEA output format in Postgres is canonical, and there are a few million calls to those functions every few minutes, we get a lot of log lines. Almost all identical, for a behaviour we don't really care much about. (A discard is cheaper than a TCP+TLS connection; I'm cool with that.)
But here the log filtering via the RUST_LOG variable changed, so the lines my filters used to ignore are now logged. And, well, that ate my log quota.
And that's the story of why logging is bad you should all go back to spooling logs to an unbuffered line printer instead.
( ps. how the bleep do I filter out specific spammy warning-level messages now? )
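(Not an authoritative answer, just a sketch: tracing-subscriber's EnvFilter accepts the same per-target `target=level` directives env_logger did, via RUST_LOG or in code, so you can lower the level for one noisy module while keeping INFO elsewhere. The module path below is a placeholder; the `target` field of the offending log line shows the real one to silence.)

```rust
// Sketch: per-target filter directives, the same syntax RUST_LOG accepts
// (e.g. RUST_LOG="info,pgcat::pool=error"). `pgcat::pool` is a placeholder;
// check the `target` field of the spammy warning for the actual module path.
use tracing_subscriber::EnvFilter;

fn main() {
    // INFO globally, but only ERROR and above from the hypothetical noisy module.
    let filter = EnvFilter::new("info,pgcat::pool=error");
    tracing_subscriber::fmt()
        .with_env_filter(filter)
        .init();
}
```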
Tell me about it... I go through my Sentry quota in two days and just sit there wondering if I should give them more money or run my own error-reporting tool.