PostHog/plugin-server

Stream console.log output somewhere

mariusandra opened this issue · 6 comments

To improve the dev experience, calling console.log (and .info, etc.) inside a plugin should send that output somewhere useful.

First thoughts:

  • Do console.log = (...args) => posthog.capture(`${pluginName} console.log`, { args })? Too easy to endlessly loop (see the sketch after this list)...
  • Something on plugin_config in postgres? Too high throughput...
  • Redis?
  • Kafka?
  • Custom nodejs logger service that remembers the last 100 console.log messages per plugin and just periodically flushes the last state to postgres? How to handle multithreading and multiple worker instances?
  • ???
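For the first bullet, here's a minimal sketch of the naive override and the loop hazard it carries: if posthog.capture (or anything it calls) ever logs something itself, e.g. on a network error, the override recurses endlessly. The re-entrancy guard and the declare statements are hypothetical, just to make the sketch self-contained:

```ts
// Hypothetical sketch only: the naive console.log override from the first idea above.
// Assumed to run inside a plugin VM where `posthog` and `pluginName` exist.
declare const posthog: { capture: (event: string, properties?: Record<string, any>) => void }
declare const pluginName: string

const originalConsoleLog = console.log
let capturing = false // re-entrancy guard

console.log = (...args: any[]): void => {
    originalConsoleLog(...args)
    if (capturing) {
        // Something logged while we were already capturing; without this guard,
        // that would recurse endlessly.
        return
    }
    capturing = true
    try {
        posthog.capture(`${pluginName} console.log`, { args })
    } finally {
        capturing = false
    }
}
```

Even with the guard, every console.log call becomes a captured event, which is exactly the volume problem described below.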

TBH Kafka or ClickHouse would be best (in practice just Kafka would do)… But that's EE only.

The problem we're trying to solve is twofold:

  1. A lot of noise in the server logs from random plugins that adds little value
  2. It might be useful to see the console logs in the interface, to check how your plugin is doing (e.g. when did it last sync, etc)

Here's a rough proposal to capture/stream logs without introducing any new technologies:

  • Create a new postgres model PluginConfigLog(plugin_config_id: number, plugin_server_instance_random_unique_id: string, log: jsonb[]), where log = { time: 'ISO', type: 'info', message: 'bla' }[]
  • In each plugin server instance, in each VM, have the console.* methods save the last N = 250 lines (and/or last T0 = 48h?) of the log in a simple buffer (array?)
  • Every T1 = 10 seconds, if needed, upsert the entire buffer of N lines into the PluginConfigLog row (one row per plugin per server instance), discarding all the log entries that were there before (a sketch of this buffer-and-flush logic follows the list)
  • In the frontend, have a super simple log viewer that selects all the relevant PluginConfigLogs for this plugin and merges the log lines together... or even shows them per server if needed
  • Have a cleanup script that periodically deletes all rows older than T0.
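A minimal sketch of the buffer-and-flush part of the proposal, using the N = 250 and T1 = 10 s values from above. The LogEntry shape mirrors the proposed jsonb[] entries; the db.upsertPluginConfigLog helper is hypothetical and stands in for whatever query layer the plugin server uses:

```ts
// Hypothetical sketch: one buffer per plugin VM, flushed periodically into a single
// PluginConfigLog row keyed by (plugin_config_id, plugin_server_instance_random_unique_id).
interface LogEntry {
    time: string // ISO timestamp
    type: 'log' | 'info' | 'warn' | 'error'
    message: string
}

interface Db {
    // Assumed helper: replaces the row's log column with the given entries.
    upsertPluginConfigLog(pluginConfigId: number, serverInstanceId: string, log: LogEntry[]): Promise<void>
}

const MAX_LINES = 250 // N from the proposal
const FLUSH_INTERVAL_MS = 10_000 // T1 from the proposal

class PluginConsoleBuffer {
    private buffer: LogEntry[] = []
    private dirty = false

    constructor(private pluginConfigId: number, private serverInstanceId: string, private db: Db) {
        setInterval(() => void this.flush(), FLUSH_INTERVAL_MS)
    }

    write(type: LogEntry['type'], ...args: any[]): void {
        this.buffer.push({ time: new Date().toISOString(), type, message: args.map(String).join(' ') })
        if (this.buffer.length > MAX_LINES) {
            this.buffer = this.buffer.slice(-MAX_LINES) // keep only the last N lines
        }
        this.dirty = true
    }

    private async flush(): Promise<void> {
        if (!this.dirty) {
            return // nothing new since the last upsert
        }
        this.dirty = false
        // Overwrite the previous log contents for this (plugin config, server instance) pair.
        await this.db.upsertPluginConfigLog(this.pluginConfigId, this.serverInstanceId, this.buffer)
    }
}
```

Each VM would then wire its console methods to one buffer, e.g. console.info = (...args) => buffer.write('info', ...args), and the write volume stays bounded because every instance only ever rewrites a single row of at most N lines.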

Could this work and be simple enough? Or is this over-engineering?

I'd like to avoid writing a new row into postgres per console.log call, as I fear that a nasty plugin that runs thousands of times per second would cause too much unnecessary I/O. This way, if there's anything to write, the amount we write is controlled (at most N = 250 lines × ~100 bytes per line = ~25 KB per insert/update).

This would definitely work, though it sounds kind of dirty… Lots of data handholding to ensure reasonable integrity.
I guess we can't make logs a Cloud/EE feature? Kafka+ClickHouse is a pretty great solution for logs of almost unlimited scale…

I think we should still have some basic logs on OSS as well. So I'd start with writing something very basic and if needed, we can develop a good EE logging solution after.

From a customer Slack chat: it would be great if the plugin log also contained "debug" events like "Plugin loaded", just to reassure users that everything is working as it should.

In addition to that, it might be cool to keep track of how many events a plugin has operated on and show that somewhere in the interface...

Closing since this is only awaiting deployment now.