PostHog/plugin-repository

Plugin request: Postgres

timgl opened this issue · 7 comments

timgl commented

I want to dump all my people and events into a Postgres database so I can use Metabase to run queries.

Was this plugin requested by someone? What's the urgency? Will a posthog -> posthog export be good enough, like we have now with the replicator plugin?

The trouble is, if we expose pg for a plugin to use, we might be opening ourselves up to security issues. The connections will be made from our VPC, so they could theoretically find their way to sensitive data.

Are there any ways we can get around this without seriously complicating the network setup? Is this something we should be worried about?

CC @macobo @fuziontech

timgl commented

@mariusandra Yeah, it was requested by someone on cloud who wanted to do their own analysis in Metabase. Haven't heard from them in a while, so it's probably not top urgency (especially as our focus is shifting to self-hosted).

> Will a posthog -> posthog export be good enough, like we have now with the replicator plugin?

I think something that works out of the box would be nicer, but this could work for now if anyone asks.

I would say the best way for them to load Postgres would be from an S3 dump. Having that gap would be more secure, and the database would be less likely to fall over under the event volume.

As for Metabase, the best suggestion is for them to spin up a small Redshift cluster and use that to wrap the data in S3. Loading data into PG is just a bad pattern IMO.

OK, so I've now run into this barrier. I'm building a Redshift plugin and wanted to use pg to access it.

There's probably some way to get it done via HTTP for Redshift, but not for any random Postgres instance, I'd assume. So it'd be great to be able to use the package.

Of course, as a general rule an S3 dump might be best (we could then also use COPY instead of INSERT), but I'd love to find ways to make plugins easy to use (i.e. you shouldn't need another service just to export your data).
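
To make the COPY vs INSERT trade-off concrete, here is a rough sketch of the two statements a plugin might end up issuing. Everything here is an assumption for illustration: the `ExportedEvent` shape, the column names, and the helper names `buildBatchInsert` and `buildRedshiftCopy` are hypothetical and not part of any existing plugin. The INSERT builder uses the `$1, $2, ...` placeholder style that pg accepts; the COPY builder only assembles a Redshift `COPY ... FROM 's3://...'` string and assumes the files were already written to S3 as gzipped JSON.

```typescript
// Hypothetical event shape; the real plugin payload may differ.
interface ExportedEvent {
    uuid: string
    event: string
    distinctId: string
    timestamp: string
    properties: Record<string, unknown>
}

interface ParameterizedQuery {
    text: string
    values: unknown[]
}

// Build one multi-row INSERT with $1, $2, ... placeholders, so a whole
// batch of events costs a single round trip instead of one per event.
// The table name is interpolated directly (identifiers cannot be bound as
// parameters), so it must come from trusted config, never from event data.
function buildBatchInsert(table: string, events: ExportedEvent[]): ParameterizedQuery {
    if (events.length === 0) {
        throw new Error('buildBatchInsert needs at least one event')
    }
    // Assumed column names for illustration only.
    const columns = ['uuid', 'event', 'distinct_id', 'occurred_at', 'properties']
    const values: unknown[] = []
    const rows = events.map((e) => {
        const row = [e.uuid, e.event, e.distinctId, e.timestamp, JSON.stringify(e.properties)]
        const placeholders = row.map((value) => {
            values.push(value)
            return `$${values.length}`
        })
        return `(${placeholders.join(', ')})`
    })
    return {
        text: `INSERT INTO ${table} (${columns.join(', ')}) VALUES ${rows.join(', ')}`,
        values,
    }
}

// Build a Redshift COPY statement that bulk-loads gzipped JSON files from
// an S3 prefix. All three arguments are assumed to come from trusted config.
function buildRedshiftCopy(table: string, s3Prefix: string, iamRole: string): string {
    return `COPY ${table} FROM '${s3Prefix}' IAM_ROLE '${iamRole}' FORMAT AS JSON 'auto' GZIP`
}
```

With direct access, the plugin would pass `text` and `values` to the client's query call for each batch. With the S3 route, it would write files to the bucket and run the COPY statement once per load, which is the extra moving part mentioned above.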

> The trouble is, if we expose pg for a plugin to use, we might be opening ourselves up to security issues. The connections will be made from our VPC, so they could theoretically find their way to sensitive data.

This already applies to fetch: e.g. ClickHouse exposes an HTTP API, which can be used for evil things there. Exposing pg does not change that equation.
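
Since the same concern covers both fetch and pg, one possible mitigation (purely a sketch, not something PostHog is known to do) would be to refuse outbound connections whose resolved address falls in a private or link-local IPv4 range. The helper names `parseIPv4` and `isInternalAddress` are hypothetical. The check assumes the caller has already resolved the hostname to an IP and then connects to that exact IP, since otherwise DNS tricks could bypass it. It does not handle IPv6 and fails closed on anything it cannot parse.

```typescript
// Parse a dotted-quad IPv4 address into its four octets, or return null
// when the input is not a well-formed IPv4 address.
function parseIPv4(host: string): number[] | null {
    const parts = host.split('.')
    if (parts.length !== 4) {
        return null
    }
    const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN))
    return octets.every((o) => o >= 0 && o <= 255) ? octets : null
}

// Return true when a plugin should NOT be allowed to connect to this address.
// Covers loopback, RFC 1918 private ranges, link-local (which includes the
// cloud metadata address 169.254.169.254), and the 0.0.0.0/8 block.
function isInternalAddress(ip: string): boolean {
    const octets = parseIPv4(ip)
    if (octets === null) {
        // Fail closed: unresolved hostnames and IPv6 are rejected in this sketch.
        return true
    }
    const a = octets[0]
    const b = octets[1]
    return (
        a === 0 ||
        a === 10 ||
        a === 127 ||
        (a === 169 && b === 254) ||
        (a === 172 && b >= 16 && b <= 31) ||
        (a === 192 && b === 168)
    )
}
```

A check like this would sit in front of both the fetch wrapper and any pg connection helper, so neither path could reach ClickHouse or other services inside the VPC. It is no substitute for network-level isolation, which is the more robust answer if the network setup allows it.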

Ah, well, people aren't dumped yet. Exporting people is something we've been discussing how to do.