postgresml/pgcat

Question: uuid support for sharding key


I'm currently evaluating pgcat for use at my company. We have a requirement to use randomly generated values for our id columns, and the current plan is to use the native uuid type in Postgres. Ideally we'd be able to generate these ids on the client and pass them to pgcat as the sharding key. PARTITION BY HASH (uuid) appears to work on the Postgres end of things, so I just need to figure out the pgcat side. I'm guessing that at a minimum the key type would need to change from i64 to i128 everywhere, and the SET SHARDING KEY TO path would need additional parsing to recognize uuids. Both of these might have performance implications.
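To make the parsing step concrete, here is a minimal sketch of turning a uuid string into a 128-bit integer using only the standard library. The function name parse_uuid is hypothetical and not part of pgcat, and it uses u128 rather than i128 since a uuid has no sign. It assumes the canonical hyphenated 8-4-4-4-12 form only; Postgres itself accepts a few more spellings that this sketch rejects.

```rust
/// Hypothetical helper (not part of pgcat): parse a canonical hyphenated
/// UUID such as "a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11" into a u128.
/// Returns None for any other shape.
pub fn parse_uuid(text: &str) -> Option<u128> {
    let bytes = text.trim().as_bytes();
    // Canonical form is 32 hex digits plus 4 hyphens.
    if bytes.len() != 36 {
        return None;
    }
    let mut value: u128 = 0;
    for (i, &b) in bytes.iter().enumerate() {
        match i {
            // Hyphens sit at fixed positions in the 8-4-4-4-12 layout.
            8 | 13 | 18 | 23 => {
                if b != b'-' {
                    return None;
                }
            }
            // Every other position must be a hex digit (either case).
            _ => {
                let digit = (b as char).to_digit(16)?;
                value = (value << 4) | digit as u128;
            }
        }
    }
    Some(value)
}
```

Calling value.to_be_bytes() on the result gives the 16 bytes in the same order as the text form, which is the byte sequence a hashing function would consume.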

Any chance you could give me an estimate of how complicated this would be to add to pgcat? My initial assessment is that I could likely implement this myself even though I don't have a lot of Rust experience. Does this sound right to you? Is this a change you'd like to see in pgcat?

levkk commented

Hi there.

Your assessment is correct. We just need to implement the PARTITION BY HASH (uuid) logic inside PgCat, which should be very doable because our sharding logic is just a Rust rewrite of the Postgres hashing logic.

PgCat supports multiple hashing functions, so we don't need to change the existing ones, we just need to write another one, e.g. pg_uuid_hash. See here: https://github.com/levkk/pgcat/blob/master/src/sharding.rs#L45

As for the interface that currently only accepts i64, we could make it generic perhaps? Or maybe introduce another shard_uuid and call it conditionally? I'm open to suggestions.
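One way the "call it conditionally" option could look is an enum over key types with a match in the shard lookup. This is only a sketch: ShardingKey, Sharder, and both hash methods are hypothetical names, not PgCat's actual API, and the hash bodies are plain modulo placeholders rather than the Postgres hashing logic a real implementation would port.

```rust
/// Hypothetical key type covering both the existing integer keys and uuids.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ShardingKey {
    BigInt(i64),
    Uuid(u128),
}

/// Hypothetical sharder; only the dispatch shape matters here.
pub struct Sharder {
    shards: usize,
}

impl Sharder {
    pub fn new(shards: usize) -> Self {
        assert!(shards > 0, "at least one shard is required");
        Sharder { shards }
    }

    /// Pick the hashing routine based on the kind of key supplied.
    pub fn shard(&self, key: ShardingKey) -> usize {
        match key {
            ShardingKey::BigInt(k) => self.placeholder_bigint_hash(k),
            ShardingKey::Uuid(k) => self.placeholder_uuid_hash(k),
        }
    }

    // Placeholder only: NOT the Postgres bigint partition hash.
    fn placeholder_bigint_hash(&self, key: i64) -> usize {
        (key as u64 % self.shards as u64) as usize
    }

    // Placeholder only: NOT the Postgres uuid partition hash.
    fn placeholder_uuid_hash(&self, key: u128) -> usize {
        (key % self.shards as u128) as usize
    }
}
```

With this shape, existing callers that pass an i64 would wrap it in ShardingKey::BigInt, and the uuid path stays isolated in its own method, so the existing hashing functions would not need to change.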

Overall, I don't think this would be difficult, as long as you're okay with implementing the UUID hashing function that's in Postgres in Rust. There are some tests you could write as well (see test module in sharding.rs) to validate that it works well.

A PR is always welcome to continue the conversation over code. Let me know if I can help!