kagis/pgwire

Row decoding

exe-dealer opened this issue · 3 comments

  • array text
  • array bin
  • range text
  • range bin

@exe-dealer do you have any plans to complete this work on your Deno port? I'm wondering if row decoding could be broken out into a separate module used by both libraries.

@jmealo I have no plans to add new decoders. Its not possible to cover all types anyway, because postgres allows to create custom types.

I usually query data as json like this

const { scalar } = await pg.query({
  statement: /*sql*/ `
    select json_agg(tup) from (
      select i, i^2 sqr 
      from generate_series(int4($1->>'from'), int4($1->>'to')) i
    ) tup;
  `,
  params: {
    type: 'json',
    value: {
      from: 1,
      to: 10,
    },
  },
});

assertEquals(scalar, [
  { i: 1, sqr:1 }, 
  { i: 2, sqr: 4 }, 
  { i: 3, sqr: 9 }, 
  { i: 4, sqr: 16 }, 
  { i: 5, sqr: 25 }, 
  { i: 6, sqr: 36 }, 
  { i: 7, sqr: 49 }, 
  { i: 8, sqr: 64 }, 
  { i: 9, sqr: 81 }, 
  { i: 10, sqr: 100 },
]);

This is more flexible because its possible to query arbitrary shaped json rather than just list of tuples.

I see 2 cases where querying tuples is usefull

  • querying big bytea's with binary decoder to avoid hex encoding overhead
  • streaming big datasets with many columns to avoid overhead of columns names serialization

First case is already supported and in second case imho its better to cast all columns to safe primitive types. I'd like to note that timestamp cannot be safely decoded to js Date because value will loose microseconds precision. I prefer to do all datetime calculation on postgres side and consume/send dates as text.

What types you need? Can you describe your cases?

jmealo commented

It's interesting you're handling this in the query. Previously, we handled this in jsoncdc (the logical decoding plugin), however, it relied on the same facilities, allowing PostgreSQL to cast things to JSON.

The issue became folks wanted to use hosted/managed Postgres and the cloud providers got to choose which logical decoding plugins were allowed, and despite my lobbying for jsoncdc, I don't believe anyone included that in their whitelisted system extensions. (Everyone opted for wal2json, which is less robust IMO).

My use case was to make a database agnostic CDC/binlog listener:
https://github.com/JarvusInnovations/lapidus