brianc/node-pg-copy-streams

Error handling is incomplete in Postgres 11+

ab-pm opened this issue · 11 comments

ab-pm commented

I'm doing a copy command inside a transaction. When the copy command has a syntax error, the commit just hangs completely:

await pgClient.query('BEGIN;');
try {
	const outStream = pgClient.query(pgCopy.to(`COPY (oops) TO STDOUT DELIMITER ',' CSV HEADER;`));
	const writeStream = fs.createWriteStream('output.csv');
	await promisify(stream.pipeline)(outStream, writeStream);
} catch(err) {
	console.log('caught', err); // syntax error at or near "oops" - this runs as expected
	throw err;
} finally {
	// then the finally clause is executed
	await pgClient.query('COMMIT;');
	// but it never reaches here, pending forever
}

When I am not using pg-copy-streams, but just pgClient.query('oops'), the COMMIT statement is executed fine and the promise returned by my asynchronous function is rejected.
Is this a fundamental problem with the postgres COPY command? Is pgCopy.to leaving the pgClient in an invalid state when receiving an error?

Hello,

Thanks for the report. I have always used auto-commit with this module so cannot answer right away what the issue might be here (fundamental problem with the postgres COPY command or issue in the module)

I tried reproducing your issue in a test but at this stage it works for me so I don't know how to go forward. The COMMIT statement is executed and the promise is rejected.

it('correctly handle BEGIN/COMMIT transaction #113', async () => {
      const client = new pg.Client()
      await client.connect()
      try {
        await client.query('BEGIN')
        const outStream = client.query(copy(`COPY (SELECT INVALID SYNTAX) TO STDOUT`))
        await promisify(pipeline)(outStream, concat(()=>{}))
      } catch(err) {
        throw err
      }
      finally {
        await client.query('COMMIT')
        await client.end()
      }
    })

can you try on your side to add a test to test/copy-to.js that would highlight the problem or see what could be different in our approaches ? I also tried with const writeStream = fs.createWriteStream('output.csv'); but it does not bring the problem to me either.

I am testing with pg-copy-stream 5.1.1 and node 10.21.0

ab-pm commented

OK, thanks for the quick response, I'm going to write a more complete demo (and investigate other possible sources) if you cannot reproduce the issue.

ab-pm commented

I can produce the issue with the exact code you posted. I cloned pg-copy-stream from master, using node v10.21.0, and PostgreSQL 12.2. I suspect the latter is the problem!

console.log((await client.query('SELECT version()')).rows[0].version)

I also tried with PostgreSQL 11.8 (Ubuntu 11.8-1.pgdg18.04+1) which I happened to have installed, and it fails there as well:

Error: Timeout of 2000ms exceeded.

hmm that is weird and I would be very surprised if the postgres version would make a difference here.
As far as I know the COPY protocol has not changed and even the pg protocol (protocol v3) has not changed in these versions.
I currently test with postgres 9.6.1 ; travis is setup on 9.6.x. Is the 9.6.x branch available for you ?

As a quick check, I am going to temporarily setup travis on postgres 11.8 we will see what happens with the test

ab-pm commented

Hm, I did test against a docker run --rm postgres:9.6 and (ignoring the issues with the replication slot) it does fail there as well. That's very weird. Will have to continue debugging this tomorrow.

PS: the issue is not related to transactions. Any subsequent query seems to time out, not just a COMMIT.

PPS: Whew, it's not just me.

no the test is working ; I had just commited a failing test to do a fast test this is now fixed in 8837dd4#diff-363c2de3e56b9f61b2f37be83e045144 : the test is passing and I have no idea why it does not work for you.

I could not setup travis on postgres 11.8 - it is not available by default on travis and I do not have time to set this up now.

you should be able to hide the issues regarding the replication slot by removing test/copy-both.js

I really would be surprised if this is a postgres issue so maybe you can test with 9.6.

PS: the issue is not related to transactions. Any subsequent query seems to time out, not just a COMMIT.

This is strange. One thing to note is that if the outStream data is not pulled then error sent by postgres will never reach pg.

Could it be that in your setup the data is not pulled ? you could add a console.log(chunk) at the beginning of copy-to.js _parse to see if a message is received beginning with byte 45 which is the ErrorResponse sent by postgres.

you can test with

PGPORT=5432 PGDATABASE=postgres npx mocha --grep '#113'

I'll be afk and hope you can find some clues !

ab-pm commented

no the test is working ; I had just commited a failing test

But why did it time out instead of failing the test with an exception? (But I see your point. And the log I linked is from node 8, which doesn't have pipeline).
Also I sometimes get an error and sometimes a timeout, depending on the position of the code… A real Heisenbug.

ab-pm commented

I can not believe it: we were using pg-copy-streams 2.2.2 in our code base. I updated to 5.1.1 and the issues are gone. Ouch!

I can produce the issue with the exact code you posted.

Well, no, I can't. I (sometimes…) forgot to import promisify, so indeed an exception was thrown (and foolishly ignored, not checked, as done properly in your committed test) before the stream was consumed. The query - any query - in the finally clause never finished then.

OK. there have been many modifications since 2.2.2 so that could explain the problem.
I am glad that the issue is now gone.
I managed to add postgresql 11 and postgresql 12 to the travis test matrix maybe that will be helpful one day !