brianc/node-pg-query-stream

Cursor not being closed when stream ends?

lebolo opened this issue · 3 comments

When I use pg-stream, the QueryStream.close function never gets called.

I'm using pg-stream within a deferred function:

var pg = require('pg');
var Q = require('q');
var QueryStream = require('pg-query-stream');

var sql = ...;
var connStr = ...;

pg.connect(connStr, function(err, client, done) {

    var deferred = Q.defer();

    var qstream = new QueryStream(sql.text, sql.values, {highWaterMark: 10000, batchSize: 10000});
    var stream = client.query(qstream);

    stream.on('end', function() {
        //console.log('Manually closing cursor');
        //qstream.close();
        done();
    });

    stream.on('error', function(err) {
        // An error occurred, return the client to the connection pool
        done();
    });

    deferred.resolve(stream);
});

Then I call that function and use the resolved stream:

getStream()
    .then(function(stream) {

        stream.on('end', function() {
            console.log('pg-stream end!');
        });

        var jsoncsvstream = ...some json to csv transformer...

        jsoncsvstream.on('end', function() {
            console.log('jsoncsv end!');
        });

        res.on('finish', function() {
            console.log('response finish!');
        });

        stream
            .pipe(jsoncsvstream)
            .pipe(res);
        }
    });

I first noticed this because I would get a huge memory footprint when I used the stream. I placed some console.log messages within QueryStream.close and never saw them (even though I see pg-stream, jsoncsv and response end messages).

My workaround is to explicitly call QueryStream.close on the stream's end event (commented out above). But is this expected?

P.S.: @brianc thanks for all the great pg libraries!

Could it be that only one stream.on("end"... is allowed, so that your overridden event handled disables the one that calls done()?

@lebolo Closing the cursor means closing the connection associated with it. Sometimes you want to keep the connection alive so other queries take place. I believe that's the reason QueryStream#close is meant to be called manually.

Also, please notice that pg manages its own connection pool and certainly there's a time limit where idle connections are closed automatically and removed from the pool. So I believe pg or pg-query-stream are not causing you a huge memory footprint.