psycopg/psycopg2

`invalid message length` error on Postgres 14 when INSERT query is larger than 1GB

jabadia opened this issue · 2 comments

Please complete the following information:

  • OS: Ubuntu 20 (server and client); also observed on macOS
  • Psycopg version: psycopg2.__version__ = 2.9.3 (dt dec pq3 ext lo64); psycopg2.__libpq_version__ = 140001
  • Python version: Python 3.7.11 (also reproduced with Python 3.9)
  • PostgreSQL version: PostgreSQL 14.2 (Ubuntu 14.2-1.pgdg20.04+1+b1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit
  • pip version: pip 22.0.4

Describe the bug
1: what you did

I used `cursor.execute(query)` with an INSERT query of the form `INSERT INTO table (field1, field2) VALUES (v1, v2), (v1, v2), ...` where the query string is larger than 1GB.

2: what you expected to happen

The insertion to succeed.

3: what happened instead

The insertion failed. On the client, the following exception was raised:

Traceback (most recent call last):
  File "/Users/xxxxxx/workdir/debug_db_error.py", line 91, in <module>
    main(count)
  File "/Users/xxxxxx/workdir/debug_db_error.py", line 70, in main
    cursor.execute(query)
psycopg2.OperationalError: server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.

On the server, this was written to PostgreSQL log:

2022-05-25 17:22:19.246 CEST [57460] LOG:  invalid message length

To me, it seems related to this commit in libpq: postgres/postgres@9626325#diff-57bc40d2e130923de29e9935ab37b7db36fb844a0b55df0662efe424929ecf4dR1233

If possible, provide a script reproducing the issue: it's just one `cursor.execute(query)` where `query` is a string containing an INSERT statement larger than 1GB.
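A minimal reproduction sketch under those assumptions (connection string, table, and column names are placeholders; note that building the statement needs a few GB of client RAM):

```python
import psycopg2

conn = psycopg2.connect("dbname=test")  # placeholder connection string
cursor = conn.cursor()
cursor.execute("CREATE TABLE IF NOT EXISTS t (field1 integer, field2 text)")

# Repeat one VALUES tuple until the statement text exceeds 1GB (2**30 bytes).
row = "(1, 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')"
nrows = 2**30 // len(row) + 1
query = "INSERT INTO t (field1, field2) VALUES " + ", ".join([row] * nrows)

# Against PostgreSQL 14 this fails with OperationalError
# ("server closed the connection unexpectedly").
cursor.execute(query)
```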

NOTE: this might be a bug inside the underlying libpq. I will be happy to file a bug there if appropriate.

It seems that you have just hit the 1GB server allocation limit:

https://doxygen.postgresql.org/memutils_8h.html#a74a92b981e9b6aa591c5fbb24efd1dac

It's not a bug in psycopg, libpq, or postgres: it's just a limit. It seems you are composing a query that is too large. Just execute smaller queries. You don't have a single value that is larger than 1GB, right?

You might also want to try out psycopg 3 and use `copy()`, which would be more efficient. In psycopg 3, moreover, the query and its parameters are sent separately, so it might be easier to work around the limit even without using copy.
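For illustration, a sketch of the psycopg 3 `copy()` approach (connection string, table, and data are placeholders):

```python
import psycopg

records = [(1, "one"), (2, "two")]  # stand-in for the real data

with psycopg.connect("dbname=test") as conn:
    with conn.cursor() as cur:
        # COPY streams the rows to the server, so no single
        # 1GB query message is ever built.
        with cur.copy("COPY t (field1, field2) FROM STDIN") as copy:
            for record in records:
                copy.write_row(record)
```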

Yes, we are using Django, and it is quite straightforward to batch the inserts.
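For reference, batching the same insert with psycopg2's `execute_values` helper looks roughly like this (connection string, table, columns, and data are illustrative; Django's `bulk_create(objs, batch_size=...)` achieves the same effect):

```python
import psycopg2
from psycopg2.extras import execute_values

conn = psycopg2.connect("dbname=test")  # placeholder connection string
cursor = conn.cursor()

rows = [(i, f"value {i}") for i in range(1_000_000)]  # stand-in data

# page_size bounds how many rows are folded into each INSERT statement,
# keeping every statement far below the server's 1GB message limit.
execute_values(
    cursor,
    "INSERT INTO t (field1, field2) VALUES %s",
    rows,
    page_size=10_000,
)
conn.commit()
```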

To me, "message size" is an internal detail and, as a user of the library, I should not be concerned about the implementation details such as max allowed size.

I would expect the client side of the library to either:

  • support large queries and then handle the splitting of one query into multiple messages if needed
  • or not support large queries (which seems totally reasonable in this case), but then document that limit, and possibly detect over-sized queries in the client library and raise a meaningful/useful error for client code (see the sketch after this list)
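As an illustration of the second option, a hypothetical client-side guard (`MAX_MESSAGE_BYTES` and `checked_execute` are made-up names, not part of psycopg2):

```python
MAX_MESSAGE_BYTES = 2**30  # PostgreSQL's 1GB allocation limit

def checked_execute(cursor, query):
    """Refuse to send statements the server is known to reject."""
    if len(query.encode("utf-8")) >= MAX_MESSAGE_BYTES:
        raise ValueError(
            "query exceeds PostgreSQL's 1GB message limit; "
            "split it into smaller batches"
        )
    cursor.execute(query)
```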

In this case, we had code that had been running happily for years and that started failing a few days after upgrading the server to PostgreSQL 14. It took a lot of digging to discover that the problem was a new maximum message size somewhere inside the server (and to find out what the maximum allowed value was).

Thanks anyway!! (We love psycopg2; it's a great piece of software.)