postgres version 16 onwards is consuming entire disk space within container (possible disk spill is happenning)

Question

postgres version 16 onwards is consuming entire disk space within container (possible disk spill is happenning)

Closed this issue 9 months ago · 5 comments

I have created a stored procedure:

CREATE OR REPLACE PROCEDURE TEL1PrepUploadTables(IN Dataset text)
    LANGUAGE PLPGSQL
    AS \$\$
    BEGIN
        EXECUTE format('CREATE TABLE "add_cell_raw_%s" PARTITION OF add_cell_raw FOR VALUES IN (''%s'');',Dataset,Dataset);

        EXECUTE format('CREATE TABLE "del_cell_id_raw_%s" PARTITION OF del_cell_id_raw FOR VALUES IN (''%s'');',Dataset,Dataset);

        EXECUTE format('CREATE OR REPLACE VIEW add_cell_raw_latest AS
        SELECT * FROM "add_cell_raw_%s";',Dataset);

        EXECUTE format('CREATE OR REPLACE VIEW del_cell_id_raw_latest AS SELECT * FROM "del_cell_id_raw_%s";',
            Dataset);
    END;
    \$\$;

in this procedure there is a step to create or replace view add_cell_raw_latest...
which takes data from a raw table add_cell_raw_%s (actual value add_cell_raw_c169cwea01_export_LTECell_v4_20250208073005)...this raw table contains roughly 750 MB

TEL1_WEA_4G=# SELECT pg_size_pretty(pg_table_size('"add_cell_raw_c169cwea01_export_LTECell_v4_20250208073005"'));
 pg_size_pretty
----------------
 737 MB
(1 row)

Now when this CREATE VIEW runs internally postgres runs the following step to create the VIEW

TEL1_WEA_4G=# SELECT pid, query, state, wait_event, wait_event_type FROM pg_stat_activity WHERE state <> 'idle';
 pid |                                                                                                                                        query                                                                                                                                        | state  | wait_event | wait_event_type
-----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------+------------+-----------------
 173 | COPY "add_cell_raw_c169cwea01_export_LTECell_v4_20250208073005" ("Geometry", "dataset", "DateTime", "UploadTime", "mcc", "mnc", "cid", "cellType", "cellName", "antennaLat", "antennaLon", "antennaOrientation", "antennaOpening", "antennaHorizontalRange", "geoType") FROM STDIN; | active |            |

Now the ISSUE is i have created a kubernetes container using
postgres images which are version 15 or lower and postgres images which are version 16 or higher (with volume allocated to container 50 GB)..

Observation

when I am running container based on postgres image versions 15 or lower the space is being occupied in a reasonable way

wmps-postgres-deployment-b6465cb7b-rj9lw:/# df -h
Filesystem                Size      Used Available Use% Mounted on
overlay                 154.9G    102.3G     52.5G  66% /
tmpfs                    64.0M         0     64.0M   0% /dev
/dev/vda1               154.9G    102.3G     52.5G  66% /docker-entrypoint-initdb.d
/dev/vda1               154.9G    102.3G     52.5G  66% /etc/hosts
/dev/vda1               154.9G    102.3G     52.5G  66% /dev/termination-log
/dev/vda1               154.9G    102.3G     52.5G  66% /etc/hostname
/dev/vda1               154.9G    102.3G     52.5G  66% /etc/resolv.conf
shm                      64.0M      1.0M     63.0M   2% /dev/shm
**kadalu:kadalu-storage-pool
                         50.0G      3.8G     46.2G   8% /var/lib/postgresql/data**

HOWEVER same STEP running on postgres image versions 16 or higher leads to immediate disk usage consuming entire 50 GB diskspace ... corrupting the container making it unusable

wmps-postgres-deployment-b6465cb7b-rj9lw:/# df -h
Filesystem                Size      Used Available Use% Mounted on
overlay                 154.9G    102.3G     52.5G  66% /
tmpfs                    64.0M         0     64.0M   0% /dev
/dev/vda1               154.9G    102.3G     52.5G  66% /docker-entrypoint-initdb.d
/dev/vda1               154.9G    102.3G     52.5G  66% /etc/hosts
/dev/vda1               154.9G    102.3G     52.5G  66% /dev/termination-log
/dev/vda1               154.9G    102.3G     52.5G  66% /etc/hostname
/dev/vda1               154.9G    102.3G     52.5G  66% /etc/resolv.conf
shm                      64.0M      1.0M     63.0M   2% /dev/shm
**kadalu:kadalu-storage-pool
                         50.0G      50G     0G   100% /var/lib/postgresql/data**

As I said this issue coming from postgres images 16 and onwards... I am reading some articles where to increase the query performance could lead to disk spill of the data in postgres... please look into it and fix it

Answer 1 · 2025-02-17T15:13:27.000Z

Can you run \l+ in psql to see the size of the db?

> psql -c "\l+"
      Name      |  Owner   | Encoding | Locale Provider |  Collate   |   Ctype    | Locale | ICU Rules |   Access privileges   |  Size   | Tablespace |                Description                 
----------------+----------+----------+-----------------+------------+------------+--------+-----------+-----------------------+---------+------------+--------------------------------------------
 postgres       | postgres | UTF8     | libc            | en_US.utf8 | en_US.utf8 |        |           |                       | 7475 kB | pg_default | default administrative connection database
 template0      | postgres | UTF8     | libc            | en_US.utf8 | en_US.utf8 |        |           | =c/postgres          +| 7321 kB | pg_default | unmodifiable empty database
                |          |          |                 |            |            |        |           | postgres=CTc/postgres |         |            | 
 template1      | postgres | UTF8     | libc            | en_US.utf8 | en_US.utf8 |        |           | =c/postgres          +| 7321 kB | pg_default | default template for new databases
                |          |          |                 |            |            |        |           | postgres=CTc/postgres |         |            | 
(3 rows)

Answer 2 · 2025-02-17T19:24:32.000Z

This is the screen shot of version 17 postgres image... but the database is still in healthy state... i can't give you the same data when it gets corrupted as when the container goes bad and i cannot run this command.

Answer 3 · 2025-02-17T19:34:59.000Z

How many time has the corrupted db pod restarted?

Answer 4 · 2025-02-17T19:38:57.000Z

Can you give this a try? https://stackoverflow.com/a/72740536

Answer 5 · 2025-02-17T19:40:00.000Z

I waited

How many time has the corrupted db pod restarted?

I waited for couple of hours for it to restore... But it was in same state