postgres version 16 onwards is consuming entire disk space within container (possible disk spill is happenning)
Closed this issue · 5 comments
I have created a stored procedure:
CREATE OR REPLACE PROCEDURE TEL1PrepUploadTables(IN Dataset text)
LANGUAGE PLPGSQL
AS \$\$
BEGIN
EXECUTE format('CREATE TABLE "add_cell_raw_%s" PARTITION OF add_cell_raw FOR VALUES IN (''%s'');',Dataset,Dataset);
EXECUTE format('CREATE TABLE "del_cell_id_raw_%s" PARTITION OF del_cell_id_raw FOR VALUES IN (''%s'');',Dataset,Dataset);
EXECUTE format('CREATE OR REPLACE VIEW add_cell_raw_latest AS
SELECT * FROM "add_cell_raw_%s";',Dataset);
EXECUTE format('CREATE OR REPLACE VIEW del_cell_id_raw_latest AS SELECT * FROM "del_cell_id_raw_%s";',
Dataset);
END;
\$\$;in this procedure there is a step to create or replace view add_cell_raw_latest...
which takes data from a raw table add_cell_raw_%s (actual value add_cell_raw_c169cwea01_export_LTECell_v4_20250208073005)...this raw table contains roughly 750 MB
TEL1_WEA_4G=# SELECT pg_size_pretty(pg_table_size('"add_cell_raw_c169cwea01_export_LTECell_v4_20250208073005"'));
pg_size_pretty
----------------
737 MB
(1 row)
Now when this CREATE VIEW runs internally postgres runs the following step to create the VIEW
TEL1_WEA_4G=# SELECT pid, query, state, wait_event, wait_event_type FROM pg_stat_activity WHERE state <> 'idle';
pid | query | state | wait_event | wait_event_type
-----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------+------------+-----------------
173 | COPY "add_cell_raw_c169cwea01_export_LTECell_v4_20250208073005" ("Geometry", "dataset", "DateTime", "UploadTime", "mcc", "mnc", "cid", "cellType", "cellName", "antennaLat", "antennaLon", "antennaOrientation", "antennaOpening", "antennaHorizontalRange", "geoType") FROM STDIN; | active | |
Now the ISSUE is i have created a kubernetes container using
postgres images which are version 15 or lower and postgres images which are version 16 or higher (with volume allocated to container 50 GB)..
Observation
- when I am running container based on postgres image versions 15 or lower the space is being occupied in a reasonable way
wmps-postgres-deployment-b6465cb7b-rj9lw:/# df -h
Filesystem Size Used Available Use% Mounted on
overlay 154.9G 102.3G 52.5G 66% /
tmpfs 64.0M 0 64.0M 0% /dev
/dev/vda1 154.9G 102.3G 52.5G 66% /docker-entrypoint-initdb.d
/dev/vda1 154.9G 102.3G 52.5G 66% /etc/hosts
/dev/vda1 154.9G 102.3G 52.5G 66% /dev/termination-log
/dev/vda1 154.9G 102.3G 52.5G 66% /etc/hostname
/dev/vda1 154.9G 102.3G 52.5G 66% /etc/resolv.conf
shm 64.0M 1.0M 63.0M 2% /dev/shm
**kadalu:kadalu-storage-pool
50.0G 3.8G 46.2G 8% /var/lib/postgresql/data**
- HOWEVER same STEP running on postgres image versions 16 or higher leads to immediate disk usage consuming entire 50 GB diskspace ... corrupting the container making it unusable
wmps-postgres-deployment-b6465cb7b-rj9lw:/# df -h
Filesystem Size Used Available Use% Mounted on
overlay 154.9G 102.3G 52.5G 66% /
tmpfs 64.0M 0 64.0M 0% /dev
/dev/vda1 154.9G 102.3G 52.5G 66% /docker-entrypoint-initdb.d
/dev/vda1 154.9G 102.3G 52.5G 66% /etc/hosts
/dev/vda1 154.9G 102.3G 52.5G 66% /dev/termination-log
/dev/vda1 154.9G 102.3G 52.5G 66% /etc/hostname
/dev/vda1 154.9G 102.3G 52.5G 66% /etc/resolv.conf
shm 64.0M 1.0M 63.0M 2% /dev/shm
**kadalu:kadalu-storage-pool
50.0G 50G 0G 100% /var/lib/postgresql/data**
As I said this issue coming from postgres images 16 and onwards... I am reading some articles where to increase the query performance could lead to disk spill of the data in postgres... please look into it and fix it
Can you run \l+ in psql to see the size of the db?
> psql -c "\l+"
Name | Owner | Encoding | Locale Provider | Collate | Ctype | Locale | ICU Rules | Access privileges | Size | Tablespace | Description
----------------+----------+----------+-----------------+------------+------------+--------+-----------+-----------------------+---------+------------+--------------------------------------------
postgres | postgres | UTF8 | libc | en_US.utf8 | en_US.utf8 | | | | 7475 kB | pg_default | default administrative connection database
template0 | postgres | UTF8 | libc | en_US.utf8 | en_US.utf8 | | | =c/postgres +| 7321 kB | pg_default | unmodifiable empty database
| | | | | | | | postgres=CTc/postgres | | |
template1 | postgres | UTF8 | libc | en_US.utf8 | en_US.utf8 | | | =c/postgres +| 7321 kB | pg_default | default template for new databases
| | | | | | | | postgres=CTc/postgres | | |
(3 rows)
How many time has the corrupted db pod restarted?
Can you give this a try? https://stackoverflow.com/a/72740536
I waited
How many time has the corrupted db pod restarted?
I waited for couple of hours for it to restore... But it was in same state
