High Memory and CPU Overload on VPS During Bulk Updates with Nested Relations

Question

High Memory and CPU Overload on VPS During Bulk Updates with Nested Relations

Closed this issue 5 months ago · 1 comments

lxup commented 5 months ago

PGSync version: 3.2.0

Postgres version: 15.1

Elasticsearch/OpenSearch version: 7.17.23

Redis version: 7.2.5

Python version: 3.9

Problem Description:
I recently added PGSync to my docker-compose.yml for Supabase with the following setup:

pgsync-redis:
    container_name: pgsync-redis
    image: redis
    command: redis-server --requirepass ${PGSYNC_REDIS_PASSWORD}
    ports:
      - 6379:6379
  
pgsync:
    container_name: pgsync
    build: ./requirements/pgsync
    depends_on:
      db:
        condition: service_healthy
      pgsync-redis:
        condition: service_healthy
    environment:
      - PG_USER=supabase_admin
      - PG_PASSWORD=${POSTGRES_PASSWORD}
      - PG_HOST=${POSTGRES_HOST}
      - PG_PORT=${POSTGRES_PORT}
      - PG_DATABASE=${POSTGRES_DB}
      - REDIS_HOST=pgsync-redis
      - REDIS_PORT=6379
      - REDIS_AUTH=${PGSYNC_REDIS_PASSWORD}
      - ELASTICSEARCH_SCHEME=${ELASTICSEARCH_SCHEME}
      - ELASTICSEARCH_HOST=${ELASTICSEARCH_HOST}
      - ELASTICSEARCH_PORT=${ELASTICSEARCH_PORT}
      - ELASTICSEARCH_USER=${ELASTICSEARCH_USER}
      - ELASTICSEARCH_PASSWORD=${ELASTICSEARCH_PASSWORD}
      - ELASTICSEARCH=True
      - LOG_LEVEL=INFO

Everything works perfectly during initial synchronization with Elasticsearch, and updates to individual entries in my Postgres database are also synced correctly.

However, I encountered a severe issue when running a Python script that updates data for around 10,000 movies daily at 10 AM. Since adding PGSync to my stack, this bulk update caused my VPS to overload in terms of RAM usage, which I noticed through the extremely slow performance of my Supabase API and even SSH access to my VPS. This overload persisted for over 10 hours after the script had finished running.

Here are the relevant details:

Postgres contains two tables:
- tmdb_movie (columns like id, original_title, etc.)
- tmdb_movie_translation (linked to tmdb_movie with a movie_id foreign key)
PGSync configuration:

[
    {
        "database": "postgres",
        "index": "movies", 
        "mapping": {
            "translations": {
                "type": "nested"
            }
        },
        "nodes": {
            "table": "tmdb_movie",
            "schema": "public",
            "columns": [
                "id",
                "adult",
                "backdrop_path",
                "budget",
                "homepage",
                "imdb_id",
                "original_language",
                "original_title",
                "popularity",
                "release_date",
                "revenue",
                "runtime",
                "status",
                "vote_average",
                "vote_count",
                "collection_id"
            ],
            "children": [
                {
                    "table": "tmdb_movie_translation",
                    "columns": [
                        "title",
                        "overview",
                        "language_id"
                    ],
                    "label": "translations",
                    "relationship": {
                        "variant": "object",
                        "type": "one_to_many"
                    }
                }
            ]
        }
    }
]

The first synchronization goes smoothly, and individual updates are synced correctly. But when the bulk update occurs, it seems that the numerous triggers, combined with the nested relationship in tmdb_movie_translation, are causing a massive overload.

Observed behavior:

Last logs from PGSync were repeating the following message in a loop:

pgsync  | Sync postgres:movies Xlog: [0] => Db: [4] => Redis: [0] => Elasticsearch: [958,552]...

The overload continued even after stopping PGSync and restarting the database. Supabase containers were continuously overloaded in terms of CPU and RAM.

Possible Related Issues:
I suspect this might be related to how PGSync handles the large number of triggers during bulk updates, especially with nested relationships : #88

Any insights or solutions would be greatly appreciated! 😁

Answer 1 · 2024-08-20T07:40:03.000Z

Well I've upgrade my VPS with more vCPU and RAM and seems to work. I close this issue for now 😁