toluaina/pgsync

memory leak on long sync


PGSync version: 2.5.0

Postgres version: 14

Elasticsearch version: 8.8.2

Redis version: 7

Python version: 3.11.4

Problem Description:
Hi @toluaina,
I have a table with lakhs (hundreds of thousands) of entries. While profiling the sync process, I see memory usage grow until it consumes all 32 GB of RAM, which shuts down the whole sync.

Error Message (if any):



  • Was this during the initial sync?
  • How big is the overall database?

Yes, this was during the initial sync. The overall DB is 150 GB+.

Couple of observations:

  1. Setting thread_count to 1 seems to reduce the rate at which memory is leaked.
  2. Even then, Elasticsearch's parallel_bulk still creates extra threads, and these seem to be leaking memory as well.
    As the image below shows, the virtual memory footprint of each thread created by the Elasticsearch library keeps increasing.
    image
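
A growing virtual memory footprint per thread does not show on its own which code is holding on to objects. One way to narrow it down is to compare two `tracemalloc` snapshots taken around a few sync iterations. The following is a minimal sketch using only the standard library. `sync_batch` is a hypothetical stand-in that deliberately retains data to imitate a leak, and `top_growth` is an illustrative helper; neither is part of PGSync or the Elasticsearch client.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return up to `limit` source lines whose allocated size grew between two snapshots."""
    # compare_to() sorts by the absolute size difference, largest first.
    stats = after.compare_to(before, "lineno")
    return [s for s in stats if s.size_diff > 0][:limit]


def sync_batch(sink, n=10_000):
    # Hypothetical stand-in for one sync iteration: it keeps every
    # buffer alive in `sink`, which is what a leak looks like.
    sink.extend(bytearray(1024) for _ in range(n))


if __name__ == "__main__":
    tracemalloc.start()
    retained = []
    before = tracemalloc.take_snapshot()
    for _ in range(3):
        sync_batch(retained)
    after = tracemalloc.take_snapshot()
    # Print the lines responsible for the largest growth.
    for stat in top_growth(before, after):
        print(stat)
    tracemalloc.stop()
```

In a real run, the snapshots would be taken around the actual sync loop instead of `sync_batch`. Note that `tracemalloc` only sees allocations made through Python's allocator, so growth caused by thread stacks or native libraries would not appear here and would need an OS-level tool.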

Any success with the initial sync, @accelq? I have a similar case with 1 TB of data.