quickwit-oss/tantivy

mmap pages for deleted segments remaining in process RAM


Describe the bug
We have a process that constantly ingests new documents into an index backed by an MmapDirectory, periodically calling commit() on the writer and reload() on the reader. We keep the Index, IndexReader, and IndexWriter alive for the entire life of the process.

Over time we have noticed a memory buildup that we're trying to debug. Looking at the smaps stats for the process, we see several mmapped files still open for segments that have been merged and deleted from disk, like this:

7f8058009000-7f805800a000 r--s 00000000 fd:02 2296889088                 /tmp/partKeyIndex-prometheus-189-1727461018368-/de396e3ee2454fa69c8ff243258fe5dd.fast (deleted)
Size:                  4 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:                   4 kB
Pss:                   4 kB

This list of mapped files keeps growing for as long as our process stays up. I expected these mappings to be reclaimed once a merge completes, but would like to confirm whether that is correct or whether we need to do anything to release these mmapped files.

In the few processes I've spot-checked, only the .fast files are still in the map.
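For anyone wanting to repeat this check, here is a minimal sketch of how the smaps output could be tallied. It is not part of tantivy: `deleted_rss_by_extension` and its helpers are hypothetical names. It assumes the usual smaps layout, where each mapping starts with an address-range header line and is followed by `Rss:` and similar fields. It sums Rss per file extension for mappings whose backing file is marked `(deleted)`.

```rust
use std::collections::BTreeMap;

/// True if the line looks like a mapping header, i.e. its first token is a
/// hex address range such as `7f8058009000-7f805800a000`.
fn is_mapping_header(line: &str) -> bool {
    let first = line.split_whitespace().next().unwrap_or("");
    match first.split_once('-') {
        Some((start, end)) => {
            !start.is_empty()
                && !end.is_empty()
                && start.chars().all(|c| c.is_ascii_hexdigit())
                && end.chars().all(|c| c.is_ascii_hexdigit())
        }
        None => false,
    }
}

/// For a header line ending in " (deleted)", returns the extension of the
/// backing file, or "(none)" if the file name has no dot.
fn deleted_extension(header: &str) -> Option<String> {
    let without_marker = header.trim_end().strip_suffix(" (deleted)")?;
    let file_name = without_marker.rsplit('/').next().unwrap_or("");
    let ext = match file_name.rsplit_once('.') {
        Some((_, ext)) => ext,
        None => "(none)",
    };
    Some(ext.to_string())
}

/// Sums Rss (in kB) of deleted-file mappings, grouped by file extension.
/// `smaps` is the text of /proc/<pid>/smaps.
pub fn deleted_rss_by_extension(smaps: &str) -> BTreeMap<String, u64> {
    let mut totals = BTreeMap::new();
    // Extension of the mapping currently being read, if it is a deleted file.
    let mut current_ext: Option<String> = None;
    for line in smaps.lines() {
        if let Some(rest) = line.strip_prefix("Rss:") {
            if let Some(ext) = &current_ext {
                let kb: u64 = rest
                    .trim()
                    .trim_end_matches("kB")
                    .trim()
                    .parse()
                    .unwrap_or(0);
                *totals.entry(ext.clone()).or_insert(0) += kb;
            }
        } else if is_mapping_header(line) {
            current_ext = deleted_extension(line);
        }
    }
    totals
}
```

Feeding it the contents of `/proc/self/smaps` (read with `std::fs::read_to_string`) from inside the process, or of `/proc/<pid>/smaps` from outside, gives a per-extension total that can be logged over time to see whether only `.fast` mappings accumulate.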

Which version of tantivy are you using?
0.22.0

To Reproduce

We don't have a minimal reproducer at the moment, but I can work on producing one if needed. Our service ingests data fairly consistently, and we see this behavior build up over days of uptime at a steady but slow rate that likely coincides with the merge policy producing merged segments.

Not a bug - I found a place in our code where we're holding on to fast field Column objects longer than we should. Closing.
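To illustrate the kind of leak described here, below is a minimal sketch using only the standard library. It does not use tantivy's API: `FakeMmap`, `FakeSegmentReader`, and `FakeColumn` are hypothetical stand-ins. It assumes that a column handle shares ownership of the mapped bytes with the segment it came from, which is consistent with the behavior observed above. Under that assumption, the mapping is only released when the last handle is dropped, even if the segment was merged away long ago.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a memory-mapped file. It flips a shared flag
/// when dropped, which is where a real mapping would be unmapped.
pub struct FakeMmap {
    unmapped: Arc<AtomicBool>,
}

impl Drop for FakeMmap {
    fn drop(&mut self) {
        self.unmapped.store(true, Ordering::SeqCst);
    }
}

/// Hypothetical stand-in for a segment reader that owns a shared handle
/// to the mapping.
pub struct FakeSegmentReader {
    mmap: Arc<FakeMmap>,
}

/// Hypothetical stand-in for a fast field column. It keeps its own clone
/// of the shared handle, so it keeps the mapping alive by itself.
pub struct FakeColumn {
    _mmap: Arc<FakeMmap>,
}

impl FakeSegmentReader {
    pub fn open(unmapped: Arc<AtomicBool>) -> Self {
        FakeSegmentReader {
            mmap: Arc::new(FakeMmap { unmapped }),
        }
    }

    pub fn column(&self) -> FakeColumn {
        FakeColumn {
            _mmap: Arc::clone(&self.mmap),
        }
    }
}

/// Returns whether the mapping was released (a) after the reader was
/// dropped while a column was still cached, and (b) after the cached
/// column was dropped as well.
pub fn demo() -> (bool, bool) {
    let flag = Arc::new(AtomicBool::new(false));
    let reader = FakeSegmentReader::open(Arc::clone(&flag));
    let cached_column = reader.column();

    // The segment goes away (for example after a merge and a reload),
    // but the cached column still holds the mapping.
    drop(reader);
    let after_reader = flag.load(Ordering::SeqCst);

    // Only dropping the cached column releases it.
    drop(cached_column);
    let after_column = flag.load(Ordering::SeqCst);

    (after_reader, after_column)
}
```

In an application, the equivalent fix is to avoid caching column handles across reloads and to re-fetch them from the current searcher instead, so that handles to merged segments are dropped promptly.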