facebookresearch/faiss

GPU memory usage even after deleting the index

JoaoGuibs opened this issue · 5 comments

Summary

When using faiss on the GPU (a flat index in this example), if we hit a GPU OOM error, some GPU memory is still in use even after resetting and deleting the index (nvidia-smi screenshot below). Is this expected? If so, what are the reasons for it?

Thanks in advance.

Platform

OS:

Faiss version: faiss-gpu, version 1.7.2

Installed from: pip

Faiss compilation options:

Running on:

  • CPU
  • GPU

Interface:

  • C++
  • Python

Reproduction instructions

When running the following code, the memory usage at the breakpoint is as shown below:

import torch
import faiss

def create_index(data):
    dimension = data.shape[1]
    index_flat = faiss.IndexFlatL2(dimension)
    gpu_resources = faiss.StandardGpuResources()
    gpu_options = faiss.GpuClonerOptions()
    device = torch.device("cuda")
    device_index = device.index
    gpu_options.device = device_index if device_index is not None else 0
    
    index = faiss.index_cpu_to_gpu(
        gpu_resources, gpu_options.device, index_flat, gpu_options
    )

    return index

data = torch.zeros((2950000, 1000))
index = create_index(data)

try:
    index.add(data)
except Exception as e:
    print(e)
    index.reset()
    del index
    breakpoint()

[Image: nvidia-smi screenshot of GPU memory usage at the breakpoint]
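The memory reading at the breakpoint can also be captured as a number rather than a screenshot. The sketch below is a minimal illustration and not part of the original report: it assumes `nvidia-smi` is on the PATH, and the helper names `parse_memory_used` and `gpu_memory_used_mib` are hypothetical.

```python
import shutil
import subprocess


def parse_memory_used(output):
    """Parse one integer (MiB) per non-empty line of nvidia-smi CSV output."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def gpu_memory_used_mib():
    """Return used memory in MiB per visible GPU, or None if it cannot be read."""
    # Assumption: nvidia-smi is installed; return None instead of raising if not.
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    try:
        return parse_memory_used(result.stdout)
    except ValueError:
        # Some devices report a non-numeric value such as "[N/A]".
        return None
```

Calling `gpu_memory_used_mib()` once before `index.add(data)` and once at the breakpoint gives two readings to compare. The value is device-wide, so it also includes the CUDA context and any other process using the same GPU.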

Hi @JoaoGuibs, thank you for reaching out. I think this is expected: deleting the index by itself does not reclaim the resource. You should also clean up the torch.cuda cache. Here is an example I ran, after which the memory usage was cleaned up:

import torch
import gc

gc.collect()
torch.cuda.empty_cache()

Before
[Screenshot, 2024-08-22 at 2:36:42 PM]

After
[Screenshot, 2024-08-22 at 2:36:16 PM]
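The cleanup steps in the reply above can be wrapped into one helper and checked against PyTorch's own counters. This is a hedged sketch rather than a confirmed fix: the function name `release_cached_gpu_memory` is hypothetical, and `torch.cuda.empty_cache()` only returns unused blocks held by PyTorch's caching allocator, so memory owned by the CUDA context itself or by other libraries may still appear in nvidia-smi.

```python
import gc


def release_cached_gpu_memory():
    """Run the cleanup from this thread; return PyTorch's counters, or None without CUDA."""
    # Collect unreachable Python objects first so their GPU buffers can be freed.
    gc.collect()
    try:
        import torch  # imported lazily so the helper also runs on CPU-only machines
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # Return unused cached blocks from PyTorch's allocator to the driver.
    torch.cuda.empty_cache()
    return {
        "allocated_bytes": torch.cuda.memory_allocated(),
        "reserved_bytes": torch.cuda.memory_reserved(),
    }
```

In the reproduction script, one option is to call this after the `except` block has exited instead of inside it. While the handler is running, the caught exception's traceback may still hold a reference to the index, which would keep `del index` from releasing it immediately.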