facebookresearch/faiss

Python: GpuIndexIVFFlat with externally provided GpuIndexFlat quantizer crashes if quantizer goes out of scope

Closed this issue · 2 comments

Summary

While experimenting with IVFFlat indexes in Python, I noticed what I believe to be a bug in the GPU implementation.
In short, if you pass an index object as the quantizer in the constructor call to GpuIndexIVFFlat and that
object goes out of scope or is manually deleted with `del`, Faiss crashes (assertion failure or SIGSEGV) the next time the index is used.

The CPU implementation does not show this behavior: it keeps working even after the coarse quantizer has been deleted with `del`.

Platform

OS: Ubuntu 22.04

Faiss version: 1.9.0. It also affects a custom fork based on 1.7.4, which is where I first discovered it, so the issue probably predates 1.9.0.

Installed from: anaconda

Faiss compilation options:

Running on:

  • CPU
  • GPU

Interface:

  • C++
  • Python

Reproduction instructions

CPU case that works robustly

import numpy as np
import faiss

dim = 10
nv = 1000

db = np.random.rand(nv, dim)

idx_coarse = faiss.IndexFlat(dim, faiss.METRIC_L2)

nlist = 1  # number of inverted lists; the constructor expects it between dim and the metric

idx = faiss.IndexIVFFlat(idx_coarse, dim, nlist, faiss.METRIC_L2)

del(idx_coarse)  # delete the coarse quantizer

idx.train(db)  # no problem

# del(idx_coarse)  # if we do it here, also no problem

idx.add(db)

# del(idx_coarse)  # if we do it here, also no problem

idx.search(db, 1)

GPU case that crashes

import numpy as np
import faiss

dim = 10
nv = 1000

db = np.random.rand(nv, dim)

res = faiss.StandardGpuResources()

idx_coarse = faiss.GpuIndexFlat(res, dim, faiss.METRIC_L2)

nlist = 1  # number of inverted lists; the constructor expects it between dim and the metric

idx = faiss.GpuIndexIVFFlat(res, idx_coarse, dim, nlist, faiss.METRIC_L2)

del(idx_coarse)  # deletion site (1)

idx.train(db)  # BOOM (1): assertion error. The consistency check that the quantizer's `d` equals the index's `d` fails because the freed quantizer's `d` field holds undefined values.

# del(idx_coarse)  # if we delete here instead (2)

idx.add(db)  # BOOM (2): SIGSEGV

# del(idx_coarse)  # if we delete here instead (3)

idx.search(db, 1)  # BOOM (3): SIGSEGV
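
A possible user-side workaround is to make the index object itself hold a Python reference to the quantizer, so that dropping the local name no longer frees it. The sketch below runs without Faiss or a GPU: `FakeQuantizer` and `FakeIndex` are stand-in classes, `keep_alive` is a hypothetical helper (not part of Faiss), and the attribute name `referenced_objects` is an assumption borrowed from the CPU wrappers. Any attribute name would serve.

```python
import gc
import weakref


def keep_alive(owner, *dependencies):
    """Store references to `dependencies` on `owner` so they live as long as it does.

    Hypothetical helper, not a Faiss API.
    """
    refs = getattr(owner, "referenced_objects", None)
    if refs is None:
        refs = []
        owner.referenced_objects = refs
    refs.extend(dependencies)
    return owner


class FakeQuantizer:
    """Stand-in for faiss.GpuIndexFlat (assumption, for illustration only)."""


class FakeIndex:
    """Stand-in for faiss.GpuIndexIVFFlat (assumption, for illustration only)."""


quantizer = FakeQuantizer()
probe = weakref.ref(quantizer)  # lets us observe whether the object was freed
index = keep_alive(FakeIndex(), quantizer)

del quantizer  # drop the only named reference, as in the reproduction
gc.collect()

# The index still holds the quantizer, so the weak reference stays valid.
quantizer_survived = probe() is not None
```

With Faiss, the equivalent would presumably be calling `keep_alive(idx, idx_coarse)` right after constructing `idx`. This has not been verified against every Faiss version.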

Seems plausible: GpuIndexIVFFlat does not add a reference to the quantizer in its constructor, as is done for the CPU indexes.

https://github.com/facebookresearch/faiss/blob/main/faiss/python/__init__.py#L162
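
For readers unfamiliar with the mechanism, here is a simplified sketch of what "adding the ref in the constructor" amounts to. The helper name `add_ref_in_constructor` follows the Faiss Python wrapper, but the body is a reduced approximation and the classes `Quantizer`, `PatchedIndex`, and `UnpatchedIndex` are hypothetical stand-ins, so that the example runs without Faiss.

```python
import gc
import weakref


def add_ref_in_constructor(cls, parameter_no):
    """Patch cls.__init__ so each instance keeps a reference to one positional argument."""
    original_init = cls.__init__

    def replacement_init(self, *args):
        original_init(self, *args)
        # Holding the argument here ties its lifetime to the instance.
        self.referenced_objects = [args[parameter_no]]

    cls.__init__ = replacement_init


class Quantizer:
    """Hypothetical stand-in for a coarse quantizer."""


class UnpatchedIndex:
    """Mimics a wrapper whose native side keeps only a raw pointer to the quantizer."""

    def __init__(self, resources, quantizer, dim):
        self.dim = dim  # deliberately no Python reference to `quantizer`


class PatchedIndex(UnpatchedIndex):
    def __init__(self, resources, quantizer, dim):
        super().__init__(resources, quantizer, dim)


# The quantizer is positional argument 1 (after `resources`).
add_ref_in_constructor(PatchedIndex, 1)


def quantizer_survives(index_cls):
    """Return True if the quantizer is still alive after its local name is deleted."""
    quantizer = Quantizer()
    probe = weakref.ref(quantizer)
    index = index_cls(object(), quantizer, 10)
    del quantizer
    gc.collect()
    return probe() is not None


unpatched_result = quantizer_survives(UnpatchedIndex)
patched_result = quantizer_survives(PatchedIndex)
```

In the unpatched case the quantizer is collected as soon as its name is deleted, which corresponds to the dangling pointer behind the GPU crash. The patched class keeps it alive for as long as the index exists.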

Should be fixed