facebookresearch/faiss

IVFPQ-FastScan-Refine performance and parameters setting

zeke-cheung opened this issue · 1 comments

Summary

Platform

Running on:

  • CPU

Interface:

  • Python

Reproduction instructions

import faiss

from faiss.contrib.evaluation import knn_intersection_measure
from faiss.contrib import datasets

ds = datasets.SyntheticDataset(64, 5000, 10000, 1000)
d = 64                           # dimension


m = 8   # number of PQ sub-vectors (d must be divisible by m)
k = 4   # number of dimensions in the extracted vector (not used below)
n_bit = 4   # each sub-vector is encoded with 4 bits
bbs = 32    # block size used by the fast-scan PQ (must be a multiple of 32)

nlist = 50

quantizer = faiss.IndexFlatL2(d)  # this remains the same
index = faiss.IndexIVFPQFastScan(quantizer, d, nlist, m, n_bit, faiss.METRIC_L2, bbs)


index_refine = faiss.IndexRefineFlat(index)
assert not index_refine.is_trained
index_refine.train(ds.get_train())  # train on the synthetic training vectors
assert index_refine.is_trained
index_refine.add(ds.get_database())

params = faiss.IndexRefineSearchParameters(k_factor=10)
D, I = index_refine.search(ds.get_queries(), 100, params=params)
KIM = knn_intersection_measure(I, ds.get_groundtruth())
# fraction of queries whose true nearest neighbor is among the 100 results
recall_at_100 = (I[:, :100] == ds.get_groundtruth()[:, :1]).sum() / ds.nq
print(recall_at_100)
print(I)
print(D)

Hello, I want to implement an IVFPQ + FastScan + refine index, the same algorithm used in ann-benchmarks. The script above gets a recall@100 of 0.38, and I don't know why it is so low. Could you help me, please?
In addition, I have some minor questions about the parameters of IndexIVFPQFastScan. Should the first parameter, the quantizer, simply be the default IndexFlatL2(d)? If I want to change the number of quantized sub-spaces m, can I change it directly in the IndexIVFPQFastScan arguments, or do I need to define a ProductQuantizer instance with the proper parameters and pass it to IndexIVFPQFastScan?
Thank you for your time.

Thank you for providing the sample code!

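By default, nprobe for an IVF index is 1, so each query scans only one of the nlist = 50 inverted lists, which is why the recall is low. nprobe can be raised by setting index.nprobe on the IVF index.
I tried the same code with different nprobe settings: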
nprobe = 1, recall = 0.38
nprobe = 2, recall = 0.565
nprobe = 4, recall = 0.729
nprobe = 8, recall = 0.888

The first argument of IndexIVFPQFastScan is the coarse quantizer, the index that assigns each vector to one of the IVF lists. IndexFlatL2(d) works for that. If you want to change how the coarse assignment itself is done, you can pass some other index as the quantizer, such as IndexHNSWFlat. If you want to change the parameters of the fine quantizer, i.e. the PQ fast-scan part of IVFPQFastScan (for example m), you can change them directly in the IndexIVFPQFastScan arguments.

Let me know if this doesn't answer your question.