IVFPQ-FastScan-Refine performance and parameters setting
zeke-cheung opened this issue · 1 comments
Summary
Platform
OS:
Faiss version:
Installed from:
Faiss compilation options:
Running on:
- CPU
Interface:
- Python
Reproduction instructions
import faiss
from faiss.contrib.evaluation import knn_intersection_measure
from faiss.contrib import datasets
ds = datasets.SyntheticDataset(64, 5000, 10000, 1000)
d = 64 # vector dimension
m = 8 # number of PQ sub-vectors (sub-quantizers)
k = 4 # number of dimensions in the extracted vector (unused below)
n_bit = 4 # bits per sub-vector code (FastScan requires 4)
bbs = 32 # block size for the FastScan PQ layout ( bbs % 32 == 0 )
nlist = 50 # number of IVF lists (coarse clusters)
quantizer = faiss.IndexFlatL2(d) # coarse quantizer that assigns vectors to IVF lists
index = faiss.IndexIVFPQFastScan(quantizer, d, nlist, m, n_bit, faiss.METRIC_L2, bbs)
index_refine = faiss.IndexRefineFlat(index)
assert not index_refine.is_trained
index_refine.train(ds.get_train()) # train on the synthetic training vectors
assert index_refine.is_trained
index_refine.add(ds.get_database())
params = faiss.IndexRefineSearchParameters(k_factor=10) # re-rank k_factor * k candidates with exact distances
D, I = index_refine.search(ds.get_queries(), 100, params=params)
KIM = knn_intersection_measure(I, ds.get_groundtruth())
# fraction of queries whose true nearest neighbor appears in the top 100 results
recall_at_100 = (I[:, :100] == ds.get_groundtruth()[:, :1]).sum() / 1000
print(recall_at_100)
print(I)
print(D)
Hello, I want to build an IVFPQ + FastScan + refine index, the same algorithm used in ann-benchmark. The script above gives a recall@100 of 0.38, and I don't know why it is so low. Could you help me, please?
I also have a few minor questions about the parameters of IndexIVFPQFastScan. Should the first parameter, the quantizer, simply be the default IndexFlatL2(d)? If I want to change the number of quantized sub-spaces m, can I set it directly in the IndexIVFPQFastScan parameters, or do I need to define a ProductQuantizer instance with the proper parameters and pass it to IndexIVFPQFastScan?
Thank you for your time.
Thank you for providing the sample code!
By default, nprobe for IVF is 1, so only one inverted list is searched per query.
I tried the same code with different nprobe settings:
nprobe = 1, recall = 0.38
nprobe = 2, recall = 0.565
nprobe = 4, recall = 0.729
nprobe = 8, recall = 0.888
The coarse quantizer in IndexIVFPQFastScan is the index that assigns vectors to the IVF lists. If you want a different coarse quantizer, you can pass another index, such as IndexHNSWFlat, in place of IndexFlatL2. If you want to change the parameters of the fine quantizer, i.e. the FastScan PQ inside IndexIVFPQFastScan, you can change them directly in the IndexIVFPQFastScan parameters.
Let me know if this doesn't answer your question.