Translated by New Bing
A local image search tool based on feature point matching
Mainly based on the following projects:
- ORB_SLAM3 - Its ORB extractor avoids the overly clustered feature points produced by traditional ORB algorithms
- faiss - Similarity search over large sets of vectors
Install OpenCV and faiss first.
Note: When compiling faiss, setting -DFAISS_OPT_LEVEL=avx2 is recommended to maximize performance.
cargo install --git https://github.com/lolishinshi/imsearch
Before the first run, you need to train the index. Choose K based on the approximate number of images you plan to add:
- 2k ~ 20k images: use K = 65536, requires at least 5.2k images for training
- 20k ~ 200k images: use K = 262144, requires at least 21k images for training
- 200k ~ 2M images: use K = 1048576, requires at least 82k images for training
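The table above can be encoded as a small lookup, which is handy when scripting the training step. This is only a sketch: `choose_k` and `TIERS` are hypothetical names that are not part of imsearch, and the handling of the shared boundaries (20k and 200k) and of sizes outside the table is an assumption.

```python
# Tiers copied from the list above: (max images, K, minimum training images).
# Assumption: a size exactly on a shared boundary falls into the smaller tier.
TIERS = [
    (20_000, 65_536, 5_200),
    (200_000, 262_144, 21_000),
    (2_000_000, 1_048_576, 82_000),
]


def choose_k(expected_images: int) -> tuple[int, int]:
    """Return (K, minimum training images) for the expected collection size."""
    if expected_images < 2_000:
        # The table starts at 2k images; smaller collections are not covered.
        raise ValueError("the table starts at 2,000 images")
    for upper, k, min_train in TIERS:
        if expected_images <= upper:
            return k, min_train
    raise ValueError("the table stops at 2,000,000 images")


if __name__ == "__main__":
    k, min_train = choose_k(150_000)
    print(f"K = {k}, train on at least {min_train} images")
```

The returned K is the value to pass as the first argument of utils/train.py.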
Then put the training images in the train folder and run imsearch add-images train to add them.
Then run imsearch export-data to export train.npy.
Then run python utils/train.py K train.npy to train the index. The trained result will be saved in ~/.config/imsearch/index
Note: Training on large datasets is very time-consuming. When K = 1048576 and the number of training images is 100k, it took two 3080s 16 hours to complete the training.
Use imsearch add-images DIR to add all images in the specified directory.
Use imsearch build-index to build the index. This step is also very slow: on a 3970X, building an index for 10k images takes about 20~40 minutes.
Note: You can set RUST_LOG=debug to print detailed logs and observe progress.
# Let imsearch print detailed logs
export RUST_LOG=debug
# Search for a single image directly with default parameters
imsearch search-image test.jpg
# --mmap: No need to load the entire index into memory
# --nprobe=128: Search the 128 nearest buckets, which improves accuracy but takes more time
imsearch --mmap --nprobe=128 search-image test.jpg
# Start the server and listen on 127.0.0.1:8000
imsearch --mmap start-server
# Use httpie to search for an image through the web API
http --form http://127.0.0.1:8000/search file@test.jpg
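The same request can be sent without httpie. The sketch below uses only the Python standard library and mirrors the command above: a multipart/form-data POST with the image in a form field named file. The helper names `build_multipart` and `search` are hypothetical, and because the response format is not described here, the body is returned as raw bytes.

```python
import mimetypes
import os
import urllib.request
import uuid


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def search(path: str, url: str = "http://127.0.0.1:8000/search") -> bytes:
    """POST the image at `path` to a running server and return the raw response."""
    with open(path, "rb") as f:
        body, content_type = build_multipart("file", os.path.basename(path), f.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}
    )  # supplying data makes urllib send a POST
    with urllib.request.urlopen(request) as response:
        return response.read()


if __name__ == "__main__":
    # Requires a server started with: imsearch --mmap start-server
    print(search("test.jpg").decode(errors="replace"))
```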
Search time: For an index of 2.5 million images, a single search takes about 0.5s on a 3970X.
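The --nprobe option trades search time for accuracy. The toy index below illustrates why, assuming an inverted-file layout in which every vector is stored in the bucket of its nearest centroid, as the K and bucket terminology suggests. It is not imsearch's or faiss's implementation: `ToyIVF`, the 64-bit random descriptors, and the random centroids are all made up for illustration.

```python
import random
from typing import Optional


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


class ToyIVF:
    """Toy inverted-file index: each vector lives in its nearest centroid's bucket."""

    def __init__(self, centroids: list[int]):
        self.centroids = centroids
        self.buckets: list[list[int]] = [[] for _ in centroids]

    def _ranked_buckets(self, vector: int) -> list[int]:
        # Bucket ids ordered from nearest centroid to farthest.
        return sorted(
            range(len(self.centroids)),
            key=lambda i: hamming(vector, self.centroids[i]),
        )

    def add(self, vector: int) -> None:
        self.buckets[self._ranked_buckets(vector)[0]].append(vector)

    def search(self, query: int, nprobe: int = 1) -> Optional[tuple[int, int]]:
        """Best (distance, vector) among the nprobe nearest buckets, or None if empty."""
        candidates = [
            v for b in self._ranked_buckets(query)[:nprobe] for v in self.buckets[b]
        ]
        return min(((hamming(query, v), v) for v in candidates), default=None)


rng = random.Random(0)
BITS = 64  # illustrative size only
index = ToyIVF([rng.getrandbits(BITS) for _ in range(32)])
for _ in range(2000):
    index.add(rng.getrandbits(BITS))

query = rng.getrandbits(BITS)
for nprobe in (1, 4, 32):
    print(nprobe, index.search(query, nprobe))
```

A larger nprobe scans a superset of the buckets scanned by a smaller one, so the best distance found can only stay the same or improve, while the number of comparisons grows. With nprobe equal to the number of buckets, the search becomes exhaustive.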