The aim of this work is to find similar images in our dataset, namely similar jean, t-shirt, TV, or sofa images. We leveraged the MongoDB Atlas Vector Search feature to build the image similarity search system. The dataset images are converted into embeddings and hosted on a MongoDB Atlas cluster. At retrieval time, the query image is first converted into an embedding, and MongoDB Atlas Vector Search then returns the top k most similar images, with k=5 in our case. Cosine similarity is used as the distance measure, and embeddings are generated with a pretrained Vision Transformer (ViT) model.
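As a rough illustration of the embedding step, the sketch below loads an image and extracts a single vector from a pretrained ViT model with the Hugging Face Transformers library. The checkpoint name (`google/vit-base-patch16-224-in21k`) and the choice of the [CLS] token as the image embedding are assumptions, not necessarily the exact settings used in this repository.

```python
# Hedged sketch: turning an image into an embedding with a pretrained ViT.
# The checkpoint and pooling strategy below are assumptions.
import torch
from PIL import Image
from transformers import ViTImageProcessor, ViTModel

MODEL_NAME = "google/vit-base-patch16-224-in21k"  # assumed checkpoint
processor = ViTImageProcessor.from_pretrained(MODEL_NAME)
model = ViTModel.from_pretrained(MODEL_NAME)

def embed_image(path):
    """Return one embedding vector (list of floats) for the image at `path`."""
    image = Image.open(path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Use the [CLS] token of the last hidden state as the image embedding
    # (768 dimensions for ViT-base).
    return outputs.last_hidden_state[:, 0, :].squeeze().tolist()
```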
The dataset used in this work was downloaded from Kaggle and contains 796 images divided into 4 classes: Jean, Tshirt, TV and Sofa.
- Transformers
- OS
- Pillow
- Requests
- Glob
- Matplotlib
- Numpy
- Dotenv
- PyMongo
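The third-party packages above can be installed with pip; `os` and `glob` ship with the Python standard library. The PyPI names below are the usual ones and are listed as a convenience (`torch` is assumed as the Transformers backend and is not part of the original list):

```bash
pip install transformers torch pillow requests matplotlib numpy python-dotenv pymongo
```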
- Load all images from your dataset and create their embeddings via a pretrained Vision Transformer model (see the embedding sketch above).
- Pair each image_filename with its corresponding embedding in a dictionary and store the documents in a MongoDB Atlas database (a storage sketch follows this list).
- Create a vector search index in MongoDB Atlas (see the image below and the index definition sketch after this list) to be used later for image retrieval.
- Load the query image, create its embedding, then retrieve similar images (a query sketch follows this list).
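A minimal sketch of the first two steps, assuming the dataset lives under a `dataset/` folder of `.jpg` files, a `MONGODB_URI` variable in the `.env` file, and `image_search.embeddings` as database and collection names; it reuses the `embed_image` helper sketched earlier:

```python
# Hedged sketch: embed every dataset image and store {image_filename, embedding}
# documents in a MongoDB Atlas collection. Paths, env variable, database and
# collection names below are assumptions.
import glob
import os

from dotenv import load_dotenv
from pymongo import MongoClient

load_dotenv()  # reads MONGODB_URI (the Atlas connection string) from the .env file
client = MongoClient(os.environ["MONGODB_URI"])
collection = client["image_search"]["embeddings"]

docs = []
for path in glob.glob("dataset/**/*.jpg", recursive=True):
    docs.append({
        "image_filename": os.path.basename(path),
        "embedding": embed_image(path),  # helper from the ViT sketch above
    })
collection.insert_many(docs)
```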
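The search index itself is created through the Atlas UI, as shown in the image. For reference, a vector search index definition for this setup would look roughly like the following, assuming 768-dimensional ViT-base embeddings stored in an `embedding` field and cosine similarity:

```json
{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 768,
      "similarity": "cosine"
    }
  ]
}
```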
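Finally, a sketch of the retrieval step using the `$vectorSearch` aggregation stage, reusing the `embed_image` helper and `collection` handle from the earlier sketches; the index name `vector_index`, the query image path, and the `numCandidates` value are assumptions:

```python
# Hedged sketch: embed the query image and retrieve the top 5 nearest neighbours
# with Atlas Vector Search. Index name and candidate count are assumptions.
query_vector = embed_image("query/example.jpg")  # hypothetical query image path

pipeline = [
    {
        "$vectorSearch": {
            "index": "vector_index",
            "path": "embedding",
            "queryVector": query_vector,
            "numCandidates": 100,
            "limit": 5,
        }
    },
    {
        "$project": {
            "_id": 0,
            "image_filename": 1,
            "score": {"$meta": "vectorSearchScore"},
        }
    },
]

for doc in collection.aggregate(pipeline):
    print(doc["image_filename"], doc["score"])
```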
Below is a set of retrieval results. Based on these tests, we can observe that the system combines both object shape and color to retrieve strong matches, and it finds the exact match of the query image.
- You need to create a .env file containing your MongoDB Atlas credentials, i.e. the connection string used to connect to your cluster (see the example after this list).
- Adjust the paths in the code based on your local directory.
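A `.env` file could look like the following; the variable name `MONGODB_URI` and the placeholders are illustrative, so substitute the connection string copied from your own Atlas cluster:

```
MONGODB_URI="mongodb+srv://<username>:<password>@<cluster>.mongodb.net/?retryWrites=true&w=majority"
```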