An experiment on the accuracy of similarity searches over a dataset with known similar and dissimilar content. See the write-up in similarity_exp.md (or on Medium) and the Jupyter notebook similarity.ipynb.
Part 2, in similarity2.ipynb, uses a vector database (Pinecone) to store embeddings from two models ("all-MiniLM-L6-v2" and "text-embedding-ada-002") and runs similarity searches with different metrics. See the associated blog post.
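
To give a rough idea of what "different metrics" can mean, here is a minimal sketch comparing three common ones: cosine similarity, dot product, and Euclidean distance. It is not taken from the notebooks. The function names and the toy 3-dimensional vectors are assumptions for illustration only; real embeddings from the models above have hundreds of dimensions, and the notebooks may use a different set of metrics.

```python
import math

def dot(a, b):
    # Dot product: grows with both alignment and vector magnitude.
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Cosine similarity: compares direction only, ignoring magnitude.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    # Euclidean distance: straight-line distance, so smaller means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical toy vectors standing in for real embeddings.
query = [1.0, 0.0, 1.0]
similar = [2.0, 0.0, 2.0]     # same direction as query, larger magnitude
dissimilar = [0.0, 1.0, 0.0]  # orthogonal to query

for name, vec in [("similar", similar), ("dissimilar", dissimilar)]:
    print(
        name,
        "cosine:", round(cosine_similarity(query, vec), 4),
        "dot:", round(dot(query, vec), 4),
        "euclidean:", round(euclidean_distance(query, vec), 4),
    )
```

Note that the metrics disagree on scale and direction: higher is better for cosine and dot product, while lower is better for Euclidean distance. In this toy case the "similar" vector has a cosine similarity of about 1 with the query even though its Euclidean distance is not zero, which is one reason results can differ between metrics.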