
Multimodal Retrieval with Text Embedding and CLIP Image Embedding for Backyard Birds

Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).

multi_modal_retrieval_backyard_birds

A little app I created for my daughter who loves birds. :-)

[Demo animation: multimodal-backyard-birds.gif]

Check out my Medium blog post for details: "Multimodal Retrieval with Text Embedding and CLIP Image Embedding for Backyard Birds".

Implementation Steps

  • Step 1: Download backyard birds text and images
  • Step 2: Build text index for vector store and define text query engine
  • Step 3: Build image index for vector store using OpenAI CLIP embeddings
  • Step 4: Multimodal retrieval of both image and text for sample queries
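
To give a feel for how the four steps fit together, here is a minimal, self-contained sketch of the retrieval flow. It is not the notebook's code. Everything in it is an assumption made for illustration: `toy_embed` is a bag-of-words stand-in for both the text embedding model and CLIP, `ToyIndex` stands in for a vector store, and the bird descriptions, file names, and captions are made up. A real CLIP model embeds image pixels and query text into one shared space; here each hypothetical image is represented by a caption so the example runs with only the standard library.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in embedder (assumption): sparse bag-of-words counts.

    A real setup would call a text embedding model for the text index
    and CLIP for the image index.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyIndex:
    """Hypothetical in-memory vector store: a list of (payload, vector) pairs."""

    def __init__(self):
        self.items = []

    def add(self, payload, vector):
        self.items.append((payload, vector))

    def query(self, vector, top_k=1):
        # Rank every stored item by similarity to the query vector.
        ranked = sorted(self.items, key=lambda item: cosine(vector, item[1]), reverse=True)
        return [payload for payload, _ in ranked[:top_k]]


# Step 1 (stand-in): made-up bird text and image captions instead of downloads.
bird_texts = {
    "Northern Cardinal": "The northern cardinal is a red songbird with a crest.",
    "Blue Jay": "The blue jay is a noisy blue bird with a white chest.",
    "American Goldfinch": "The American goldfinch is a small yellow finch.",
}
bird_images = {
    "cardinal.jpg": "red cardinal perched on a feeder",
    "blue_jay.jpg": "blue jay on a branch",
    "goldfinch.jpg": "yellow goldfinch eating seeds",
}

# Step 2: build the text index.
text_index = ToyIndex()
for name, description in bird_texts.items():
    text_index.add(name, toy_embed(description))

# Step 3: build the image index (captions stand in for CLIP image embeddings).
image_index = ToyIndex()
for filename, caption in bird_images.items():
    image_index.add(filename, toy_embed(caption))


# Step 4: query both indexes with the same user question.
def multimodal_retrieve(query, text_index, image_index, top_k=1):
    query_vector = toy_embed(query)
    return {
        "text": text_index.query(query_vector, top_k),
        "images": image_index.query(query_vector, top_k),
    }


result = multimodal_retrieve("blue jay", text_index, image_index)
print(result)
```

The point of the sketch is the shape of the pipeline: two separate indexes, one query, and a combined result holding the best-matching text entry and the best-matching image. Swapping `toy_embed` for real embedding models changes the vectors but not this overall flow.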