Greenplum Audio Semantic Search

This repository provides instructions and code for setting up an audio/music semantic search engine that uses VMware Greenplum as a vector database.

Streamlit App

Getting Started

Follow these steps to set up the audio search system on your local machine.

Prerequisites

Step 1: Create Database Tables

  1. Run the script.sql file to create tables for storing metadata and embeddings in your Greenplum database:

    $ psql -U your_username -d your_database -a -f script.sql
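
For orientation, here is a minimal sketch of the kind of schema such a script might create. It is not the content of script.sql: the table names (audio_metadata, audio_embeddings), the column names, the 512-dimension default, and the use of the pgvector extension are all assumptions made for illustration. Check script.sql for the real definitions.

```python
# Hypothetical schema sketch. The real definitions live in script.sql;
# table names, column names and the dimension below are assumptions.
EMBEDDING_DIM = 512  # assumption: depends on the embedding model used in the notebook


def build_ddl(dim: int = EMBEDDING_DIM) -> str:
    """Return DDL for a metadata table and an embeddings table."""
    if dim <= 0:
        raise ValueError("embedding dimension must be positive")
    return f"""
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS audio_metadata (
    id        bigint PRIMARY KEY,
    title     text,
    file_path text
) DISTRIBUTED BY (id);

CREATE TABLE IF NOT EXISTS audio_embeddings (
    id        bigint PRIMARY KEY,
    embedding vector({dim})
) DISTRIBUTED BY (id);
""".strip()


# Print the statements so they can be compared with script.sql.
print(build_ddl())
```

Distributing both tables by id keeps matching rows on the same Greenplum segment, so a join between metadata and embeddings would not need to move data between segments.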

Step 2: Generate Embeddings

  1. Install the required Python packages listed in requirements.txt.
  2. Use the Audio_Semantic_Search.ipynb notebook to download the dataset, generate the embeddings, and load them into Greenplum.
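
The notebook handles embedding generation and loading. As a rough illustration of the loading half only, the sketch below formats an embedding as a pgvector text literal and pairs it with a parameterized INSERT. It assumes the embeddings column uses the pgvector `vector` type; the table name audio_embeddings, its columns, and the helper names are hypothetical and not taken from the notebook.

```python
import math
from typing import Sequence, Tuple

# Hypothetical table and column names; adjust to match script.sql.
# The %s placeholders follow the psycopg2 parameter style.
INSERT_SQL = "INSERT INTO audio_embeddings (id, embedding) VALUES (%s, %s::vector)"


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format a sequence of numbers as a pgvector text literal, e.g. '[0.5,1.0]'."""
    values = [float(x) for x in embedding]
    if not values:
        raise ValueError("embedding must not be empty")
    if not all(math.isfinite(v) for v in values):
        raise ValueError("embedding must contain only finite numbers")
    return "[" + ",".join(repr(v) for v in values) + "]"


def insert_params(row_id: int, embedding: Sequence[float]) -> Tuple[int, str]:
    """Build the parameter tuple that goes with INSERT_SQL."""
    return (row_id, to_vector_literal(embedding))


# With an open database cursor this could be used as:
#   cur.execute(INSERT_SQL, insert_params(1, embedding))
print(insert_params(1, [0.5, 1, -2]))
```

Passing the vector as a bound parameter rather than interpolating it into the SQL string keeps the statement safe to reuse for every row.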

Step 3: Build Docker Image

  1. From the directory that contains the Dockerfile, build the Docker image for the Greenplum audio search system:

    $ docker build -t greenplum-audio-search .

Step 4: Run Docker Container

  1. Run the Docker container for the audio search system in the background (-d), publishing port 8501 of the container on port 8501 of the host (-p):

    $ docker run -d -p 8501:8501 greenplum-audio-search
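
Because the container starts in the background, the app may take a few seconds before it answers. A small polling helper like the sketch below can confirm that something is responding on the published port. The function names and retry settings are illustrative, and the check only tests for an HTTP 200 response from the root URL, not for a working database connection.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable


def app_url(host: str = "localhost", port: int = 8501) -> str:
    """Build the base URL of the web app (defaults match the docker run command)."""
    return f"http://{host}:{port}"


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any connection error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def wait_until_up(probe: Callable[[], bool], attempts: int = 10, delay: float = 1.0) -> bool:
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Example usage after `docker run`:
#   url = app_url()
#   print("app is up" if wait_until_up(lambda: is_up(url)) else "app not reachable")
```

If the check keeps failing, `docker ps` and `docker logs <container>` are the usual next places to look.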

Step 5: Access the Web App

  1. Once the container is running, access the web application by opening a web browser and navigating to:

    http://localhost:8501
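
To give an idea of what a semantic search against a vector database involves, the sketch below shows a nearest-neighbour query using the pgvector cosine distance operator (`<=>`), together with a pure-Python version of the same distance. This is an assumption about how such an app could query Greenplum, not a description of this app's actual code; the table and column names are hypothetical.

```python
import math
from typing import Sequence, Tuple

# Hypothetical query: table and column names are assumptions, not taken from the repo.
# `<=>` is the pgvector cosine distance operator; smaller means more similar.
SEARCH_SQL = """
SELECT m.id, m.title, e.embedding <=> %s::vector AS distance
FROM audio_embeddings AS e
JOIN audio_metadata AS m ON m.id = e.id
ORDER BY distance
LIMIT %s
""".strip()


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Compute 1 - cosine similarity, the quantity a cosine distance operator returns."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def search_params(query_embedding: Sequence[float], k: int = 5) -> Tuple[str, int]:
    """Build the parameter tuple for SEARCH_SQL: a vector literal and a row limit."""
    if k <= 0:
        raise ValueError("k must be positive")
    literal = "[" + ",".join(repr(float(x)) for x in query_embedding) + "]"
    return (literal, k)


# With an open database cursor this could be used as:
#   cur.execute(SEARCH_SQL, search_params(query_embedding, k=5))
```

The query embedding must come from the same model that produced the stored embeddings; otherwise the distances are not meaningful.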