This repository provides instructions and code for setting up an audio/music semantic search engine that uses VMware Greenplum as a vector database.
Follow the steps below to set up the audio search system on your local machine. Before you start, make sure you have the following:
- Greenplum Database
- pgvector extension
- Kaggle API Access
- Docker
- Jupyter Notebook
- Python dependencies (requirements.txt)
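
Before starting, it can help to confirm that the command-line prerequisites are reachable. The sketch below is only a convenience check: the command names (`psql`, `docker`, `jupyter`, `kaggle`) are assumptions about how each tool is exposed on your PATH, and the helper `missing_tools` is a hypothetical name, not part of this repository. It does not verify that the pgvector extension is enabled or that Kaggle API credentials are configured.

```python
import shutil
from typing import Iterable, List

# Assumed command names for the prerequisites listed above; adjust as needed.
REQUIRED_TOOLS = ["psql", "docker", "jupyter", "kaggle"]


def missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All command-line prerequisites were found.")
```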

With the prerequisites in place, complete the following steps:
- Run the script.sql file to create the tables that store metadata and embeddings in your Greenplum database:
  $ psql -U your_username -d your_database -a -f script.sql
- Use the Audio_Semantic_Search.ipynb notebook to download the dataset, generate embeddings, and store them in Greenplum.
- Install the required Python packages listed in requirements.txt.
- Build the Docker image for the Greenplum audio search system:
  $ docker build -t greenplum-audio-search .
- Run the Docker container for the audio search system:
  $ docker run -d -p 8501:8501 greenplum-audio-search
- Once the container is running, open the web application in a browser at:
  http://localhost:8501
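
For orientation, a semantic search over embeddings stored with pgvector generally comes down to a nearest-neighbor query. The sketch below shows how such a query could be assembled, not how this repository's application does it. The table and column names (`audio_embeddings`, `track_id`, `embedding`) are hypothetical, since the real schema is defined in script.sql, and the `%s` placeholders assume a psycopg2-style driver. The `<=>` operator is pgvector's cosine distance.

```python
from typing import Sequence


def to_pgvector_literal(embedding: Sequence[float]) -> str:
    """Format an embedding in pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_search_query(table: str = "audio_embeddings",
                       id_column: str = "track_id",
                       vector_column: str = "embedding") -> str:
    """Build a parameterized top-k query ordered by cosine distance.

    The default identifiers are hypothetical. They are interpolated directly,
    so pass only trusted, hard-coded names, never user input.
    """
    return (
        f"SELECT {id_column}, {vector_column} <=> %s::vector AS distance "
        f"FROM {table} ORDER BY distance LIMIT %s"
    )


if __name__ == "__main__":
    query = build_search_query()
    # First parameter: the query embedding; second: how many results to return.
    params = (to_pgvector_literal([0.1, 0.2, 0.3]), 5)
    print(query)
    print(params)
```

With a database driver, the query string and parameter tuple would be passed together to the cursor's execute call. The query embedding must come from the same model used to embed the dataset.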