The goal of this project is to build a Gradio interface around the AudioLDM model hosted on Hugging Face, so that users can generate a .wav audio file from a text prompt and download it.
The work consists of three steps:
- Create a process that generates the audio from the prompt
- Convert the generated audio to a known format, in this case .wav
- Build the interface that receives the prompt and the file name
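The three steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the checkpoint id `cvssp/audioldm-s-full-v2`, the 16 kHz sample rate, the inference settings, and every function name here (`normalize_filename`, `load_pipeline`, `generate_wav`, `build_interface`) are assumptions made for the example.

```python
from pathlib import Path

# Assumptions (not stated in the project description): the checkpoint id,
# the 16 kHz sample rate of that checkpoint, and all function names below.
MODEL_ID = "cvssp/audioldm-s-full-v2"
SAMPLE_RATE = 16000


def normalize_filename(name: str) -> str:
    """Drop any directory part and make sure the name ends in .wav."""
    base = Path(name.strip()).name
    if base.lower().endswith(".wav"):
        base = base[:-4]
    return f"{base or 'output'}.wav"


def load_pipeline():
    """Step 1 setup: load AudioLDM (heavy imports are deferred until needed)."""
    import torch
    from diffusers import AudioLDMPipeline

    use_gpu = torch.cuda.is_available()
    dtype = torch.float16 if use_gpu else torch.float32
    pipe = AudioLDMPipeline.from_pretrained(MODEL_ID, torch_dtype=dtype)
    return pipe.to("cuda" if use_gpu else "cpu")


def generate_wav(pipe, prompt: str, filename: str) -> str:
    """Steps 1 and 2: generate audio from the prompt and save it as .wav."""
    from scipy.io import wavfile

    # The pipeline returns a batch of NumPy waveforms; take the first one.
    audio = pipe(prompt, num_inference_steps=10, audio_length_in_s=5.0).audios[0]
    path = normalize_filename(filename)
    wavfile.write(path, rate=SAMPLE_RATE, data=audio)
    return path


def build_interface(pipe):
    """Step 3: a Gradio form taking the prompt and the file name."""
    import gradio as gr

    return gr.Interface(
        fn=lambda prompt, filename: generate_wav(pipe, prompt, filename),
        inputs=[gr.Textbox(label="Prompt"), gr.Textbox(label="File name")],
        outputs=gr.Audio(label="Generated audio", type="filepath"),
    )


# To start the app: build_interface(load_pipeline()).launch()
```

Returning a file path to a `gr.Audio` output lets the user both play the result and download it from the browser. A `gr.File` output would be another reasonable choice if only the download matters.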
The entire process was developed in Google Colab and then moved to Hugging Face Spaces for deployment. The programming language used is Python.
Install the following Python packages before running the code:
- torch
- transformers
- diffusers
- gradio
- scipy
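These packages are normally installed with pip (`pip install torch transformers diffusers gradio scipy`). As an optional sanity check before running the app, a small script along these lines can report which of them are not yet importable. The helper name `missing_packages` is hypothetical and only serves this example.

```python
from importlib.util import find_spec

# Package names taken from the list above.
REQUIRED = ["torch", "transformers", "diffusers", "gradio", "scipy"]


def missing_packages(names=REQUIRED):
    """Return the packages from `names` that cannot be found in this environment."""
    # find_spec locates a top-level module without importing it.
    return [name for name in names if find_spec(name) is None]


# Example: print(missing_packages() or "all required packages found")
```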