
Latent Space Sound Design Tool based on the VAE of stable-audio-open


latent-mixer

Image of a blender

A sound design tool for creating experimental music and exploring latent space.

License: MIT


Goals

My main goal for this tool is to provide a quick and easy way to mix two different samples into new and interesting sounds.

The tool interpolates between the embeddings of the two samples by taking a weighted average of them. You can then apply a sequence of transformations to the resulting embedding (currently scaling, rotation, and a nonlinear transform).
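As a rough illustration of that pipeline, here is a minimal NumPy sketch. It is not the tool's actual code: the function names, the `(64, 128)` embedding shape, the pairwise channel rotation, and the use of `tanh` as the nonlinear transform are all assumptions made for the example.

```python
import numpy as np

def interpolate(a, b, weight):
    """Weighted average of two embeddings: weight=0 returns a, weight=1 returns b."""
    return (1.0 - weight) * a + weight * b

def scale(z, factor):
    """Multiply every latent value by a constant factor."""
    return z * factor

def rotate_pairs(z, angle):
    """Rotate each consecutive pair of channels by `angle` radians.

    This is one hypothetical way to "rotate" an embedding; it assumes an
    even number of channels along axis 0.
    """
    c, s = np.cos(angle), np.sin(angle)
    even, odd = z[0::2], z[1::2]
    out = np.empty_like(z)
    out[0::2] = c * even - s * odd
    out[1::2] = s * even + c * odd
    return out

def nonlinear(z):
    """Squash values into (-1, 1); tanh is an assumed stand-in."""
    return np.tanh(z)

# Random stand-ins for two encoded samples, shaped (channels, time steps).
rng = np.random.default_rng(0)
emb_a = rng.standard_normal((64, 128))
emb_b = rng.standard_normal((64, 128))

# Mix the two embeddings, then apply the transformations one after another.
mixed = interpolate(emb_a, emb_b, 0.25)
for transform in (
    lambda z: scale(z, 1.5),
    lambda z: rotate_pairs(z, np.pi / 2),
    nonlinear,
):
    mixed = transform(mixed)
```

In the real tool, the resulting embedding would be passed to the VAE decoder to get audio back. Because the transformations are applied sequentially, their order affects the result.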

Screenshot of the tool

Running

Install deps

uv venv
source .venv/bin/activate
uv pip install -r requirements.txt

Download and extract the VAE checkpoint

Shoutout to lyra for the recipe (her post on Twitter).

import torch
from stable_audio_tools import get_pretrained_model

# Load the full model, then save only the VAE (pretransform) weights
model, model_config = get_pretrained_model("stabilityai/stable-audio-open-1.0")
torch.save({"state_dict": model.pretransform.model.state_dict()}, "vae.ckpt")

Start the backend

fastapi dev main.py

Open the backend, running at http://localhost:8000, in your browser.


If you find this interesting, please consider: