guyyariv/AudioToken

This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation.

Python, MIT

Issues

Evaluation metrics
#10 opened 3 months ago by darius522
0
Pretrained model(s)
#9 opened 3 months ago by darius522
5
Problem with FP16
#8 opened 8 months ago by arielkantorovich
0
How to use "lora_layers_learned_embeds.bin" inference.py
#6 opened 8 months ago by DthdZK
1
Publish embedder_learned_embeds.bin for SD 2 (Feature/Request)
#7 opened 8 months ago by arielkantorovich
1
I cannot find any audio files of the VGGSound dataset
#5 opened 9 months ago by DthdZK
3
Some details about how to inference
#4 opened 9 months ago by DthdZK
2
Speech and Image Embeddings
#3 opened 10 months ago by lokesh12345678910
1
"test_data_dir" in inference
#2 opened a year ago by AndyCA111
1
[BUG] safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge
#1 opened a year ago by ZeyueT
3