archinetai/audio-diffusion-pytorch

Audio generation using diffusion models, in PyTorch.

Python · MIT

Issues

NaN after training for a while
#52 opened a year ago by jameshball
10
Weird spikes in the loss
#84 opened a month ago by return-nihil
0
RuntimeError: The size of tensor a (37) must match the size of tensor b (36) at non-singleton dimension 2
#81 opened 5 months ago by heury
4
RuntimeError: The size of tensor a (91) must match the size of tensor b (90) at non-singleton dimension 2
#83 opened 3 months ago by erikqu
0
Unconditional Generation generates noise
#82 opened 3 months ago by reachomk
0
Model architectures from the paper
#66 opened a year ago by AI-Guru
4
Unconditional model generates okay-quality fake human voice but fails on music.
#80 opened 8 months ago by piobmx
4
Questions about conditional generation
#61 opened a year ago by AI-Guru
2
What is the structure of the encoder in DiffusionAE?
#78 opened 8 months ago by SuperiorDtj
1
CUDA out of memory on an 80GB A100 when following the Moûsai paper's text-conditioning settings
#79 opened 8 months ago by SuperiorDtj
0
I have a few questions about 1D-UNet
#76 opened 10 months ago by 0417keito
4
Class-conditional generation
#72 opened a year ago by aqibsaeed
1
AssertionError: ClassiferFreeGuidancePlugin requires embedding
#71 opened a year ago by gg4u
0
Can the repo be used to process MIDI data?
#69 opened a year ago by zsy1987
0
Future Work - Models
#67 opened a year ago by AI-Guru
0
Languages
#64 opened a year ago by barredo
1
Trained models
#63 opened a year ago by aoezis
0
New Try
#60 opened a year ago by Leezp99
0
Could you provide an example recipe?
#51 opened a year ago by gandolfxu
12
Add support to clip predicted samples to the desired range.
#55 opened a year ago by Kinyugo
2
How to train conditional audio diffusion without text conditioning?
#49 opened a year ago by LeonJoe13
3
Spectrogram-based diffusion model
#59 opened a year ago by Tinglok
2
Alternative Noises: Offset, Pyramid, Pink
#56 opened a year ago by torridgristle
2
What loss function is being used?
#53 opened a year ago by kitchWWW
2
How to use our own background-noise dataset to generate samples?
#48 opened a year ago by haloha123
1
Question: sigma_t is not sampled from 0 to 1 in v-diffusion, unlike what your thesis describes. Will this cause any trouble?
#50 opened a year ago by emailandxu
1
VRAM requirements?
#44 opened a year ago by illtellyoulater
3
Typo in Paper
#45 opened a year ago by hu-po
1
How to convert to wav file to listen to result?
#46 opened a year ago by dillfrescott
6
Support usage with non-audio data, e.g. spectrograms
#39 opened a year ago by Kinyugo
7
Cannot open the music examples website
#47 opened a year ago by Liujingxiu23
2
Am I training the model correctly?
#33 opened 2 years ago by cat-policlot
2
example for Unconditional Generator fails
#43 opened a year ago by MultiTrickFox
2
custom dataset
#38 opened a year ago by lxa9867
0
Training with T5 conditioning
#34 opened 2 years ago by nikuson
5
Exploding loss
#35 opened 2 years ago by alexrodi
3
Pre-trained Weights of AutoEncoder
#27 opened 2 years ago by JustinYuu
1
Error Locating Target
#26 opened 2 years ago by ModeratePrawn
1
NaN outputs when the number of sampling steps is set to 1
#15 opened 2 years ago by Kinyugo
3
Using the audio_975 model with colab fails
#21 opened 2 years ago by timohear
2
Question: Scaling guide/suggested parameters?
#5 opened 2 years ago by zaptrem
10
Add trainer
#3 opened 2 years ago by nateraw
4
text conditioned infinite ASMR generator?
#2 opened 2 years ago by lucidrains
4