yangdongchao/Text-to-sound-Synthesis
The source code of our paper "Diffsound: discrete diffusion model for text-to-sound generation"
Python
Issues
ModuleNotFoundError: No module named 'clip'
#26 opened by oloooooo - 3
Pre-trained model on audiocaps
#25 opened by XinleiNIU - 1
Where is the pretrained model for vocoder?
#24 opened by zhouyong64 - 0
Please add a LICENSE file
#23 opened by rafaelvalle - 0
Unable to access the pre-trained models
#22 opened by soujanyaporia - 0
Add License
#21 opened by sahajsk21 - 0
Location of audiocaps database
#20 opened by jasonfilos - 1
Pre-trained model Download issue
#19 opened by hoyeongchoi - 1
Pretrained model
#18 opened by mahmoudalismail - 2
provide examples with [mask] token?
#16 opened by jzq2000 - 0
Can Diffsound do unconditional generation?
#17 opened by Allencheng97 - 2
Eagerly awaiting a Colab version like Disco Diffusion
#9 opened by chen-da-pang - 4
Embedding shape issue
#14 opened by YoonjinXD - 1
ModuleNotFoundError: No module named 'vocoder'
#4 opened by dto - 2
Storage of models and more
#13 opened by apolinario - 2
Where to download the melception.pt to evaluate?
#12 opened by Darius-H - 1
About CC_pretrained model
#11 opened by yizhidamiaomiao - 1
How do we use BERT or CLIP features?
#10 opened by yizhidamiaomiao - 2
How to use the codebook with the size of 512?
#7 opened by jojonki - 2
Baidu link
#6 opened by barneyhill - 3
Missing libraries "ftfy" "regex" "einops"
#5 opened by dto - 3
KeyError: 'visual.layer1.0.conv1.weight'
#3 opened by dto - 3
ImportError: cannot import name 'rank_zero_warn' from 'pytorch_lightning.utilities.distributed'
#2 opened by dto - 4
Missing package "image_synthesis"
#1 opened by dto