aFewThings
I'm a Ph.D. student in Electrical Engineering at Korea University and am interested in various ML/DL tasks.
Multimedia Information Lab. · South Korea
aFewThings's Stars
lllyasviel/ControlNet
Let us control diffusion models!
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
xiph/rnnoise
Recurrent neural network for audio noise reduction
Stability-AI/stable-audio-tools
Generative models for conditional audio generation
lucidrains/vector-quantize-pytorch
Vector (and Scalar) Quantization, in Pytorch
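As a quick orientation to this library, here is a minimal usage sketch based on its documented API; the dimensions and hyperparameters below are illustrative values, not recommendations.

```python
import torch
from vector_quantize_pytorch import VectorQuantize

# A codebook of 512 entries over 256-dim vectors; values are illustrative.
vq = VectorQuantize(
    dim=256,
    codebook_size=512,
    decay=0.8,             # EMA decay for codebook updates
    commitment_weight=1.0  # weight of the commitment loss
)

x = torch.randn(1, 1024, 256)            # (batch, sequence, dim)
quantized, indices, commit_loss = vq(x)  # quantized: (1, 1024, 256)
```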
crowsonkb/k-diffusion
Karras et al. (2022) diffusion models for PyTorch
LAION-AI/CLAP
Contrastive Language-Audio Pretraining
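For context, a minimal sketch of the symmetric contrastive objective behind CLAP-style training, written in plain PyTorch rather than with the repo's own API; the random tensors stand in for real audio- and text-encoder outputs.

```python
import torch
import torch.nn.functional as F

# Stand-ins for encoder outputs; in CLAP these come from an audio
# encoder and a text encoder, respectively.
audio_embed = torch.randn(4, 512)  # batch of 4 audio clips
text_embed = torch.randn(4, 512)   # batch of 4 matching captions

# L2-normalize so the dot product is cosine similarity.
audio_embed = F.normalize(audio_embed, dim=-1)
text_embed = F.normalize(text_embed, dim=-1)

# Pairwise similarity logits, scaled by a temperature (learned in practice).
logit_scale = torch.tensor(100.0)
logits = logit_scale * audio_embed @ text_embed.t()

# Symmetric InfoNCE loss: matched audio-text pairs lie on the diagonal.
labels = torch.arange(4)
loss = (F.cross_entropy(logits, labels) +
        F.cross_entropy(logits.t(), labels)) / 2
```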
NVlabs/edm
Elucidating the Design Space of Diffusion-Based Generative Models (EDM)
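For reference, a minimal sketch of the sampling noise schedule proposed in Karras et al. (2022), implemented from the paper's formula rather than taken from this repo; the defaults mirror the paper's image configuration.

```python
import torch

def karras_sigmas(n: int, sigma_min: float = 0.002, sigma_max: float = 80.0,
                  rho: float = 7.0) -> torch.Tensor:
    """Noise levels from Karras et al. (2022): interpolate uniformly in
    sigma^(1/rho) space, then append sigma = 0 for the final step."""
    ramp = torch.linspace(0, 1, n)
    min_inv_rho = sigma_min ** (1 / rho)
    max_inv_rho = sigma_max ** (1 / rho)
    sigmas = (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho
    return torch.cat([sigmas, torch.zeros(1)])

sigmas = karras_sigmas(18)  # 18 sampling steps, decreasing from sigma_max
```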
Text-to-Audio/AudioLCM
PyTorch implementation of AudioLCM (ACM-MM'24): efficient and high-quality text-to-audio generation with a latent consistency model.
declare-lab/tango
A family of diffusion models for text-to-audio generation.
ShihaoZhaoZSH/Uni-ControlNet
[NeurIPS 2023] Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
facebookresearch/AudioMAE
This repo hosts the code and models of "Masked Autoencoders that Listen".
sail-sg/MDT
Masked Diffusion Transformer is the SOTA for image synthesis. (ICCV 2023)
google-research/leaf-audio
LEAF is a learnable alternative to audio features such as mel-filterbanks: it can be initialized as an approximation of mel-filterbanks and then trained for the task at hand, while using a very small number of parameters.
ivcylc/qa-mdt
OpenMusic: SOTA Text-to-Music (TTM) Generation
Anima-Lab/MaskDiT
Code for Fast Training of Diffusion Models with Masked Transformers
haoheliu/AudioLDM-training-finetuning
AudioLDM training, finetuning, evaluation and inference.
Stability-AI/stable-audio-metrics
Metrics for evaluating music and audio generative models – with a focus on long-form, full-band, and stereo generations.
yukara-ikemiya/friendly-stable-audio-tools
Refactored / updated version of `stable-audio-tools`, an open-source codebase for audio/music generative models originally by Stability AI.
cwx-worst-one/EAT
[IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer
JishengBai/AudioSetCaps
A 6-Million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline
MorenoLaQuatra/audioset-download
This package aims to simplify downloading the AudioSet dataset.
ZjjConan/Multi-Modal-Adapter
The official PyTorch implementation of our CVPR 2024 paper "MMA: Multi-Modal Adapter for Vision-Language Models".
yuanzhi-zhu/mini_edm
Minimal implementation of EDM (Elucidating the Design Space of Diffusion-Based Generative Models) on CIFAR-10 and MNIST
wsntxxn/TextToAudioGrounding
The dataset and baseline code for Text-to-Audio Grounding (TAG)
zeyuxie29/PicoAudio
zeyuxie29/AudioTime
blingcho/VFLIP-esorics24