Hannieliao's Stars
krahets/hello-algo
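"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; the English version is in progress.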
rasbt/LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
datawhalechina/self-llm
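"A Guide to Using Open-Source Large Models": quickly deploy open-source large models in a Linux environment; a deployment tutorial better suited to ** beginners.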
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
ashleve/lightning-hydra-template
PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡
andrewekhalel/MLQuestions
Machine Learning and Computer Vision Engineer - Technical Interview Questions
Stability-AI/stable-audio-tools
Generative models for conditional audio generation
awesome-mlss/awesome-mlss
🤖 Machine Learning Summer School deadlines
ChenHsing/Awesome-Video-Diffusion-Models
[CSUR] A Survey on Video Diffusion Models
THUDM/ImageReward
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
Text-to-Audio/Make-An-Audio
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
LAION-AI/audio-dataset
Audio Dataset for training CLAP and other models
yangdongchao/UniAudio
The Open Source Code of UniAudio
YingqingHe/Awesome-LLMs-meet-Multimodal-Generation
🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D and audio).
haoheliu/audioldm_eval
This toolbox aims to unify audio generation model evaluation for easier comparison.
haoheliu/AudioLDM-training-finetuning
AudioLDM training, finetuning, evaluation and inference.
zhenye234/CoMoSpeech
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
yk7333/d3po
[CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model"
luosiallen/Diff-Foley
Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
yzxing87/Seeing-and-Hearing
[CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners
MontrealCorpusTools/mfa-models
Collection of pretrained models for the Montreal Forced Aligner
speedyseal/audiosetdl
Scripts for downloading AudioSet
VincentHancoder/REPARO
The official implementation of the work "REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment".
dlrudco/Fast-Audioset-Download
Download AudioSet data very quickly with youtube-dl, ffmpeg, and Python multiprocessing
MorenoLaQuatra/audiocaps-download
This package aims at simplifying the download of the AudioCaps dataset.
ExplainableML/ImageSelect
Code for the paper "If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection"
wei-zeng98/piano-a2s
End-to-end real-world polyphonic piano audio-to-score transcription with hierarchical decoding (IJCAI 2024)
Hannieliao/Baton
Official Repository of IJCAI 2024 Paper: "BATON: Aligning Text-to-Audio Model with Human Preference Feedback"
DragonLiu1995/CVPR-2024-Speech_Audio_Music-Papers
A curated collection of papers related to speech, audio and music in CVPR 2024.
yash1994/youtube-8m-videos-downloader
Download videos from YouTube-8M dataset for testing