aqzlpm11's Stars
HqWu-HITCS/Awesome-Chinese-LLM
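A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at low cost, covering base models, domain-specific fine-tuning and applications, datasets, and tutorials.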
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
cwx-worst-one/EAT
[IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer
ga642381/speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
EmulationAI/awesome-large-audio-models
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
RoyChao19477/SEMamba
This is the official implementation of the SEMamba paper. (Accepted to IEEE SLT 2024)
mosaicml/streaming
A Data Streaming Library for Efficient Neural Network Training
databricks/dbrx
Code examples and resources for DBRX, a large language model developed by Databricks
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
speechnovateur/languagecodec_tmp
Temporary anonymous version
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
sh-lee-prml/HierSpeechpp
The official implementation of HierSpeech++
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
OlaWod/FreeVC
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
Azure/MS-AMP
Microsoft Automatic Mixed Precision Library
Vaibhavs10/ml-with-audio
HF's ML for Audio study group
descriptinc/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
XingangPan/DragGAN
Official Code for DragGAN (SIGGRAPH 2023)
datawhalechina/llm-cookbook
An introductory LLM tutorial for developers: the Chinese edition of Andrew Ng's large language model course series
archinetai/audio-ai-timeline
A timeline of the latest AI models for audio generation, starting in 2023!
FastGitORG/nginx-conf
⚙️ Nginx conf of FastGit, core part of fastgit web booster module
ehabets/RIR-Generator
Generating room impulse responses
nico-zck/zotero-scholar-citations
jaywalnut310/glow-tts
A Generative Flow for Text-to-Speech via Monotonic Alignment Search
maxielee/ctrl-space-ime
AutoHotkey script to easily toggle Han/En input mode in Windows 10
sukumo28/vscode-audio-preview
VS Code extension that allows you to preview and play audio files.
PaperCutSoftware/pdfsearch
A full text search library for PDFs.
P3TERX/aria2.sh
Aria2 one-click installation and management script, enhanced edition
NVIDIA/tacotron2
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
agarden/remove-pdf-watermark
Short script for removing watermarks from PDF files. Requires pdftk.