DogeFlow's Stars
jishengpeng/WavChat
A Survey of Spoken Dialogue Models (60 pages)
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
coreweave/tensorizer
Module, Model, and Tensor Serialization/Deserialization
ohmyzsh/ohmyzsh
🙃 A delightful community-driven (with 2,400+ contributors) framework for managing your zsh configuration. Includes 300+ optional plugins (rails, git, macOS, hub, docker, homebrew, node, php, python, etc), 140+ themes to spice up your morning, and an auto-update tool that makes it easy to keep up with the latest updates from the community.
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
antgroup/echomimic_v2
EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
Project-MONAI/MONAI
AI Toolkit for Healthcare Imaging
Shubhamsaboo/awesome-llm-apps
Collection of awesome LLM apps with RAG using OpenAI, Anthropic, Gemini, and open-source models.
allenai/open-instruct
huggingface/trl
Train transformer language models with reinforcement learning.
pytorch/torchtitan
A PyTorch native library for large model training
modelscope/ClearerVoice-Studio
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
asteroid-team/asteroid
The PyTorch-based audio source separation toolkit for researchers
dmlguq456/SepReformer
Official repository of SepReformer for speech separation
huggingface/smol-course
A course on aligning smol models.
JusperLee/Dual-Path-RNN-Pytorch
Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation, implemented in PyTorch
JusperLee/Conv-TasNet
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation, a PyTorch implementation
resemble-ai/Resemblyzer
A Python package to analyze and compare voices with deep learning
Tencent/HunyuanVideo
HunyuanVideo: A Systematic Framework For Large Video Generation Model
juanmc2005/diart
A Python package to build AI-powered real-time audio applications
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
wenet-e2e/wespeaker
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
modelscope/3D-Speaker
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
MahmoudAshraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
yeyupiaoling/VoiceprintRecognition-Pytorch
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and more models may be supported in the future. It also supports the MelSpectrogram and Spectrogram data preprocessing methods.
facebookresearch/demucs
Code for the paper Hybrid Spectrogram and Waveform Source Separation
TowerYsable/speech_enhancement_awesome
alibabasglab/MossFormer
This repo provides the processed samples of the manuscript "MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-head Transformer with Convolution-augmented Joint Self-Attentions", which was submitted to ICASSP 2023.
coder/code-server
VS Code in the browser
deepinsight/insightface
State-of-the-art 2D and 3D Face Analysis Project