weiuniverse's Stars
hacksider/Deep-Live-Cam
Real-time face swap and one-click video deepfake with only a single image
karpathy/LLM101n
LLM101n: Let's build a Storyteller
mem0ai/mem0
The Memory layer for AI Agents
black-forest-labs/flux
Official inference repo for FLUX.1 models
KwaiVGI/LivePortrait
Bring portraits to life!
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
goldmansachs/gs-quant
Python toolkit for quantitative finance
anthropics/anthropic-quickstarts
A collection of projects designed to help developers quickly get started with building deployable applications using the Anthropic API
lllyasviel/Omost
Your image is almost there!
andrewyng/translation-agent
Ceelog/DictionaryByGPT4
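A vocabulary book generated by GPT-4 📚, with analyses of more than 8,000 words covering definitions, example sentences, roots and affixes, inflections, cultural background, memory tips, and short stories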
kimiyoung/transformer-xl
s3prl/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit
xdit-project/xDiT
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
asteroid-team/torch-audiomentations
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
lucidrains/transfusion-pytorch
Pytorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI
yeyupiaoling/VoiceprintRecognition-Pytorch
This project uses a variety of advanced voiceprint recognition models, such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and more models may be supported in the future. It also supports the MelSpectrogram and Spectrogram data preprocessing methods.
lucidrains/rotary-embedding-torch
Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch
liusongxiang/StarGAN-Voice-Conversion
This is a pytorch implementation of the paper: StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks
DavidDiazGuerra/gpuRIR
Python library for Room Impulse Response (RIR) simulation with GPU acceleration
facebookresearch/libri-light
Dataset for lightly supervised training using the LibriVox audiobook recordings. https://librivox.org/
schmiph2/pysepm
Python implementation of performance metrics in Loizou's Speech Enhancement book
winninghealth/WiNGPT2
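WiNGPT is a GPT-based large language model for the medical vertical domain. It aims to integrate professional medical knowledge, healthcare information, and data in order to provide the healthcare industry with intelligent services such as medical Q&A, diagnostic support, and medical knowledge, improving the efficiency of diagnosis and treatment and the quality of medical services.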
yluo42/TAC
Transform-average-concatenate (TAC) method for end-to-end ad-hoc beamforming that is invariant to microphone permutation and number.
chibui191/bitcoin_volatility_forecasting
GARCH and Multivariate LSTM forecasting models for Bitcoin realized volatility with potential applications in crypto options trading, hedging, portfolio management, and risk management
wenet-e2e/wesep
Target Speaker Extraction Toolkit
KunZhou9646/Emovox
This is the implementation of the paper "Emotion Intensity and its Control for Emotional Voice Conversion".
wotouteng/fens.me
chentuochao/Target-Conversation-Extraction
This is the code and dataset repo for Interspeech 2024 paper "Target conversation extraction: Source separation using turn-taking dynamics"
Alec-Wright/OpenAmp