jeremy110's Stars
fishaudio/Bert-VITS2
vits2 backbone with multilingual-bert
myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.
alxndrTL/mamba.py
A simple and efficient Mamba implementation in pure PyTorch and MLX.
johnma2006/mamba-minimal
Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
egruttadauria98/SSpaVAlDo
datawhalechina/learn-nlp-with-transformers
We want to create a repo to illustrate the usage of transformers in Chinese.
VikParuchuri/surya
OCR, layout analysis, reading order, table recognition in 90+ languages
OpenGVLab/OmniQuant
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
zju3dv/EasyVolcap
[SIGGRAPH Asia 2023 (Technical Communications)] EasyVolcap: Accelerating Neural Volumetric Video Research
NVlabs/FasterViT
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention
cmhungsteve/Awesome-Transformer-Attention
An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites
lucidrains/x-transformers
A concise but complete full-attention transformer with a set of promising experimental features from various papers
xmu-xiaoma666/External-Attention-pytorch
🍀 PyTorch implementation of various attention mechanisms, MLP, re-parameterization, and convolution modules, helpful for further understanding the papers.⭐⭐⭐
Xiaobin-Rong/gtcrn
The official implementation of GTCRN, an ultra-lite speech enhancement model.
csteinmetz1/auraloss
Collection of audio-focused loss functions in PyTorch
modelscope/3D-Speaker
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
BUTSpeechFIT/DiaPer
microsoft/generative-ai-for-beginners
21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
JaeBinCHA7/DEMUCS-for-Speech-Enhancement
We implemented the DEMUCS model for speech enhancement in the time-frequency domain, and additionally implemented HD-DEMUCS.
p0p4k/pflowtts_pytorch
Unofficial implementation of NVIDIA P-Flow TTS paper
mit-han-lab/streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
FL33TW00D/whisper-turbo
Cross-Platform, GPU Accelerated Whisper 🏎️
Audio-WestlakeU/FS-EEND
The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors". [ICASSP 2024]
jax-ml/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
DongKeon/Awesome-Speaker-Diarization
Some comprehensive papers about speaker diarization
VoxBlink/ScriptsForVoxBlink
A repo containing download guidance and corresponding scripts of the VoxBlink dataset.
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
FrenchKrab/IS2023-powerset-diarization
Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.
521xueweihan/HelloGitHub
:octocat: Share interesting, entry-level open source projects on GitHub.
mkunes/w2v2_audioFrameClassification
wav2vec2 audio classification for prosodic boundary detection and other tasks