jeremy110's Stars
lifeiteng/OmniSenseVoice
Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯
JusperLee/TIGER
SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
nttcslab-sp/mamba-diarization
Official repository for Mamba-based Segmentation Model for Speaker Diarization
Human9000/nd-Mamba2-torch
Implemented using only torch: "bi-mamba2" and "vision-mamba2-torch". Supports 1d/2d/3d/nd and export via jit.script/onnx.
nttcslab-sp-admin/mamba-diarization
lucidrains/minGRU-pytorch
Implementation of the proposed minGRU in PyTorch
yl4579/StyleTTS-ZS
StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion
FireRedTeam/FireRedTTS
An Open-Sourced LLM-empowered Foundation TTS System
tw93/Pake
🤱🏻 Turn any webpage into a desktop app with Rust. 🤱🏻 Easily build lightweight multi-platform desktop apps with Rust.
Ceelog/DictionaryByGPT4
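A vocabulary book generated by GPT4 📚, with analyses of more than 8,000 words covering meanings, example sentences, roots and affixes, inflections, cultural background, memory techniques, and short stories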
Lightning-AI/LitServe
Lightning-fast serving engine for any AI model of any size. Flexible. Easy. Enterprise-scale.
lovemefan/fsmn-vad
An enterprise-grade Voice Activity Detector from modelscope and funasr.
opendatalab/MinerU
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
hustvl/Vim
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
state-spaces/mamba
Mamba SSM architecture
dongrixinyu/jiojio
A convenient Chinese word segmentation tool
huggingface/diarizers
tango4j/llm_speaker_tagging
SLT 2024 Challenge: Post-ASR-Speaker-Tagging
xi-j/Mamba-TasNet
KindXiaoming/pykan
Kolmogorov-Arnold Networks
JusperLee/SPMamba
edahelsinki/slisemap
SLISEMAP: Combining supervised dimensionality reduction with local explanations
huggingface/parler-tts
Inference and training library for high-quality TTS models.
karpathy/llm.c
LLM training in simple, raw C/CUDA
jasonppy/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
unslothai/unsloth
Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory