ignite720's Stars
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
CorentinJ/Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
gradio-app/gradio
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
TabbyML/tabby
Self-hosted AI coding assistant
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
microsoft/onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
kaldi-asr/kaldi
kaldi-asr/kaldi is the official location of the Kaldi project.
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
huggingface/transformers.js
State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
SubtitleEdit/subtitleedit
the subtitle editor :)
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Morizeyao/GPT2-Chinese
Chinese version of GPT2 training code, using BERT tokenizer.
librosa/librosa
Python library for audio and music analysis
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
tyiannak/pyAudioAnalysis
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
luau-lang/luau
A fast, small, safe, gradually typed embeddable scripting language derived from Lua
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
LlamaEdge/LlamaEdge
The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge
segment-any-text/wtpsplit
Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
KoljaB/LocalAIVoiceChat
Local AI voice chat with a custom voice, based on the Zephyr 7B model. Uses RealtimeSTT with faster_whisper for transcription and RealtimeTTS with Coqui XTTS for synthesis.
Xirider/finetune-gpt2xl
Guide: Finetune GPT2-XL (1.5 billion parameters) and GPT-Neo (2.7 billion parameters) on a single GPU with Hugging Face Transformers using DeepSpeed
oliverguhr/wav2vec2-live
Live speech recognition using Facebook's wav2vec 2.0 model.
Perlmint/glew-cmake
GLEW (https://github.com/nigels-com/glew, source updated nightly) with CMake and pre-generated sources
eastonYi/wav2vec
A simplified version of wav2vec (1.0, vq, 2.0) in fairseq
bmx-ng/bmx-ng
The Open Source BlitzMax Compiler Project
MatijaNovosel/montage
🎬 A clip editor made with Tauri.
Recordscript/recordscript
Cross-platform screen recorder with transcription and subtitles. Built with Tauri & Whisper-rs (Rust port of whisper.cpp)
drakang4/jamak
A subtitle editor built with Electron, React and Redux.