iwaterxt

iwaterxt's Stars

ytdl-org/youtube-dl
Command-line program to download videos from YouTube.com and other video sites
Language: Python 133k 2.2k 26.7k 10.2k
LAION-AI/Open-Assistant
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
Language: Python 37.2k 433 1.6k 3.3k
ggerganov/whisper.cpp
Port of OpenAI's Whisper model in C/C++
Language: C++ 36.8k 319 1.4k 3.8k
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Language: Python 21.3k 213 3982.2k
triton-lang/triton
Development repository for the Triton language and compiler
Language: C++ 13.9k 198 1.6k 1.7k
ggerganov/ggml
Tensor library for machine learning
Language: C++ 11.5k 134 4311.1k
lucidrains/PaLM-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
Language: Python 7.7k 143 48671
InternLM/InternLM
Official release of InternLM2.5 base and chat models. 1M context support
Language: Python 6.6k 59 340466
google/automl
Google Brain AutoML
Language: Jupyter Notebook 6.3k 150 8871.5k
deepjavalibrary/djl
An Engine-Agnostic Deep Learning Framework in Java
Language: Java 4.2k 105 863674
collabora/WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
Language: Jupyter Notebook 4k 76 112221
k2-fsa/sherpa-onnx
Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
Language: C++ 4k 57 617473
NExT-GPT/NExT-GPT
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
Language: Python 3.4k 60 108344
yerfor/GeneFace
GeneFace: Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; Official code
Language: Python 2.6k 52 281298
haoheliu/AudioLDM2
Text-to-Audio/Music Generation
Language: Python 2.3k 44 72185
LCAV/pyroomacoustics
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.
Language: Python 1.5k 41 239435
Voine/ChatWaifu_Mobile
Mobile version of the anime-style AI waifu chat app
Language: C++ 1.3k 21 22137
k2-fsa/k2
FSA/FST algorithms, differentiable, with PyTorch compatibility.
Language: Cuda 1.1k 75 385217
sooftware/conformer
[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
Language: Python 983 9 37179
SpeechColab/GigaSpeech
Large, modern dataset for speech recognition
Language: Shell 654 19 6262
Zhendong-Wang/Diffusion-GAN
Official PyTorch implementation for paper: Diffusion-GAN: Training GANs with Diffusion
Language: Python 636 17 4368
facebookresearch/voxpopuli
A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation
Language: Python 518 18 2256
yl4579/StyleTTS
Official Implementation of StyleTTS
Language: Python 407 32 7565
microsoft/AEC-Challenge
AEC Challenge
392 29 23130
fjiang9/NKF-AEC
Acoustic Echo Cancellation with Neural Kalman Filtering
Language: HTML 260 10 2463
yl4579/PL-BERT
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions
Language: Python 222 14 4840
sarulab-speech/jtubespeech
Language: Python 215 10 846
echocatzh/MTFAA-Net
Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement
Language: Python 198 7 1258
jctian98/e2e_lfmmi
E2E system with LF-MMI; word N-gram for Mandarin
Language: Python 165 8 1745
rrbluke/NRES
Neural Residual Echo Suppressor
Language: Python 38 1 315