Pinned Repositories
Audiomer-PyTorch
A Convolutional Transformer for Keyword Spotting
bark
🔊 Text-Prompted Generative Audio Model
CosyVoice
LLM-based TTS model, providing full-stack inference, training, and deployment capabilities.
kaldi-python
Python wrappers for Kaldi data
keras-kaldi
Keras Interface for Kaldi ASR
KWS_Max-pooling_RHE
Mining effective negative training samples for keyword spotting (PyTorch)
RPN_KWS
Region proposal network based small-footprint keyword spotting (PyTorch)
RPN_KWS_OHEM
RPN KWS with online hard example mining algorithm
SenseVoice
Multilingual Voice Understanding Model
wekws
Production First and Production Ready End-to-End Keyword Spotting Toolkit
jingyonghou's Repositories
jingyonghou/RPN_KWS
Region proposal network based small-footprint keyword spotting (PyTorch)
jingyonghou/Audiomer-PyTorch
A Convolutional Transformer for Keyword Spotting
jingyonghou/bark
🔊 Text-Prompted Generative Audio Model
jingyonghou/CosyVoice
LLM-based TTS model, providing full-stack inference, training, and deployment capabilities.
jingyonghou/SenseVoice
Multilingual Voice Understanding Model
jingyonghou/ChatLaw
Chinese legal large language model
jingyonghou/chinese_speech_pretrain
Chinese speech pretrained models
jingyonghou/ChineseLyrics
Database of 100,000 Chinese song lyrics
jingyonghou/e2e_lfmmi
E2E system with LF-MMI; word N-gram for Mandarin
jingyonghou/ego2022
Joint Ego-Noise Suppression and Keyword Spotting on Sweeping Robots
jingyonghou/espeak-ng
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
jingyonghou/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
jingyonghou/FunASR
A Fundamental End-to-End Speech Recognition Toolkit
jingyonghou/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
jingyonghou/k2
FSA/FST algorithms, differentiable, with PyTorch compatibility.
jingyonghou/minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
jingyonghou/NeMo
NeMo: a toolkit for conversational AI
jingyonghou/phonemizer
Simple text to phones converter for multiple languages
jingyonghou/Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
jingyonghou/Qwen2.5
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
jingyonghou/speechbrain
A PyTorch-based Speech Toolkit
jingyonghou/THE-2020-PERSONALIZED-VOICE-TRIGGER-CHALLENGE-BASELINE-SYSTEM
jingyonghou/TNN
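TNN: a high-performance, lightweight inference framework for mobile devices, built by Tencent Youtu Lab. Its strengths include cross-platform support, high performance, model compression, and code pruning. Building on the Rapidnet and ncnn frameworks, TNN further strengthens mobile device support and performance optimization, and also draws on the high performance and good extensibility of mainstream open-source frameworks. TNN is already deployed in applications such as mobile QQ, Weishi, and Pitu. Contributions are welcome to help improve the TNN inference framework.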
jingyonghou/tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
jingyonghou/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
jingyonghou/wenet-kws
Production First and Production Ready End-to-End Keyword Spotting Toolkit
jingyonghou/wenet_trt8
jingyonghou/wetts
Production First and Production Ready End-to-End Text-to-Speech Toolkit
jingyonghou/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
jingyonghou/whisper.cpp
Port of OpenAI's Whisper model in C/C++