jingyonghou

Keyword spotting, Query-by-Example, Speech Recognition and Neural Network

Pinned Repositories

Audiomer-PyTorch
A Convolutional Transformer for Keyword Spotting
Language:Python3 1 07
bark
🔊 Text-Prompted Generative Audio Model
Language:Jupyter Notebook10
CosyVoice
LLM based TTS model, providing inference/training/deployment full-stack ability.
Language:Python1 0 00
kaldi-python
Python wrappers for Kaldi data
Language:C++1 2 01
keras-kaldi
Keras Interface for Kaldi ASR
Language:Python4 2 00
KWS_Max-pooling_RHE
Mining effective negative training samples for keyword spotting (PyTorch)
Language:Python58 3 19
RPN_KWS
Region proposal network based small-footprint keyword spotting (PyTorch)
Language:Python53 3 316
RPN_KWS_OHEM
RPN KWS with online hard example mining algorithm
1 2 00
SenseVoice
Multilingual Voice Understanding Model
Language:Python1 0 00
wekws
Production First and Production Ready End-to-End Keyword Spotting Toolkit
Language:Python485 15 76114

jingyonghou's Repositories

jingyonghou/RPN_KWS
Region proposal network based small-footprint keyword spotting (PyTorch)
Language:Python53 3 316
jingyonghou/Audiomer-PyTorch
A Convolutional Transformer for Keyword Spotting
Language:Python3 1 07
jingyonghou/bark
🔊 Text-Prompted Generative Audio Model
Language:Jupyter Notebook10
jingyonghou/CosyVoice
LLM based TTS model, providing inference/training/deployment full-stack ability.
Language:Python1 0 00
jingyonghou/SenseVoice
Multilingual Voice Understanding Model
Language:Python1 0 00
jingyonghou/ChatLaw
Chinese legal large language model
0 0 00
jingyonghou/chinese_speech_pretrain
chinese speech pretrained models
Language:Shell0 1 00
jingyonghou/ChineseLyrics
Database of 100,000 Chinese song lyrics
0 0
jingyonghou/e2e_lfmmi
E2E system with LF-MMI; word N-gram for Mandarin
Language:Python1 0
jingyonghou/ego2022
JOINT EGO-NOISE SUPPRESSION AND KEYWORD SPOTTING ON SWEEPING ROBOTS
Language:MATLAB1 0
jingyonghou/espeak-ng
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
Language:C0 0
jingyonghou/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Language:Python1 0
jingyonghou/FunASR
A Fundamental End-to-End Speech Recognition Toolkit
Language:Python1 0
jingyonghou/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Language:Python1 0
jingyonghou/k2
FSA/FST algorithms, differentiable, with PyTorch compatibility.
Language:Cuda1 0
jingyonghou/minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
Language:Python0 0
jingyonghou/NeMo
NeMo: a toolkit for conversational AI
Language:Jupyter Notebook1 0
jingyonghou/phonemizer
Simple text to phones converter for multiple languages
Language:Python0 0
jingyonghou/Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
Language:Python0 0
jingyonghou/Qwen2.5
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
jingyonghou/speechbrain
A PyTorch-based Speech Toolkit
Language:Python1 0
jingyonghou/THE-2020-PERSONALIZED-VOICE-TRIGGER-CHALLENGE-BASELINE-SYSTEM
Language:Shell1 01
jingyonghou/TNN
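TNN: a high-performance, lightweight inference framework for mobile devices, built by Tencent Youtu Lab. Its main strengths include cross-platform support, high performance, model compression, and code pruning. Building on the earlier Rapidnet and ncnn frameworks, TNN further strengthens support and performance optimization for mobile devices, and also borrows the high performance and good extensibility of mainstream open-source frameworks in the industry. TNN is already deployed in applications such as mobile QQ, Weishi, and Pitu. Contributions are welcome to help further improve the TNN inference framework.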
Language:C++1 01
jingyonghou/tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
jingyonghou/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
Language:Python1 0
jingyonghou/wenet-kws
Production First and Production Ready End-to-End Keyword Spotting Toolkit
Language:Python1 0
jingyonghou/wenet_trt8
Language:Python1 0
jingyonghou/wetts
Production First and Production Ready End-to-End Text-to-Speech Toolkit
Language:Python1 0
jingyonghou/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
Language:Jupyter Notebook1 0
jingyonghou/whisper.cpp
Port of OpenAI's Whisper model in C/C++
Language:C1 0