wangmengzhi

Pinned Repositories

30dayMakeOS
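Chinese edition of the source code for the book 《30天自制操作系统》 (Build Your Own Operating System in 30 Days): the process of building an operating system (OSASK) yourself.
Language:C | 0 1 00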
500lines
500 Lines or Less
Language:JavaScript | 0 1 00
acapellabot
Acapella Extraction with a ConvNet
Language:Python | 0 1 00
AI_Composer
AI Composer for Machine Learning for Hackers #2
Language:Python | 00
Alibaba-MIT-Speech
Alibaba speech technology
0 1 00
caffe_ocr
An experimental project for researching mainstream OCR algorithms; it currently implements the CNN+BLSTM+CTC architecture.
Language:C++ | 1 2 00
CAT
A CRF-based ASR Toolkit
Language:Shell | 1 2 00
Lightweight-Transducer
Official implementation of the Interspeech 2024 paper "Lightweight Transducer Based on Frame Level Criterion".
Language:Python | 7 3 02
VAD
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
Language:MATLAB | 1 2 00
vadnet
Real-time Voice Activity Detection in Noisy Environments using Deep Neural Networks
Language:Python | 10

wangmengzhi's Repositories

wangmengzhi/Lightweight-Transducer
Official implementation of the Interspeech 2024 paper "Lightweight Transducer Based on Frame Level Criterion".
Language:Python | 7 3 02
wangmengzhi/CAT
A CRF-based ASR Toolkit
Language:Shell | 1 2 00
wangmengzhi/VAD
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
Language:MATLAB | 1 2 00
wangmengzhi/vadnet
Real-time Voice Activity Detection in Noisy Environments using Deep Neural Networks
Language:Python | 10
wangmengzhi/30dayMakeOS
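Chinese edition of the source code for the book 《30天自制操作系统》 (Build Your Own Operating System in 30 Days): the process of building an operating system (OSASK) yourself.
Language:C | 0 1 00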
wangmengzhi/AlphaZero_Gomoku
An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)
Language:Python | 0 1 00
wangmengzhi/ConvBert
wangmengzhi/DaCiDian
DaCiDian is an open-source Mandarin Chinese lexicon for automatic speech recognition (ASR)
Language:Python
wangmengzhi/deep-learning
My own deep learning project
Language:HTML | 1 0
wangmengzhi/denoising_DIHARD18
Language:Python | 1 0
wangmengzhi/DWMBlurGlass
Add custom effect to global system title bar, support win10 and win11.
Language:C++ | 0 0
wangmengzhi/eesen
The official repository of the Eesen project
Language:C++
wangmengzhi/espnet
End-to-End Speech Processing Toolkit
Language:Shell | 1 0
wangmengzhi/models-1
Model configurations
Language:Python | 1 0
wangmengzhi/my_ch_speech_recognition
Speech recognition in Python: a deep-learning-based Chinese speech recognition system
Language:Python | 1 0
wangmengzhi/py_speech_seg
A toolkit to implement segmentation on speech based on BIC
Language:Python | 1 0
wangmengzhi/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, speaker embedding
Language:Python | 1 0
wangmengzhi/pyAudioAnalysis
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
Language:Python
wangmengzhi/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Language:Python
wangmengzhi/serving
A flexible, high-performance serving system for machine learning models
Language:C++
wangmengzhi/SpectralCluster
Python re-implementation of the spectral clustering algorithm in the paper "Speaker Diarization with LSTM"
Language:Python
wangmengzhi/tacotron
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
Language:Python | 1 0
wangmengzhi/tensorpack
A Neural Net Training Interface on TensorFlow, with focus on speed + flexibility
wangmengzhi/torchscale
Foundation Architecture for (M)LLMs
Language:Python | 0 0
wangmengzhi/TranslucentFlyouts
Translucent effect for most of the win32 flyouts
Language:C++ | 0 0
wangmengzhi/uis-rnn
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
Language:Python
wangmengzhi/voicefilter
Unofficial PyTorch implementation of Google AI's VoiceFilter system
Language:Python | 1 0
wangmengzhi/warp-ctc
Fast parallel CTC.
Language:Cuda | 1 0
wangmengzhi/warp-transducer
A fast parallel implementation of RNN Transducer.
wangmengzhi/zhihu
This repo contains the source code from my personal column (https://zhuanlan.zhihu.com/zhaoyeyu), implemented in Python 3.6. It covers Natural Language Processing and Computer Vision projects such as text generation, machine translation, and deep convolutional GANs, along with other hands-on code.