Pinned Repositories
aidatatang_200zh
Aidatatang_200zh is an open source Chinese Mandarin speech corpus released by DataTang Technology Co., Ltd (www.datatang.com).
anjos2015
Some of my own algorithms from 2015
cn2an
📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)
deform-conv
Deformable Convolution in TensorFlow / Keras
focal-loss
Focal loss for Dense Object Detection
ICTCLAS
Java client for the Chinese Academy of Sciences word segmenter; supports x64 and x86 on Windows and Linux.
MLAlgorithms
Minimal and clean examples of machine learning algorithms
whisper-plus
WhisperPlus: Advancing Speech-to-Text Processing 🚀
whaozl's Repositories
whaozl/cn2an
📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)
whaozl/whisper-plus
WhisperPlus: Advancing Speech-to-Text Processing 🚀
whaozl/AhoCorasickDoubleArrayTrie
An extremely fast implementation of Aho Corasick algorithm based on Double Array Trie.
whaozl/ASR-decoder
An ASR decoder and graph-building project
whaozl/CapsWriter-Offline
CapsWriter: a bare-bones but handy offline version, a voice input tool for PC
whaozl/commonvoice-th
Kaldi recipe to train commonvoice corpus in Thai language
whaozl/github1s
One second to read GitHub code with VS Code.
whaozl/icefall
whaozl/javacpp
The missing bridge between Java and native C++
whaozl/jiwer
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
whaozl/k2-v2.0-pre-branch-HLG
FSA/FST algorithms, differentiable, with PyTorch compatibility.
whaozl/kaldi-native-fbank
Kaldi-compatible online fbank extractor without external dependencies
whaozl/Leaderboard
SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.
whaozl/LLaSM
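The first open-source, commercially usable dialogue model to support bilingual Chinese-English speech-text multimodal conversation. Convenient speech input greatly improves the experience of using large models that take text as input, while avoiding the cumbersome pipeline of ASR-based solutions and the errors they may introduce.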
whaozl/NPTEL2020-Indian-English-Speech-Dataset
NPTEL2020: Speech2Text dataset for Indian-English Accent
whaozl/pororo
PORORO: Platform Of neuRal mOdels for natuRal language prOcessing
whaozl/Python-Wrapper-for-World-Vocoder
A Python wrapper for the high-quality vocoder "World"
whaozl/Recorder
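HTML5 JavaScript audio recording in mp3, wav, ogg, webm, and amr formats; supports PC and some Android and iOS browsers, Hybrid Apps (Android and iOS app source code provided), and WeChat; provides ASR speech-to-text, an H5 voice call and chat example, and DTMF encoding and decoding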
whaozl/riva-asrlib-decoder
Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva
whaozl/sherpa
Speech-to-text server framework with next-gen Kaldi
whaozl/sherpa-onnx
Real-time speech recognition using next-gen Kaldi with onnxruntime without Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, x86_64 servers, websocket server/client, C/C++, Python, Kotlin
whaozl/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
whaozl/speech_dataset
The dataset of Speech Recognition
whaozl/SpeechT5
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
whaozl/TMSpeech
A slacking-off tool for Tencent Meeting
whaozl/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.
whaozl/vosk-api
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
whaozl/wenet_trt8
whaozl/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
whaozl/Whisper-Finetune
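Fine-tune the Whisper speech recognition model, with support for training on data without timestamps, data with timestamps, and data without speech. Accelerated inference, with support for Web, Windows desktop, and Android deployment.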