outman-goutian's Stars
GreatV/DocTrPP
DocTr++ in PaddlePaddle
sml2h3/ddddocr
带带弟弟 OCR: general-purpose CAPTCHA recognition, PyPI version
Audio-AGI/AudioSep
Official implementation of "Separate Anything You Describe"
JishengBai/ICME2024ASC
Baseline for IEEE ICME 2024 GC: Semi-supervised Acoustic Scene Classification under Domain Shift
tesseract-ocr/tesseract
Tesseract Open Source OCR Engine (main repository)
HowieHwong/TrustLLM
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
kinggongzilla/DCASE2023_Task2
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
open-mmlab/mmselfsup
OpenMMLab Self-Supervised Learning Toolbox and Benchmark
bitzhangcy/Deep-Learning-Based-Anomaly-Detection
FlappyPeggy/DMAD
Official code for "Diversity-Measurable Anomaly Detection", CVPR 2023 by Wenrui Liu, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen.
amusi/CVPR2024-Papers-with-Code
Collection of CVPR 2024 papers and open-source projects
DonaldRR/SimpleNet
chineseocr/chineseocr
yolo3+ocr
lovemefan/campplus
An open-source toolkit for single- and multi-modal speaker verification from ModelScope and FunASR, with ONNX
XingangPan/DragGAN
Official Code for DragGAN (SIGGRAPH 2023)
yeyupiaoling/AudioClassification-Pytorch
A PyTorch implementation of sound classification that supports EcapaTdnn, PANNS, TDNN, Res2Net, ResNetSE, and other models, as well as a variety of preprocessing methods.
TaoRuijie/ECAPA-TDNN
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when trained only on Vox2)
fighting41love/zhvoice
Chinese voice corpus with clearer, more natural speech; includes 8 open-source datasets, 3,200 speakers, 900 hours of speech, and 13 million characters.
open-mmlab/mmocr
OpenMMLab Text Detection, Recognition and Understanding Toolbox
baudm/parseq
Scene Text Recognition with Permuted Autoregressive Sequence Models (ECCV 2022)
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
lllyasviel/ControlNet
Let us control diffusion models!
oh-my-ocr/text_renderer
lllyasviel/Fooocus
Focus on prompting and generating
double22a/speech_dataset
Dataset for speech recognition
modelscope/3D-Speaker
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
MhLiao/DB
A PyTorch implementation of "Real-time Scene Text Detection with Differentiable Binarization".
wenet-e2e/wespeaker
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
TideDancer/interspeech21_emotion