outman-goutian's Stars

GreatV/DocTrPP
DocTr++ in PaddlePaddle
Language:Python404
sml2h3/ddddocr
带带弟弟 (ddddocr): general-purpose CAPTCHA recognition OCR, PyPI version
Language: Python, 9.7k stars, 1.7k forks
Audio-AGI/AudioSep
Official implementation of "Separate Anything You Describe"
Language: Python, 1.6k stars, 115 forks
JishengBai/ICME2024ASC
Baseline for IEEE ICME 2024 GC: Semi-supervised Acoustic Scene Classification under Domain Shift
Language:Python142
tesseract-ocr/tesseract
Tesseract Open Source OCR Engine (main repository)
Language: C++, 61.3k stars, 9.4k forks
HowieHwong/TrustLLM
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
Language:Python43840
kinggongzilla/DCASE2023_Task2
Language:Python173
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Language: Python, 34.3k stars, 4.2k forks
open-mmlab/mmselfsup
OpenMMLab Self-Supervised Learning Toolbox and Benchmark
Language: Python, 3.2k stars, 428 forks
bitzhangcy/Deep-Learning-Based-Anomaly-Detection
22422
FlappyPeggy/DMAD
Official code for "Diversity-Measurable Anomaly Detection", CVPR 2023 by Wenrui Liu, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen.
Language:Python615
amusi/CVPR2024-Papers-with-Code
Collection of CVPR 2024 papers and open-source projects
17.9k stars, 2.6k forks
DonaldRR/SimpleNet
Language:Python41962
chineseocr/chineseocr
yolo3+ocr
Language: Python, 5.9k stars, 1.7k forks
lovemefan/campplus
An open-source toolkit for single- and multi-modal speaker verification from ModelScope and FunASR with ONNX
Language:Python7
XingangPan/DragGAN
Official Code for DragGAN (SIGGRAPH 2023)
Language: Python, 35.7k stars, 3.4k forks
yeyupiaoling/AudioClassification-Pytorch
A PyTorch implementation of sound classification that supports EcapaTdnn, PANNS, TDNN, Res2Net, ResNetSE, and other models, as well as a variety of preprocessing methods.
Language:Python38979
TaoRuijie/ECAPA-TDNN
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when trained only on Vox2)
Language:Python591112
fighting41love/zhvoice
Chinese voice corpus with clearer, more natural speech, comprising 8 open-source datasets, 3,200 speakers, 900 hours of speech, and 13 million characters.
580114
open-mmlab/mmocr
OpenMMLab Text Detection, Recognition and Understanding Toolbox
Language: Python, 4.3k stars, 745 forks
baudm/parseq
Scene Text Recognition with Permuted Autoregressive Sequence Models (ECCV 2022)
Language:Python565126
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Language: Python, 4.1k stars, 402 forks
lllyasviel/ControlNet
Let us control diffusion models!
Language: Python, 29.9k stars, 2.7k forks
oh-my-ocr/text_renderer
Language:Python776162
lllyasviel/Fooocus
Focus on prompting and generating
Language: Python, 40.5k stars, 5.7k forks
double22a/speech_dataset
The dataset of Speech Recognition
38372
modelscope/3D-Speaker
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
Language: Python, 1.1k stars, 95 forks
MhLiao/DB
A PyTorch implementation of "Real-time Scene Text Detection with Differentiable Binarization".
Language: Python, 2.1k stars, 478 forks
wenet-e2e/wespeaker
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Language:Python686116
TideDancer/interspeech21_emotion
Language:Python9419