outman-goutian

outman-goutian's Stars

SpeechColab/Leaderboard
SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.
Language:Python42960
Audio-WestlakeU/audiossl
A library built for easier audio self-supervised training, downstream tasks evaluation
Language:Python9810
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Language: Python · 6.2k · 657
zui0711/Z-Lab
A collection of open-source code from the Z Lab data lab
Language:Jupyter Notebook19784
wenet-e2e/speech-synthesis-paper
List of speech synthesis papers.
992120
ddlBoJack/Speech-Resources
Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome
49263
VikParuchuri/marker
Convert PDF to markdown quickly with high accuracy
Language: Python · 16.8k · 953
snakers4/silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
Language: Jupyter Notebook · 4.9k · 307
GeneralNewsExtractor/GeneralNewsExtractor
General-purpose extractor for the main text of news web pages (beta).
Language: Python · 3.6k · 527
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Language: Python · 11.7k · 2.4k
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Language: Python · 5.2k · 536
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
Language: Python · 2.8k · 265
bubbliiiing/Siamese-pytorch
A Siamese network library for comparing the similarity of images.
Language:Python573127
lksshw/SRNet
A pytorch implementation of the SRNet architecture from the paper Editing text in the wild (Liang Wu et al.)
Language:C++15135
fupinglee/Calculate_Captcha
Generator for arithmetic CAPTCHAs, intended for producing training data
Language:Java106
MgArcher/Text_select_captcha
Recognition of text-click, character-selection, and touch CAPTCHAs, trained with PyTorch
Language: Python · 1.3k · 407
Boris-code/feapder
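🚀🚀🚀 feapder is an easy-to-use, powerful Python crawler framework. It includes four built-in spider types (AirSpider, Spider, TaskSpider, BatchSpider) for different scenarios, and supports resumable crawling, monitoring and alerting, browser rendering, and large-scale deduplication. The feaplat crawler management system provides convenient deployment and scheduling.
Language: Python · 2.9k · 479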
declare-lab/tango
A family of diffusion models for text-to-audio generation.
Language:Python99179
Jungjee/INTERSPEECH2023_T6
Advances in audio anti-spoofing and deepfake detection using graph neural networks and self-supervised learning
Language:Jupyter Notebook221
nii-yamagishilab/project-NN-Pytorch-scripts
see README
Language:Python32149
asvspoof-challenge/2021
ASVspoof 2021 Baseline Systems
Language:Python19775
Amey-Thakur/DEEPFAKE-AUDIO
An audio deepfake is when a “cloned” voice that is potentially indistinguishable from the real person’s is used to produce synthetic audio.
Language:Python5511
dessa-oss/fake-voice-detection
Using temporal convolution to detect Audio Deepfakes
Language:Python34786
SummerColdWind/NoPaddleOnnxPredictor
A text-recognition inference tool for PaddleOCR models converted to ONNX, with no dependency on paddlepaddle
Language:Python71
PaddlePaddle/Paddle2ONNX
ONNX Model Exporter for PaddlePaddle
Language:Python717166
MarkHershey/AudioDeepFakeDetection
SUTD 50.039 Deep Learning Course Project (2022 Spring)
Language:Python7117
Python3WebSpider/DeepLearningSlideCaptcha2
Deep Learning Image Captcha 2
Language:Python17068
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Language: Python · 13.6k · 1.1k
nickliqian/generate_click_captcha
Generates images that resemble click-selection CAPTCHAs.
Language:Python216
Sanster/text_renderer
Generate text images for training deep learning ocr model
Language: Python · 1.4k · 383