outman-goutian's Stars
SpeechColab/Leaderboard
SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.
Audio-WestlakeU/audiossl
A library built for easier audio self-supervised training and downstream task evaluation
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
zui0711/Z-Lab
A collection of open-source code from the Z Lab data laboratory
wenet-e2e/speech-synthesis-paper
List of speech synthesis papers.
ddlBoJack/Speech-Resources
Speech-related labs, companies, resources, internships, and more; recommendations and self-recommendations are welcome
VikParuchuri/marker
Convert PDF to markdown quickly with high accuracy
snakers4/silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
GeneralNewsExtractor/GeneralNewsExtractor
General-purpose extractor for the main text of news web pages (Beta).
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
bubbliiiing/Siamese-pytorch
A Siamese network library that can compare the similarity of images.
lksshw/SRNet
A pytorch implementation of the SRNet architecture from the paper Editing text in the wild (Liang Wu et al.)
fupinglee/Calculate_Captcha
Arithmetic CAPTCHA generator, intended for training use
MgArcher/Text_select_captcha
Recognition of text click-selection, character-selection, selection, and touch-click CAPTCHAs, trained with PyTorch
Boris-code/feapder
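🚀🚀🚀feapder is an easy-to-use, powerful Python crawler framework. It has four built-in spiders (AirSpider, Spider, TaskSpider, BatchSpider) for different scenarios, and supports resumable crawling, monitoring and alerting, browser rendering, and large-scale data deduplication. The companion crawler management system feaplat provides convenient deployment and scheduling.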
declare-lab/tango
A family of diffusion models for text-to-audio generation.
Jungjee/INTERSPEECH2023_T6
Advances in audio anti-spoofing and deepfake detection using graph neural networks and self-supervised learning
nii-yamagishilab/project-NN-Pytorch-scripts
see README
asvspoof-challenge/2021
ASVspoof 2021 Baseline Systems
Amey-Thakur/DEEPFAKE-AUDIO
An audio deepfake is when a “cloned” voice that is potentially indistinguishable from the real person’s is used to produce synthetic audio.
dessa-oss/fake-voice-detection
Using temporal convolution to detect Audio Deepfakes
SummerColdWind/NoPaddleOnnxPredictor
A text recognition inference tool for PaddleOCR models converted to ONNX, with no dependency on paddlepaddle
PaddlePaddle/Paddle2ONNX
ONNX Model Exporter for PaddlePaddle
MarkHershey/AudioDeepFakeDetection
SUTD 50.039 Deep Learning Course Project (2022 Spring)
Python3WebSpider/DeepLearningSlideCaptcha2
Deep Learning Image Captcha 2
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
nickliqian/generate_click_captcha
Generates images resembling click-select CAPTCHAs.
Sanster/text_renderer
Generate text images for training deep learning ocr model