yu20103983's Stars
openai/CLIP
CLIP (Contrastive Language-Image Pretraining): predicts the most relevant text snippet given an image
google-research/text-to-text-transfer-transformer
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
zoubohao/DenoisingDiffusionProbabilityModel-ddpm-
This may be the simplest implementation of DDPM. You can run Main.py directly to train the UNet on the CIFAR-10 dataset and see the amazing process of denoising.
lucidrains/DALLE2-pytorch
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch
borisdayma/dalle-mini
DALL·E Mini - Generate images from a text prompt
lucidrains/imagen-pytorch
Implementation of Imagen, Google's Text-to-Image Neural Network, in PyTorch
THUDM/CogVideo
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
lzhbrian/image-to-image-papers
🦓<->🦒 🌃<->🌆 A collection of image to image papers with code (constantly updating)
yuanxiaosc/DeepImage-an-Image-to-Image-technology
Research on the theory and practice of DeepNude's algorithm and general GAN (Generative Adversarial Network) image generation, including pix2pix, CycleGAN, UGATIT, DCGAN, SinGAN, ALAE, mGANprior, StarGAN-v2 and VAE models (TensorFlow2 implementation).
Yutong-Zhou-cv/Awesome-Text-to-Image
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
CompVis/stable-diffusion
A latent text-to-image diffusion model
facebookresearch/ConvNeXt
Code release for ConvNeXt model
bethgelab/imagecorruptions
Python package to corrupt arbitrary images.
amusi/CVPR2024-Papers-with-Code
A collection of CVPR 2024 papers and open-source projects
amusi/ICCV2023-Papers-with-Code
A collection of ICCV 2023 papers and open-source projects
szad670401/end-to-end-for-chinese-plate-recognition
Multi-label classification: end-to-end Chinese license plate recognition based on MXNet
dee1024/pytorch-captcha-recognition
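An "end-to-end" CAPTCHA recognition model trained with a CNN, using deep learning, training data, and substantial compute; recognition accuracy reaches 99.99% on digit-only CAPTCHAs and 96% on digits plus letters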
SpikeKing/CRAFT-Re-reimplementation
Training for the CRAFT algorithm
Megvii-BaseDetection/YOLOX
YOLOX is a high-performance anchor-free YOLO that exceeds YOLOv3 through YOLOv5, with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
HCIILAB/Scene-Text-Detection
mzolfaghari/ECO-efficient-video-understanding
Code and models for the paper "ECO: Efficient Convolutional Network for Online Video Understanding", ECCV 2018
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
nobody132/masr
Mandarin Automatic Speech Recognition
xxbb1234021/speech_recognition
Chinese speech recognition
zzw922cn/Automatic_Speech_Recognition
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
SeanNaren/deepspeech.pytorch
Speech Recognition using DeepSpeech2.
pannous/tensorflow-speech-recognition
🎙 Speech recognition using the TensorFlow deep learning framework, sequence-to-sequence neural networks
nl8590687/ASRT_SpeechRecognition
A Deep-Learning-Based Chinese Speech Recognition System
Uberi/speech_recognition
Speech recognition module for Python, supporting several engines and APIs, online and offline.
open-mmlab/mmsegmentation
OpenMMLab Semantic Segmentation Toolbox and Benchmark.