yu20103983

yu20103983's Stars

openai/CLIP
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Language: Jupyter Notebook · 25k stars · 3.2k forks
google-research/text-to-text-transfer-transformer
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
Language: Python · 6.1k stars · 753 forks
zoubohao/DenoisingDiffusionProbabilityModel-ddpm-
This may be the simplest implementation of DDPM. You can run Main.py directly to train the UNet on the CIFAR-10 dataset and watch the amazing denoising process.
Language: Python · 1.5k stars · 159 forks
lucidrains/DALLE2-pytorch
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
Language: Python · 11.1k stars · 1.1k forks
borisdayma/dalle-mini
DALL·E Mini - Generate images from a text prompt
Language: Python · 14.8k stars · 1.2k forks
lucidrains/imagen-pytorch
Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch
Language: Python · 8k stars · 760 forks
THUDM/CogVideo
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Language: Python · 7.8k stars · 731 forks
lzhbrian/image-to-image-papers
🦓<->🦒 🌃<->🌆 A collection of image to image papers with code (constantly updating)
1.1k stars · 156 forks
yuanxiaosc/DeepImage-an-Image-to-Image-technology
Research on the theory and practice of DeepNude's algorithm and general generative adversarial network (GAN) image generation, including pix2pix, CycleGAN, UGATIT, DCGAN, SinGAN, ALAE, mGANprior, StarGAN-v2, and VAE models (TensorFlow 2 implementation).
Language: Python · 5.2k stars · 2k forks
Yutong-Zhou-cv/Awesome-Text-to-Image
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
2.1k stars · 187 forks
CompVis/stable-diffusion
A latent text-to-image diffusion model
Language: Jupyter Notebook · 67.8k stars · 10.1k forks
facebookresearch/ConvNeXt
Code release for ConvNeXt model
Language: Python · 5.7k stars · 694 forks
bethgelab/imagecorruptions
Python package to corrupt arbitrary images.
Language: Python · 407 stars · 64 forks
amusi/CVPR2024-Papers-with-Code
A collection of CVPR 2024 papers and open-source projects
17.9k stars · 2.6k forks
amusi/ICCV2023-Papers-with-Code
A collection of ICCV 2023 papers and open-source projects
2.5k stars · 250 forks
szad670401/end-to-end-for-chinese-plate-recognition
Multi-label classification: end-to-end Chinese license plate recognition based on MXNet
Language: Python · 1.1k stars · 522 forks
dee1024/pytorch-captcha-recognition
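An "end-to-end" CAPTCHA recognition model trained with a CNN, using deep learning, training data, and substantial compute. Recognition accuracy reaches 99.99% for digit-only CAPTCHAs and 96% for digits plus letters.
Language: Python · 1.1k stars · 314 forks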
SpikeKing/CRAFT-Re-reimplementation
Training of the CRAFT algorithm
Language: Python · 91
Megvii-BaseDetection/YOLOX
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
Language: Python · 9.4k stars · 2.2k forks
HCIILAB/Scene-Text-Detection
535 stars · 130 forks
mzolfaghari/ECO-efficient-video-understanding
Code and models for the paper "ECO: Efficient Convolutional Network for Online Video Understanding", ECCV 2018
Language: Jupyter Notebook · 436 stars · 96 forks
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
Language: Python · 4.1k stars · 1.1k forks
nobody132/masr
Mandarin Automatic Speech Recognition
Language: Python · 1.9k stars · 480 forks
xxbb1234021/speech_recognition
Chinese speech recognition
Language: Python · 796 stars · 299 forks
zzw922cn/Automatic_Speech_Recognition
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
Language: Python · 2.8k stars · 538 forks
SeanNaren/deepspeech.pytorch
Speech Recognition using DeepSpeech2.
Language: Python · 2.1k stars · 620 forks
pannous/tensorflow-speech-recognition
🎙 Speech recognition using the TensorFlow deep learning framework and sequence-to-sequence neural networks
Language: Python · 2.2k stars · 639 forks
nl8590687/ASRT_SpeechRecognition
A Deep-Learning-Based Chinese Speech Recognition System
Language: Python · 7.8k stars · 1.9k forks
Uberi/speech_recognition
Speech recognition module for Python, supporting several engines and APIs, online and offline.
Language: Python · 8.4k stars · 2.4k forks
open-mmlab/mmsegmentation
OpenMMLab Semantic Segmentation Toolbox and Benchmark.
Language: Python · 8k stars · 2.6k forks