duyuankai1992

Pinned Repositories

-redis-
0 1 00
2019-CCF-BDCI-OCR-MCZJ-OCR-IdentificationIDElement
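Competition source code from team 天晨破晓 (Tianchen Poxiao), winner of the Best Innovation Exploration Award at the 2019 CCF-BDCI competition and champion of the OCR-based ID card element extraction track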
Language:Python0 1 00
3d-photo-inpainting
[CVPR 2020] 3D Photography using Context-aware Layered Depth Inpainting
Language:Python0 1 00
3D-Shape-Analysis-Paper-List
A list of recent papers, libraries and datasets about 3D shape/scene analysis (by topics, updating).
Language:Python0 1 00
Abstractive-Summarization-With-Transfer-Learning
Abstractive summarisation using BERT as encoder and a Transformer decoder
Language:Python0 1 00
acl2020-openqa-tutorial
ACL2020 Tutorial: Open-Domain Question Answering
0 1 00
AJAX
Language:Java0 1 00
Algorithm_Interview_Notes-Chinese
2018/2019/campus recruiting/spring recruiting/autumn recruiting/algorithms/Machine Learning/Deep Learning/Natural Language Processing (NLP)/C/C++/Python/interview notes
Language:Python0 1 00
OpenDiablo2
An open source re-implementation of Diablo 2
Language:Go1 1 00
Pytorch-UNet
PyTorch implementation of the U-Net for image semantic segmentation with high quality images
Language:Python1 1 00

duyuankai1992's Repositories

duyuankai1992/ASL-Recognizer
Action recognition application using models trained on WLASL dataset to translate ASL to English.
Language:Jupyter Notebook0 0
duyuankai1992/clip_text_span
Official implementation of "Interpreting CLIP's Image Representation via Text-Based Decomposition"
Language:Jupyter Notebook0 0
duyuankai1992/CogVideo
Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Language:Python0 0
duyuankai1992/DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Language:Python0 0
duyuankai1992/DesignEdit
Code for DesignEdit
Language:Python0 0
duyuankai1992/DiT-Visualization
Visualization of DiT self attention features
Language:Python0 0
duyuankai1992/free-programming-books-zh_CN
:books: Free Chinese-language computer programming books; contributions welcome
0 0
duyuankai1992/grok-1
Grok open release
Language:Python0 0
duyuankai1992/ImageAnalysisService
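Image analysis web service built on lightweight models, including OCR with skew correction, official seal (stamp) detection and recognition, and license plate recognition. The API uses FastAPI + Gunicorn, with a Gradio demo provided.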
Language:Python0 0
duyuankai1992/Latte
Latte: Latent Diffusion Transformer for Video Generation.
Language:Python0 0
duyuankai1992/Linly-Dubbing
Intelligent multilingual AI dubbing and translation tool for video - Linly-Dubbing: "AI-powered, language without borders"
Language:Jupyter Notebook0 0
duyuankai1992/Live
A collection of HD live-stream sources gathered from the internet.
0 0
duyuankai1992/LLMs-from-scratch
Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
Language:Jupyter Notebook0 0
duyuankai1992/LWM
Language:Python0 0
duyuankai1992/MagicDance
MagicDance: Realistic Human Dance Video Generation with Motions & Facial Expressions Transfer
Language:Python0 0
duyuankai1992/manga-image-translator
Translate manga/image: one-click translation of text in all kinds of images https://cotrans.touhou.ai/
Language:Python0 0
duyuankai1992/MiniGemini
Official implementation for Mini-Gemini
Language:Python0 0
duyuankai1992/OCR_MLLM_TOY
A multimodal large language model for OCR. OCR_MLLM
Language:Python1 0
duyuankai1992/OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Language:Python0 0
duyuankai1992/OOTDiffusion
Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
Language:Python0 0
duyuankai1992/Paints-UNDO
Understand Human Behavior to Align True Needs
Language:Python0 0
duyuankai1992/Paper-Piano
Piano like no other, Piano on Paper
Language:Python0 0
duyuankai1992/PDF-Extract-Kit
A Comprehensive Toolkit for High-Quality PDF Content Extraction
Language:Python0 0
duyuankai1992/RS_Scene_ZSL
PyTorch code for Deep Semantic-Visual Alignment for zero-shot remote sensing image scene classification
Language:Python0 0
duyuankai1992/SFDA-FSM
[MIA'22] Source-free domain adaptation for medical image segmentation with Fourier style mining
Language:Python0 0
duyuankai1992/ssvp_slt
Self-supervised video pretraining for sign language translation.
Language:Python0 0
duyuankai1992/stable-diffusion-webui
Stable Diffusion web UI
Language:Python0 0
duyuankai1992/UltraPixel
Implementation of UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks
Language:Python0 0
duyuankai1992/V-Express
V-Express aims to generate a talking head video under the control of a reference image, an audio, and a sequence of V-Kps images.
Language:Python0 0
duyuankai1992/video_features
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
Language:Python0 0