Pinned Repositories
-redis-
2019-CCF-BDCI-OCR-MCZJ-OCR-IdentificationIDElement
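2019 CCF-BDCI competition: winner of the Best Innovation Exploration Award and champion of the OCR-based ID card element extraction task. Competition source code from team 天晨破晓.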
3d-photo-inpainting
[CVPR 2020] 3D Photography using Context-aware Layered Depth Inpainting
3D-Shape-Analysis-Paper-List
A list of recent papers, libraries and datasets about 3D shape/scene analysis (by topics, updating).
Abstractive-Summarization-With-Transfer-Learning
Abstractive summarisation using BERT as the encoder and a Transformer decoder
acl2020-openqa-tutorial
ACL2020 Tutorial: Open-Domain Question Answering
AJAX
Algorithm_Interview_Notes-Chinese
2018/2019/campus recruiting/spring recruiting/autumn recruiting/algorithms/machine learning/deep learning/natural language processing (NLP)/C/C++/Python/interview notes
OpenDiablo2
An open source re-implementation of Diablo 2
Pytorch-UNet
PyTorch implementation of the U-Net for image semantic segmentation with high quality images
duyuankai1992's Repositories
duyuankai1992/ASL-Recognizer
Action recognition application using models trained on WLASL dataset to translate ASL to English.
duyuankai1992/BXC_VideoAnalyzer_v4
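A video behavior analysis system (v4) developed in C++. Without having to handle audio/video development, codec development, or UI development, you only need to train your own model and write your own algorithm plugin to implement any video behavior detection you want, such as perimeter intrusion, smoke and fire detection, fighting, brawling, falls, crowd gathering, electric bikes, trash bins, smoking, climbing, leaving or sleeping at a post, safety helmets, charging piles, work uniforms, fatigue detection, traffic congestion, and so on.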
duyuankai1992/clip_text_span
Official implementation of "Interpreting CLIP's Image Representation via Text-Based Decomposition"
duyuankai1992/DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
duyuankai1992/DesignEdit
Code for DesignEdit
duyuankai1992/DiT-Visualization
Visualization of DiT self attention features
duyuankai1992/free-programming-books-zh_CN
:books: Free Chinese-language computer programming books; contributions welcome
duyuankai1992/grok-1
Grok open release
duyuankai1992/hello-algo
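"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code, supporting Python, C++, Java, C#, Go, Swift, JS, TS, Dart, Rust, C, Zig, and other languages. English edition in progress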
duyuankai1992/ImageAnalysisService
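An image analysis web service built on lightweight models, including OCR with skew correction, official seal (stamp) detection and recognition, and license plate recognition. The API uses FastAPI + Gunicorn, and a Gradio demo is provided.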
duyuankai1992/imagen-pytorch
Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch
duyuankai1992/InstantID
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
duyuankai1992/IntelliQ
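Advanced Multi-Turn QA System with LLM and Intent Recognition. Implements multi-turn question answering and NL2API using LLM-based intent recognition and parameter extraction combined with slot filling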
duyuankai1992/LaTeX-OCR
pix2tex: Using a ViT to convert images of equations into LaTeX code.
duyuankai1992/Latte
Latte: Latent Diffusion Transformer for Video Generation.
duyuankai1992/Live
A collection of HD live-stream sources gathered from the internet.
duyuankai1992/LWM
duyuankai1992/MagicDance
MagicDance: Realistic Human Dance Video Generation with Motions & Facial Expressions Transfer
duyuankai1992/manga-image-translator
Translate manga/image: one-click translation of text in all kinds of images https://cotrans.touhou.ai/
duyuankai1992/MiniGemini
Official implementation for Mini-Gemini
duyuankai1992/my-tv
My TV: live TV streaming software, ready to use once installed
duyuankai1992/OCR_MLLM_TOY
A multimodal large language model for OCR. OCR_MLLM
duyuankai1992/OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
duyuankai1992/OOTDiffusion
Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
duyuankai1992/Paper-Piano
Piano like no other, Piano on Paper
duyuankai1992/RS_Scene_ZSL
PyTorch code for Deep Semantic-Visual Alignment for zero-shot remote sensing image scene classification
duyuankai1992/sdxl-lightning-demo-app
A demo application using fal.realtime and the lightning fast SDXL API provided by fal
duyuankai1992/SFDA-FSM
[MIA '22] Source-free domain adaptation for medical image segmentation with Fourier style mining
duyuankai1992/stable-diffusion-webui
Stable Diffusion web UI
duyuankai1992/surya
Accurate line-level text detection and recognition (OCR) in any language