hijupiter's Stars
NVlabs/SegFormer
Official PyTorch implementation of SegFormer
CSAILVision/ADE20K
ADE20K Dataset
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
duxiangcheng/SAEN
Modeling Stroke Mask for End-to-End Text Erasing
rohitgandikota/erasing
Erasing Concepts from Diffusion Models
yeungchenwa/Recommendations-Diffusion-Text-Image
A paper collection of recent diffusion models for text-image generation tasks, e.g., visual text generation, font generation, text removal, text image super resolution, text editing, handwritten generation, scene text recognition and scene text detection.
yeungchenwa/OCR-SAM
Combining MMOCR with Segment Anything & Stable Diffusion. Automatically detect, recognize and segment text instances, with several downstream tasks, e.g., Text Removal and Text Inpainting
CandleLabAI/TPFNet
rezazad68/BCDUnet_DIBCO
Document Image Binarization, DIBCO Challenges
ajgallego/document-image-binarization
A selectional auto-encoder approach for document image binarization
qurator-spk/eynollah
Document Layout Analysis
datawhalechina/leedl-tutorial
"Hung-yi Lee Deep Learning Tutorial" (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
DA-southampton/TRM_tutorial
A from-scratch walkthrough of Transformer variants in CV and NLP: Transformer; ViT; Swin Transformer
RisabBiswas/T2T-BinFormer
SOTA Document Image Enhancement - T2T-BinFormer: Effective Document Image Enhancement Using tokens-to-token Transformer Network
dali92002/DocEnTR
DocEnTr: An end-to-end document image enhancement transformer - ICPR 2022
phamquiluan/jdeskew
ICIP 2022: Adaptive Radial Projection on Fourier Magnitude Spectrum for Document Image Skew Estimation
deepdoctection/deepdoctection
A Repo For Document AI
attendfov/chinese-layoutlm-v2
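A multimodal language model for Chinese document understanding, supporting multimodal document information extraction and document embedding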
facebookresearch/detectron2
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Helen-Cheung/Baidu-AI-Challenge-Scene-Text-Removal
NielsRogge/Transformers-Tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
AlibabaResearch/AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Zasder3/train-CLIP
A PyTorch Lightning solution to training OpenAI's CLIP from scratch.
rmokady/CLIP_prefix_caption
Simple image captioning model
RapidAI/RapidLaTeXOCR
Formula recognition based on LaTeX-OCR and ONNXRuntime.
biswassanket/DocSegTr
A Bottom-Up Instance Segmentation Strategy for segmenting document instances using Transformers
FeiGeChuanShu/DocTr-ncnn
ncnn demo of (document rectification) DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction
philschmid/document-ai-transformers
shabie/docformer
Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task of Visual Document Understanding (VDU)