document-understanding
There are 43 repositories under the document-understanding topic.
infiniflow/ragflow
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
deepdoctection/deepdoctection
A Repo For Document AI
X-PLUG/mPLUG-DocOwl
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
AlibabaResearch/AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
tstanislawek/awesome-document-understanding
A curated list of resources on the topic of Document Understanding (DU)
OpenBMB/VisRAG
Parsing-free RAG supported by VLMs
wenwenyu/PICK-pytorch
Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)
jpWang/LiLT
Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)
GoogleCloudPlatform/document-ai-samples
Sample applications and demos for Document AI, the end-to-end document processing platform on Google Cloud
MathamPollard/awesome-table-structure-recognition
A curated list of awesome Table Structure Recognition (TSR) research, including models, papers, datasets, and code. Continuously updated.
SCUT-DLVCLab/Document-AI-Recommendations
Algorithms, papers, datasets, performance comparisons for Document AI. Continuously updating.
huggingface/chug
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
Alpha-Innovator/DocGenome
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models
andreagemelli/doc2graph
Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.
doc-analysis/ReadingBank
ReadingBank: A Benchmark Dataset for Reading Order Detection
LynnHaDo/Document-Layout-Analysis
Object Detection Model for Scanned Documents
LynnHaDo/Checkbox-Detection
Checkbox Detection Model for Scanned Documents
microsoft/CompHRDoc
Datasets and Evaluation Scripts for CompHRDoc
ZeningLin/PEneo
[MM'2024] PEneo, an effective algorithm for key-value pair extraction from form-like documents, designed for real-world applications.
NExTplusplus/TAT-DQA
TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning
SCUT-DLVCLab/RFUND
[MM'2024] Official release of RFUND introduced in the MM'2024 paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction"
uakarsh/TiLT-Implementation
Implementation of the paper: Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer.
docling-project/docling4j
Docling4j brings the functionalities of Docling in document understanding to Java® projects
jacobmarks/pytesseract-ocr-plugin
Run optical character recognition with PyTesseract from the FiftyOne App!
javier-marti-isasi/OCR-free-Document-Understanding-with-Donut-Transformer
This project tackles a real-world challenge of automating client document processing, with a focus on enhancing document classification, error detection, data extraction, and validation.
bwnyasse/dart-documentai-samples
A hands-on CLI tool sample showcasing the integration of Dart with Google Cloud's DocumentAI.
callbacked/smoldocling256M-webgpu
Document Understanding in the Browser!
dhorvay/document-understanding-ebook
(WIP) ✨ A comprehensive resource for understanding the world of software used in the Document Understanding field. 🧙✨
irgroup/labelstudio-to-fonduer
This small module connects Label Studio with Fonduer by creating a fonduer labeling function for gold labels from a label studio export. Documentation: https://irgroup.github.io/labelstudio-to-fonduer/
Haruhiyuki/yuque-rag
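Connects a Yuque knowledge base to large language models to build an intelligent question-answering system based on RAG (retrieval-augmented generation). Supports FastAPI and is compatible with the OpenAI API and local Ollama models.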
ExtrieveTechnologies/QuickCapture_IOS
QuickCapture Mobile Scanning SDK, specially designed for native iOS
marcel-lamott/SlimDoc
Official implementation for "SlimDoc: Lightweight Distillation of Document Transformer Models," published in the International Journal on Document Analysis and Recognition (IJDAR), 2025
PAIR-Systems-Inc/little-dorrit-editor
Multimodal benchmark for evaluating handwritten editorial correction in printed text.
kariiimadelll/CV-Extractor-UiPath-Automation-Project
A UiPath bot that reads all CVs (PDF files) from a folder, extracts key candidate information, and writes the results into an Excel file for easy review and analysis.
phong-lt/LiGT_VQA
This repository includes the ReceiptVQA dataset and the PyTorch implementation of the LiGT method and other evaluated baselines.