layout-analysis
There are 54 repositories under the layout-analysis topic.
opendatalab/MinerU
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
bytedance/Dolphin
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
Layout-Parser/layout-parser
A Unified Toolkit for Deep Learning Based Document Image Analysis
breezedeus/Pix2Text
An Open-Source Python3 tool with SMALL models for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported.
UglyToad/PdfPig
Read and extract text and other content from PDFs in C# (port of PDFBox)
kotaro-kinoshita/yomitoku
YomiToku is a Python package that provides an AI-powered document image analysis engine designed specifically for the Japanese language.
mittagessen/kraken
OCR engine for all the languages
BobLd/DocumentLayoutAnalysis
Document layout analysis resources for development with PdfPig.
mindspore-lab/mindocr
A toolbox of OCR models and algorithms based on MindSpore
RapidAI/RapidLayout
Layout analysis for Chinese and English documents
RapidAI/RapidDocEx
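📝 Extracts content from document images and reproduces each image one-to-one as Word or Txt output for further use or processing. Planned future support: PDF/image input, with output in the corresponding JSON, Txt, Word, and Markdown formats.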
ppaanngggg/yolo-doclaynet
YOLO models trained on DocLayNet: power your document intelligence with layout analysis
andreagemelli/doc2graph
Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.
xushengfeng/eSearch-OCR
A Node.js library based on PaddleOCR
NormXU/Layout2Graph
An official implementation of the paper "Paragraph2Graph: A Language-independent GNN-based framework for layout analysis"
CycloneBoy/pdf_table
A Unified Toolkit for Deep Learning-Based Table Extraction
JPLeoRX/detectron2-publaynet
Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset
MaitySubhajit/SelfDocSeg
[ICDAR 2023] SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation (Oral)
empressabyss/nordrassil
A keyboard layout that provides an elegant and balanced typing experience by its use of a thumb-alpha, emphasis on middle fingers, deprioritisation of pinkies, and arcane keys.
dell-research-harvard/HJDataset
A Large Dataset of Historical Japanese Documents with Complex Layouts
BobLd/PdfPigMLNetBlockClassifier
Proof of concept of training a simple Region Classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block in a PDF document page as one of title, text, list, table, or image.
CaseDrive/publaynet-models
Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset
jiangnanboy/layout_analysis4j
Layout detection for Chinese document images, implemented with java-yolov8
MBAigner/PDFSegmenter
This library builds a graph representation of the content of PDFs. The graph is then clustered, and the resulting page segments are classified and returned. Tables are returned formatted as CSV.
aidayang/MinerU-OneClick
An install-free, one-click-launch bundle for deploying MinerU
VRI-UFPR/ocrd-gbn
OCR-D compliant toolset for optical layout recognition on historical German-language documents published in Brazil
pleb631/pdfLayoutDet
pdfDet aims to simplify PDF layout detection tasks for users.
qyhou/curated-document-layout-analysis
A curated list of resources on Document Layout Analysis
calfa-co/rasam-dataset
Open Dataset for the Recognition and Analysis of Scripts in Arabic Maghrebi (ICDAR 2021, CHR 2024)
yoshihikoueno/pdfminer-layout-scanner
A more complete example of programming with PDFMiner, which continues where the default documentation stops
privateai-com/docviz
Advanced document contents extraction with multiple output formats
VRI-UFPR/page-xml-draw
A powerful CLI tool for visualization and encoding of PAGE-XML files
os-climate/crrf-det
A web application for PDF content and table extraction, featuring image-based visual layout analysis, indexed document search, batch processing and extraction result annotation.
engkimo/bullseye
BullsEye is a Japanese Document AI system for production‑grade OCR, layout analysis, table structure recognition, reading order estimation, and LLM‑powered understanding. It exposes Unified Doc JSON with CLI/REST APIs and integrates bullseye‑compatible providers (Apache‑2.0).
Magnet-AI/Quanta
Advanced PDF layout analysis engine for extracting figures, tables, and structured content from complex engineering documents using computer vision and machine learning.
rithulkamesh/docproc
Opinionated and Sophisticated Document Region Analyzer.