layout-analysis

There are 54 repositories under the layout-analysis topic.

  • opendatalab/MinerU

    Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

    Language: Python
  • bytedance/Dolphin

    The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.

    Language: Python
  • Layout-Parser/layout-parser

    A Unified Toolkit for Deep Learning Based Document Image Analysis

    Language: Python
  • breezedeus/Pix2Text

    An Open-Source Python3 tool with SMALL models for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported.

    Language: Jupyter Notebook
  • UglyToad/PdfPig

    Read and extract text and other content from PDFs in C# (port of PDFBox)

    Language: C#
  • kotaro-kinoshita/yomitoku

    YomiToku is a Python package that provides an AI-powered document image analysis engine designed specifically for the Japanese language.

    Language: Python
  • mittagessen/kraken

    OCR engine for all the languages

    Language: Python
  • BobLd/DocumentLayoutAnalysis

    Document layout analysis resources repository for development with PdfPig.

    Language: C#
  • mindspore-lab/mindocr

    A toolbox of OCR models and algorithms based on MindSpore

    Language: Python
  • RapidAI/RapidLayout

    Analysis of Chinese and English layouts

    Language: Python
  • RapidAI/RapidDocEx

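    📝 Content extraction for document images: reproduces a document image one-to-one in Word or Txt output for further use or processing. Planned support for PDF/image input with output in the corresponding JSON, Txt, Word, and Markdown formats.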

    Language: Python
  • ppaanngggg/yolo-doclaynet

    YOLO models trained on DocLayNet: power your document intelligence with layout analysis

    Language: Python
  • andreagemelli/doc2graph

    Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.

    Language: Jupyter Notebook
  • xushengfeng/eSearch-OCR

    A Node.js library based on PaddleOCR

    Language: TypeScript
  • NormXU/Layout2Graph

    An official implementation of the paper "Paragraph2Graph: A Language-independent GNN-based framework for layout analysis"

    Language: Python
  • CycloneBoy/pdf_table

    A Unified Toolkit for Deep Learning-Based Table Extraction

    Language: Python
  • JPLeoRX/detectron2-publaynet

    Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset

    Language: Python
  • MaitySubhajit/SelfDocSeg

    [ICDAR 2023] SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation (Oral)

    Language: Python
  • empressabyss/nordrassil

    A keyboard layout that provides an elegant and balanced typing experience by its use of a thumb-alpha, emphasis on middle fingers, deprioritisation of pinkies, and arcane keys.

  • dell-research-harvard/HJDataset

    A Large Dataset of Historical Japanese Documents with Complex Layouts

    Language: Jupyter Notebook
  • BobLd/PdfPigMLNetBlockClassifier

    Proof of concept of training a simple region classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block in a PDF document page as title, text, list, table, or image.

    Language: C#
  • CaseDrive/publaynet-models

    Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset

    Language: Python
  • jiangnanboy/layout_analysis4j

    Uses java-yolov8 to detect the layout of Chinese document images (Chinese layout detection)

    Language: Java
  • MBAigner/PDFSegmenter

    This library builds a graph representation of the content of PDFs. The graph is then clustered, and the resulting page segments are classified and returned. Tables are returned formatted as CSV.

    Language: Python
  • aidayang/MinerU-OneClick

    An install-free, one-click-launch bundle for deploying MinerU

  • VRI-UFPR/ocrd-gbn

    OCR-D compliant toolset for optical layout recognition on historical German-language documents published in Brazil

    Language: Python
  • pleb631/pdfLayoutDet

    pdfDet aims to simplify PDF layout detection tasks for users.

    Language: Python
  • qyhou/curated-document-layout-analysis

    A curated list of resources on Document Layout Analysis

  • calfa-co/rasam-dataset

    Open Dataset for the Recognition and Analysis of Scripts in Arabic Maghrebi (ICDAR 2021, CHR 2024)

  • yoshihikoueno/pdfminer-layout-scanner

    A more complete example of programming with PDFMiner, which continues where the default documentation stops

    Language: Python
  • privateai-com/docviz

    Advanced document contents extraction with multiple output formats

    Language: Python
  • VRI-UFPR/page-xml-draw

    A powerful CLI tool for visualization and encoding of PAGE-XML files

    Language: Python
  • os-climate/crrf-det

    A web application for PDF content and table extraction, featuring image-based visual layout analysis, indexed document search, batch processing and extraction result annotation.

    Language: C++
  • engkimo/bullseye

    BullsEye is a Japanese Document AI system for production‑grade OCR, layout analysis, table structure recognition, reading order estimation, and LLM‑powered understanding. It exposes Unified Doc JSON with CLI/REST APIs and integrates bullseye‑compatible providers (Apache‑2.0).

    Language: Python
  • Magnet-AI/Quanta

    Advanced PDF layout analysis engine for extracting figures, tables, and structured content from complex engineering documents using computer vision and machine learning.

    Language: Python
  • rithulkamesh/docproc

    Opinionated and Sophisticated Document Region Analyzer.

    Language: Python