Pinned Repositories
alto-tools
Python script for performing various operations on ALTO XML files
AST-PST-Tablerecognizer
Recognize and extract AST-PST-Tables
BackgroundSubtractor4OCR
BackgroundSubtractor4OCR
german-newspapers-ocr-model
This repository contains models for historical newspapers.
GTMake
Create git-repo-based GT line pairs with ease
ocr-model-catalogue
This repository contains a collection of layout analysis and text recognition models.
ocr-model-repo-template
A template for creating an OCR model repo with various functions and features, such as metadata creation and presentation.
PagePlus
This script processes PAGE XML files, a format widely used in document layout analysis, to perform various operations like validating, repairing, extending, and modifying text regions and lines.
scrape-editorial-board
Scraping editorial board of journals
tesseractXplore
tesseractXplore: an easy-to-use Tesseract GUI with full control
JKamlah's Repositories
JKamlah/german-newspapers-ocr-model
This repository contains models for historical newspapers.
JKamlah/ocr-model-metadata
Metadata tool for OCR Models
JKamlah/PagePlus
This script processes PAGE XML files, a format widely used in document layout analysis, to perform various operations like validating, repairing, extending, and modifying text regions and lines.
JKamlah/ocr-model-catalogue
This repository contains a collection of layout analysis and text recognition models.
JKamlah/ocr-model-repo-template
A template for creating an OCR model repo with various functions and features, such as metadata creation and presentation.
JKamlah/alto-tools
Python script for performing various operations on ALTO XML files
JKamlah/awesome-RDM
A curated list of awesome RDM resources for researchers and organisations
JKamlah/dfg-viewer
The DFG Viewer is a free web service for browsing digitized books from remote library repositories in a rich and dynamic environment.
JKamlah/digi-gt
Ground truth for the digitized historic collections of UB Mannheim
JKamlah/eScriptorium
eScriptorium (https://gitlab.com/scripta/escriptorium)
JKamlah/frat
Fast Rectangle Annotation Tool
JKamlah/german-print-ocr-model
This repository contains a generic model for historical and modern German prints.
JKamlah/GTReval
GTReval - Evaluates and re-evaluates
JKamlah/historical-reports-2col-ocr-model
This repository contains a segmentation model for historical and modern reports and other prints with two column layout.
JKamlah/kitodo-presentation
Kitodo.Presentation Community Edition
JKamlah/kitodo-presentation-docker
Docker configuration for Kitodo.Presentation
JKamlah/kraken
OCR engine for all the languages
JKamlah/LTCDP-DigitalVolumes
This repository contains data about the volumes for the long-term company data portal.
JKamlah/MBI-KG
Documentation for the project "Maschinen-Industrie"
JKamlah/Multi-Type-TD-TSR
Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition:
JKamlah/ocr-model-repo-scripts
The scripts help create a GitHub page for a repository based on ocr-model-repo-template. They are a modified version of the scripts from https://github.com/tboenig/gt-repo-scripts
JKamlah/OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
JKamlah/PagePlus-transkribus-utils
A Python package providing some utility functions for interacting with the Transkribus API. Based on:
JKamlah/PAGETools
Small collection of PAGE XML related scripts used at the ZPD Würzburg
JKamlah/ReproResearch
ReproResearch: A ready-to-use repository for FAIR reproducible research (data) projects
JKamlah/tesseract
Tesseract Open Source OCR Engine (main repository)
JKamlah/tesserocr
A Python wrapper for the tesseract-ocr API
JKamlah/ubma-segmentation-ocr-model
This repository contains a segmentation model for historical and modern prints.
JKamlah/whisply
Transcribe, diarize, annotate and subtitle audio and video with Whisper ... fast!
JKamlah/zelda3