Pinned Repositories
gt-repo-template
A template for creating a ground truth repo with various functions and features, such as metadata creation, data analysis, and presentation.
keyboardGT
A selection of keyboards for transcription software (Aletheia, Transkribus, LAREX, QURATOR-neat, eScriptorium)
AletheiaTools
AletheiaTools is a collection of tools for transforming file formats (PAGE XML) and metadata formats (METS). It is a kind of Ground Truth Swiss Army knife ;-)
choco-mufin
Tools for normalizing the use of some characters and checking file consistency
digi-gt
Ground truth for the digitized historic collections of UB Mannheim
gt-fraktur
gt-guidelines
OCR-D guidelines for Ground Truth production
gt_corpus_benchmark
This repo provides a collection of ground truth data. The collection was compiled according to different criteria (layout complexity and font usage). Each item is also described by metadata, which follows the labeling scheme of OCR-D/PrimaLab.
page2page
This repository contains the stylesheet and workaround for transforming the proprietary PAGE XML files from Transkribus (https://transkribus.eu/Transkribus) into valid PAGE XML (https://www.primaresearch.org/schema/PAGE/gts/pagecontent/, newest version from 2019-07-16).
page2tei
tboenig's Repositories
tboenig/page2page
This repository contains the stylesheet and workaround for transforming the proprietary PAGE XML files from Transkribus (https://transkribus.eu/Transkribus) into valid PAGE XML (https://www.primaresearch.org/schema/PAGE/gts/pagecontent/, newest version from 2019-07-16).
tboenig/digi-gt
Ground truth for the digitized historic collections of UB Mannheim
tboenig/gt_corpus_benchmark
This repo provides a collection of ground truth data. The collection was compiled according to different criteria (layout complexity and font usage). Each item is also described by metadata, which follows the labeling scheme of OCR-D/PrimaLab.
tboenig/gt-guideline-examples
tboenig/gt_structure_test
tboenig/01_Ground-Truth_Tagebuecher_Edwin_Hennig
tboenig/16_ant_complex
This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.
tboenig/16_ant_simple
This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.
tboenig/16_frak_complex
This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.
tboenig/16_frak_simple
This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.
tboenig/17_fontmix_simple
This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.
tboenig/17_frak_complex
This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.
tboenig/17_frak_simple
This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.
tboenig/18_ant_simple
This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.
tboenig/18_fontmix_complex
This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.
tboenig/18_frak_simple
This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.
tboenig/19_ant_simple
This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.
tboenig/AustrianNewspapers
NewsEye / READ OCR training dataset from Austrian Newspapers
tboenig/DTGT
Ground truth for theological publications
tboenig/fraktionsprotokolle_web
Source files of the edited meeting minutes for the XML edition »Fraktionen im Deutschen Bundestag 1949-2005« and its website fraktionsprotokolle.de
tboenig/google-ocr-testbed
A repository containing scripts and output for Google OCR of sample files.
tboenig/GT-commentaries-layout
Ground truth data for Page Layout Analysis of Historical Classical Commentaries.
tboenig/iiif-producer
A CLI tool that generates IIIF Presentation 2.1 Manifests from METS/MODS
tboenig/labelme
Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).
tboenig/ocrd-website
tboenig/page-xml-draw
A powerful CLI tool for visualization and encoding of PAGE-XML files
tboenig/pageattrlib
XSLT Library to work with PAGE XML custom attributes
tboenig/pyautogui
A cross-platform GUI automation Python module for human beings. Used to programmatically control the mouse & keyboard.
tboenig/python_essentials_2_lab_solutions
Python code created for Cisco Networking Academy's Python Essentials 2 course.
tboenig/stabi-berlin-gt
Ground truth for digitized publications of Staatsbibliothek zu Berlin