Pinned Repositories
cor-asv-ann
OCR-D post-correction with encoder-attention-decoder LSTMs
nmalign
forced alignment of lists of strings by fuzzy string matching
ocrd_detectron2
OCR-D wrapper for detectron2 based segmentation models
ocrd_publaynet
convert PubLayNet data into METS/PAGE-XML
page_dewarp
Text page dewarping using a "cubic sheet" model
workflow-configuration
a makefilization for OCR-D workflows, with configuration examples
ocrd_cis
OCR-D python tools
ocrd_all
Master repository which includes most other OCR-D repositories as submodules
ocrd_keraslm
Simple character-based language model using keras
ocrd_tesserocr
Run tesseract with the tesserocr bindings with @OCR-D's interfaces
bertsky's Repositories
bertsky/Mask_RCNN
Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow
bertsky/dta-lexdb-applications
formatting and integrating the Deutsches Textarchiv dictionary into various applications
bertsky/ocrd-demo-2021-05-12
Demos for OCR-D presentation at OCR@vDHd
bertsky/ddb-metadata-schematron-validation
Schematron validations from the Fachstelle Bibliothek of the Deutsche Digitale Bibliothek
bertsky/dh-datenkompetenz2024-ocr
Slides and materials for a contribution to the DH lecture series (Ringvorlesung) in summer semester 2024 at TUD
bertsky/dta-tools
Tools used in the project "Deutsches Textarchiv"
bertsky/gt-repo-scripts
XSLT and shell scripts for analyzing and creating GitHub pages of a ground truth repository. These are centrally managed and can be used by all repositories created with gt-repo-template (https://github.com/OCR-D/gt-repo-template).
bertsky/gt_structure_all
bertsky/htr-united
Ground Truth Resources for the HTR of patrimonial documents
bertsky/kraken
OCR engine for all the languages
bertsky/mkn-test-gt
my DHd24 GT experience
bertsky/mygt
mydesc
bertsky/ocrd_keraslm
Simple character-based language model using keras
bertsky/ocrd_manager
bertsky/ocrd_monitor
Web frontend for ocrd_manager
bertsky/page2tsv
PAGE-XML to TSV
bertsky/sbb_images
Annotation Tool and Image Search
bertsky/sbb_knowledge-base
Wikidata + Wikipedia Knowledge-Base Extraction for EL-purposes
bertsky/sbb_ned
Named Entity Disambiguation and Linking
bertsky/sbb_ner
Named Entity Recognition
bertsky/sbb_ocr_postcorrection
Two-Step Approach to OCR Post-Correction
bertsky/sbb_textline_detection
Detect textlines in document images
bertsky/sbb_tools
Digitalized Collections of the Berlin State Library: ALTO-XML Processing Tools / batch NER + EL / BERT-pre-training
bertsky/sbb_topic-modelling
Topic Modelling
bertsky/sbb_utils
shared functionality
bertsky/sbb_web-integration
Visualization of NER+EL+Topic Modelling + Image-Search
bertsky/tessdoc
Tesseract documentation
bertsky/tesseract
Tesseract Open Source OCR Engine (main repository)
bertsky/tesserocr
A Python wrapper for the tesseract-ocr API
bertsky/tesstrain
Train Tesseract LSTM with make