Pinned Repositories
altusi
The Arabic-Latin translations unified study interface
ArabicSOS
Segmenter and Orthography Standardizer (SOS) for Classical Arabic (CA)
Bleualign
Machine-Translation-based sentence alignment tool for parallel text
calamari
OCR Engine based on OCRopy and Kraken
kraken
Kraken fork using PyTorch and warp-ctc instead of clstm
latin-bert-huggingface
Tokenizer config files to integrate Latin BERT in 🤗 transformers
nashi
Some bits of JavaScript to transcribe scanned pages using PageXML
page2tei
Python snippets that might be useful for exporting transcribed pages from PAGE XML to TEI XML
pagedir2pagexml
Command line tool to integrate ocropus results and ground truth in PageXML files
pagexmllineseg
Some Python functions to put text lines in LAREX PageXML files
andbue's Repositories
andbue/nashi
Some bits of JavaScript to transcribe scanned pages using PageXML
andbue/kraken
Kraken fork using PyTorch and warp-ctc instead of clstm
andbue/pagedir2pagexml
Command line tool to integrate ocropus results and ground truth in PageXML files
andbue/latin-bert-huggingface
Tokenizer config files to integrate Latin BERT in 🤗 transformers
andbue/page2tei
Python snippets that might be useful for exporting transcribed pages from PAGE XML to TEI XML
andbue/pagexmllineseg
Some Python functions to put text lines in LAREX PageXML files
andbue/altusi
The Arabic-Latin translations unified study interface
andbue/ArabicSOS
Segmenter and Orthography Standardizer (SOS) for Classical Arabic (CA)
andbue/Bleualign
Machine-Translation-based sentence alignment tool for parallel text
andbue/calamari
OCR Engine based on OCRopy and Kraken
andbue/calamari_demo
Instructional materials for the calamari OCR engine
andbue/cltk
The Classical Language Toolkit
andbue/csmtiser
A tool for text normalisation via character-level machine translation
andbue/HTR-models-es
Handwritten Text Recognition models for different historical collections
andbue/LAREX
A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books.
andbue/LAREXjs
JS port of the semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books.
andbue/latinlp
Docker image for some Latin NLP tools
andbue/LEMLAT3
Morphological analyzer and lemmatizer for Latin.
andbue/morpheus
Morpheus parser
andbue/neuspell
NeuSpell: A Neural Spelling Correction Toolkit
andbue/norma
A tool for automatic spelling normalization
andbue/ors2bryton
Convert routes from openrouteservice for bryton devices
andbue/punctuation-restoration
Punctuation Restoration using Transformer Models for High- and Low-Resource Languages
andbue/pydelta
An experimental implementation of Burrows's Delta in Python 3
andbue/vdhd-2021-05-05
Demos for OCR-D presentation at OCR@vDHd
andbue/vscode-xml
XML Tools for Visual Studio Code
andbue/wordsxml