Broughton_digital

Repository for my project "Creating a workflow to turn the printed Broughton (1952) into a graph database".

Primary language: HTML. License: MIT.

1. Preprocessing

2. OCR - Tesseract (gImageReader)

https://github.com/manisandro/gImageReader

Accuracy: 95.78% (168 of 3,984 tokens needed correction)
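The accuracy figure above can be reproduced with a few lines of Python. This is only a minimal sketch: it assumes that accuracy means the share of tokens that needed no manual correction, and the function name `token_accuracy` is hypothetical, not part of this repository.

```python
def token_accuracy(total_tokens: int, corrected_tokens: int) -> float:
    """Return the percentage of tokens that needed no correction.

    Assumption: accuracy = (total - corrected) / total, expressed in percent.
    """
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    if not 0 <= corrected_tokens <= total_tokens:
        raise ValueError("corrected_tokens must be between 0 and total_tokens")
    return 100.0 * (total_tokens - corrected_tokens) / total_tokens


# Figures reported for the OCR step: 3,984 tokens, 168 of them corrected.
accuracy = token_accuracy(total_tokens=3984, corrected_tokens=168)
print(f"{accuracy:.2f}%")  # prints 95.78%
```

Counting whole tokens rather than characters keeps the check simple, but it treats a token with one wrong letter the same as a completely misread one.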

3. Processing