tesseract-ocr/tessdata

size of eng.traineddata best/fast/...

jbarth-ubhd opened this issue · 1 comment

I'm a bit surprised that the current eng.traineddata in this repository is larger than the one in tessdata_best:

4be3f51b55c0074d8c6b1ee5b5100f95  tessdata_best/eng.traineddata  [15400601]
57e0df3d84fed9fbf8c7a8e589f8f012  tessdata/eng.traineddata       [23466654]
d1be414fbb296b3ad777bfca655e194e  tessdata_fast/eng.traineddata  [ 4113088]
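
For anyone who wants to reproduce a listing like the one above, here is a minimal Python sketch that prints the MD5 digest and byte size of each file. The directory layout (three sibling checkouts named tessdata_best, tessdata and tessdata_fast) and the helper names `md5_and_size` and `report` are assumptions for illustration only, not part of any Tesseract tooling.

```python
import hashlib
from pathlib import Path


def md5_and_size(path, chunk_size=1 << 20):
    """Return (hex MD5 digest, size in bytes) for the file at `path`."""
    digest = hashlib.md5()
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so a large model file is not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest(), size


def report(paths):
    """Return one formatted line (digest, path, [size]) per file that exists."""
    lines = []
    for p in paths:
        if Path(p).is_file():
            md5, size = md5_and_size(p)
            lines.append(f"{md5}  {p}  [{size:>8}]")
    return lines


if __name__ == "__main__":
    # Assumed local checkout layout; adjust the paths to your own setup.
    for line in report([
        "tessdata_best/eng.traineddata",
        "tessdata/eng.traineddata",
        "tessdata_fast/eng.traineddata",
    ]):
        print(line)
```

Files that are missing are skipped silently, so the script can be run even if only some of the repositories are checked out.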

The models in tessdata include data not only for the LSTM recognizer but also for the legacy OCR engine. In particular, they include two dictionaries.