using pyrnn.gz in clstm
srika91 opened this issue · 9 comments
How can I use the pyrnn.gz models created in ocropy for prediction in clstm? clstm's prediction seems faster than ocropy's.
I don't think it's possible, since pyrnn and clstm use different model definitions:
https://github.com/tmbdev/clstm/blob/master/clstm.proto
https://github.com/mittagessen/kraken/blob/master/proto/pyrnn.proto
Maybe there's a way to convert between the two, but I wouldn't know how :/
I seem to remember that @tmbdev mentioned somewhere that one has to train the models for CLSTM again from the GT, i.e. they might not really be convertible.
Have not tried it but there is https://github.com/naptha/ocracy/blob/master/ocropy/pyrnn2clstm.py
That script converts to the old HDF5-based format, not the new Protobuf-based one, unfortunately :-/
I just had a look at two protobuf models from clstm and from kraken (the fraktur one, which was converted from a pyrnn model). It looks like the ocropy-model has more parameters/weights in the LSTM layers than the clstm-model: They share wci, wgi, wgf, wgo, but the ocropus model has wip, wfp, wop in addition.
I doubt that just putting the four matching weight matrices for each layer into a clstm protobuf file would work, since those weights were conditioned on different architectures, but I'd love to be proven wrong :-)
Also, iirc clstm uses a different line normalization algorithm than ocropus, i.e. for identical line images the two models were conditioned on different inputs, though I don't know how much the difference matters in practice.
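To make the weight-name comparison above concrete, here is a minimal sketch. The two sets are typed in by hand from the names listed in this thread; nothing is read from a real clstm or kraken model file, and the helper name compare_weight_names is made up for illustration.

```python
# Weight names as listed in this thread, typed in by hand.
# Assumption: these are illustrative only; no model file is parsed here.
CLSTM_LSTM_WEIGHTS = {"wci", "wgi", "wgf", "wgo"}
OCROPY_LSTM_WEIGHTS = {"wci", "wgi", "wgf", "wgo", "wip", "wfp", "wop"}


def compare_weight_names(ocropy_names, clstm_names):
    """Return which weight names are shared and which exist on one side only."""
    return {
        "shared": sorted(ocropy_names & clstm_names),
        "only_ocropy": sorted(ocropy_names - clstm_names),
        "only_clstm": sorted(clstm_names - ocropy_names),
    }


diff = compare_weight_names(OCROPY_LSTM_WEIGHTS, CLSTM_LSTM_WEIGHTS)
for key, names in diff.items():
    print(key, names)
```

Matching names alone say nothing about whether the values are interchangeable; the sketch only shows that three ocropy matrices have no counterpart on the clstm side.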
It looks like the ocropy-model has more parameters/weights in the LSTM layers than the clstm-model: They share wci, wgi, wgf, wgo, but the ocropus model has wip, wfp, wop in addition.
In clstm the peephole optimization code was dropped (see #17 (comment)). In ocropy it's still present.
They are for all intents and purposes completely different networks because of the peephole connections (so not really convertible). The code linked above only reserializes pickled pyrnn models into HDF5 or protobuf files, because those formats are vastly smaller (~1000 times without compression), faster to parse, and not an inherent security risk. An HDF5 or pronn model is still not a CLSTM model but an ocropy one with some benefits.
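A rough sketch of why the peephole connections matter, assuming the usual peephole LSTM formulation in which wip, wfp and wop let the input, forget and output gates see the cell state. This is a single-unit toy cell in plain Python with made-up weights, not the actual ocropy or clstm code; the function name lstm_step and all numbers are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(weights, values):
    return sum(w * v for w, v in zip(weights, values))


def lstm_step(x, h_prev, c_prev, w, peepholes=None):
    """One step of a single-unit LSTM cell.

    w maps "wgi", "wgf", "wgo", "wci" to weight vectors over [1, x, h_prev].
    peepholes optionally maps "wip", "wfp", "wop" to scalars; None means
    a cell without peephole connections.
    """
    p = peepholes or {"wip": 0.0, "wfp": 0.0, "wop": 0.0}
    source = [1.0, x, h_prev]
    gi = sigmoid(dot(w["wgi"], source) + p["wip"] * c_prev)  # input gate
    gf = sigmoid(dot(w["wgf"], source) + p["wfp"] * c_prev)  # forget gate
    ci = math.tanh(dot(w["wci"], source))                    # cell input
    c = ci * gi + gf * c_prev                                # new cell state
    go = sigmoid(dot(w["wgo"], source) + p["wop"] * c)       # output gate
    h = math.tanh(c) * go                                    # new output
    return h, c


# Made-up weights, only for illustration.
shared = {
    "wgi": [0.1, 0.5, -0.3],
    "wgf": [0.2, -0.4, 0.6],
    "wgo": [0.0, 0.3, 0.3],
    "wci": [-0.1, 0.7, 0.2],
}
peep = {"wip": 0.8, "wfp": -0.5, "wop": 0.3}

h_plain, c_plain = lstm_step(1.0, 0.2, 0.5, shared)
h_peep, c_peep = lstm_step(1.0, 0.2, 0.5, shared, peepholes=peep)
print("without peepholes:", h_plain, c_plain)
print("with peepholes:   ", h_peep, c_peep)
```

With identical shared weights, the two cells produce different outputs as soon as the cell state is non-zero, so copying only the four shared matrices would presumably not reproduce the behaviour of the original network.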
The line normalization and preprocessing is the same for both types of models.
The line normalization and preprocessing is the same for both types of models.
From ocropy README.md
CLSTM vs OCRopy
....
Python and C++ models can not be interchanged, both because the save file formats are different and because the text line normalization is slightly different.
The line image normalization is identical; the text line normalization is not. Ocropy normalizes output to NFKC(/D?), while clstm doesn't normalize output to any Unicode normalization form.
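For illustration, this is what Unicode normalization does to a transcription, using Python's standard unicodedata module. It is only a sketch of the normalization forms themselves; which form ocropy actually applies is left open above, and the helper name normalize_transcription and the sample string are made up.

```python
import unicodedata


def normalize_transcription(text, form="NFKC"):
    """Normalize a transcription to the given Unicode normalization form."""
    return unicodedata.normalize(form, text)


# Sample with a long s (U+017F) and an fi ligature (U+FB01),
# both common in Fraktur ground truth.
sample = "Wa\u017f\u017fer \ufb01nden"

print(repr(sample))
print(repr(normalize_transcription(sample, "NFKC")))  # compatibility forms folded
print(repr(normalize_transcription(sample, "NFC")))   # left unchanged
```

A model trained on NFKC-normalized text never sees the long s or the ligature as output classes, while a model trained on unnormalized text can emit them, so the codecs of the two models can differ even for the same ground truth.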
@jbaiter Sorry to reopen an old, closed issue, but I am currently working with kraken, especially this fraktur model, and I understand you worked on it too? Is it a dead end? I'm trying to see if it does a better job than Tesseract...
The output I get with kraken -i imagefilename.tif outputfilename.xml binarize segment ocr -a -m fraktur.pronn
on Ubuntu with Python 2.7.15 looks like it's in the wrong format...
Thanks in advance!