/umaru

An OCR-system based on torch using the technique of LSTM/GRU-RNN, CTC and referred to the works of rnnlib and clstm.

Primary LanguageLuaOtherNOASSERTION

umaru

An OCR-system based on torch using the technique of LSTM/GRU-RNN, CTC and referred to the works of rnnlib and clstm.

Notice

This work is now completely UNSTABLE, EXPERIMENTAL and UNDER DEVELOPMENT.

Dependencies

Build

$ ./build.sh

Usage

General

  • You could modify the settings in the main.lua directly and execute th main.lua, the input format is clstm-like (.png and .gt.txt pair) and you should put all input file path in a text file.
  • or if you prefer to use a JSON-format configuration file, you could follow the example below, and run:
$ th main.lua -setting [setting file]

Run Folder

There would be a folder created in the experments folder for every experiment. You could check out the log, settings and saved models there.

Example Configuration File

descriptions for each option could be found in main.lua.

{
  "project_name": "uy_rbm_noised",
  "raw_input": false,
  "hidden_size": 200,
  "nthread": 3,
  "clamp_size": 1,
  "ctc_lua": false,
  "recurrent_unit": "gru",
  "test_every": 2000,
  "omp_threads": 1,
  "show_every": 10,
  "testing_list_file": "wwr.txt",
  "input_size": 48,
  "testing_ratio": 1,
  "max_param_norm": false,
  "training_list_file": "full-train.txt",
  "feature_size": 240,
  "momentum": 0.9,
  "dropout_rate": 0.5,
  "max_iter": 10000000000,
  "save_every": 10000,
  "learning_rate": 0.0001,
  "stride": 5,
  "gpu": false,
  "rbm_network_file": "rbm/wwr.rbm",
  "windows_size": 10
}

LICENSE

BSD 3-Clause License

References