rakeshvar/chamanti_ocr_theano

Parallelize scribing and training

Closed this issue · 1 comments

The two main calls in the training loop are:

x, y = scriber()
cst, pred, forward_probs = ntwk.trainer(x, y_blanked)

The scriber could spawn a process or a thread to generate the next sample in the background and immediately return an (x, y) pair that was already generated by a previous worker. It is worth trying both multithreading and multiprocessing to see which performs better.
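A minimal sketch of the threaded variant of this idea, assuming `scriber` is a zero-argument callable returning an (x, y) pair (this is an illustrative wrapper, not the actual parscribe.py implementation):

```python
import queue
import threading

def make_parallel_scriber(scriber, depth=2):
    """Wrap `scriber` so samples are generated by a background thread.

    Hypothetical sketch: a daemon thread keeps a small queue of
    pre-generated (x, y) pairs, so the training loop never waits
    for scribing (as long as scribing keeps up with training).
    """
    q = queue.Queue(maxsize=depth)

    def worker():
        while True:
            q.put(scriber())  # blocks when the queue is already full

    threading.Thread(target=worker, daemon=True).start()
    return q.get  # calling this pops the next pre-generated (x, y)

# Usage: replace `x, y = scriber()` in the loop with
#   par_scriber = make_parallel_scriber(scriber)
#   ...
#   x, y = par_scriber()
```

Note that with CPython's GIL, a thread only helps if `scriber` releases the GIL (e.g. heavy NumPy work); otherwise a `multiprocessing`-based worker is the analogous alternative.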

Implemented this in parscribe.py, but there was no improvement in performance. Optimizing the function get_word() helped a lot more.