bertsky/ocrd_detectron2

make more efficient


We currently only use Detectron2's DefaultPredictor for inference:

self.predictor = DefaultPredictor(cfg)

But the documentation says:

This is meant for simple demo purposes, so it does the above steps automatically. This is not meant for benchmarks or running complicated inference logic. If you’d like to do anything more complicated, please refer to its source code as examples to build and use the model manually

GPU utilization is visibly low, so a multi-threaded implementation with data pipelining should improve throughput considerably.

The first attempt in predict-async does not actually reduce wall time (it only reduces CPU seconds a bit). Perhaps we must first disentangle the page loop, i.e. turn it into a pipeline.
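To illustrate what turning the page loop into a pipeline could look like, here is a minimal sketch using only the standard library. It is not the code from predict-async: `run_pipeline` and the `load`, `predict` and `postprocess` callables are hypothetical placeholders for image loading, model inference and result post-processing. Each stage runs in its own thread, so loading page N+1 can overlap with predicting page N.

```python
import queue
import threading

_STOP = object()  # sentinel marking the end of the page stream


def run_pipeline(pages, load, predict, postprocess, maxsize=4):
    """Run load -> predict -> postprocess as three overlapping stages.

    All three callables are hypothetical stand-ins. Bounded queues keep
    memory use in check when one stage is faster than the next.
    """
    loaded = queue.Queue(maxsize=maxsize)
    predicted = queue.Queue(maxsize=maxsize)
    results = []

    def loader():
        # Stage 1: read and prepare pages (I/O and CPU bound).
        for page in pages:
            loaded.put(load(page))
        loaded.put(_STOP)

    def predictor():
        # Stage 2: run inference (ideally the only stage touching the GPU).
        while True:
            item = loaded.get()
            if item is _STOP:
                predicted.put(_STOP)
                return
            predicted.put(predict(item))

    def postprocessor():
        # Stage 3: turn raw predictions into results (CPU bound).
        while True:
            item = predicted.get()
            if item is _STOP:
                return
            results.append(postprocess(item))

    threads = [threading.Thread(target=stage)
               for stage in (loader, predictor, postprocessor)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

With one thread per stage and FIFO queues, results come back in page order. Error handling is left out of this sketch: an exception in one stage would leave the others waiting. Whether threads help in practice also depends on how much of each stage releases the GIL.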

However, 88617a2 (i.e. predicting and post-processing at lower pixel density – no more than 150 DPI) does help quite a bit already.
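The arithmetic behind capping the pixel density can be sketched as follows. This is not the code from 88617a2: the function names and the handling of unknown DPI are assumptions made for illustration. The idea is to shrink the image so that its density does not exceed 150 DPI, predict on the smaller image, and then map the resulting coordinates back to the original resolution.

```python
def downscale_factor(dpi, max_dpi=150):
    """Return the zoom factor (<= 1.0) that caps the density at max_dpi.

    Assumption: a non-positive dpi means "unknown", so no scaling is applied.
    """
    if dpi <= 0:
        return 1.0
    return min(1.0, max_dpi / dpi)


def scaled_size(width, height, factor):
    """Pixel size of the image after applying the zoom factor."""
    return max(1, round(width * factor)), max(1, round(height * factor))


def upscale_polygon(polygon, factor):
    """Map polygon points found on the scaled image back to the original."""
    return [(x / factor, y / factor) for x, y in polygon]


# Example: a 300 DPI page of 2480x3508 pixels is predicted at half size.
factor = downscale_factor(300)
small = scaled_size(2480, 3508, factor)
restored = upscale_polygon([(10, 20), (110, 20)], factor)
```

Halving the resolution cuts the pixel count to a quarter, which reduces the cost of every per-pixel step in prediction and post-processing.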