Optimized distribution of package for inference only
dantetemplar opened this issue · 3 comments
dantetemplar commented
I think it would be great to be able to add kraken to a project quickly and easily, but this is hampered by the need to install large dependencies: torch, the CUDA stack, and others. Perhaps using ONNX would help a lot.
mittagessen commented
ONNX doesn't play well with the variable-shape tensors our text recognition models need, and neither do any of the other graph compilation/capture approaches for pytorch (torchscript, coreml, ...), so there is currently no technically feasible way to run lightweight deployments.
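A toy sketch of the shape-specialization problem described above (hypothetical code, not kraken's; the shapes and the pretend "tracer" are made up for illustration). Text line images are typically normalized to a fixed height but keep their natural width, so every input has a different shape, which defeats exporters that bake the example input's concrete sizes into the captured graph:

```python
# Hypothetical illustration: why shape-specialized graph capture breaks
# on variable-width text line inputs.

def capture_graph(example_shape):
    """Pretend 'tracer' that specializes on the example's concrete shape,
    the way naive tracing/export bakes sizes into the exported graph."""
    height, width = example_shape

    def compiled_forward(image_shape):
        if image_shape != (height, width):
            raise ValueError(
                f"graph compiled for {(height, width)}, got {image_shape}"
            )
        # stand-in for a conv/pool stack that downsamples by 4 in each axis
        return (height // 4, width // 4)

    return compiled_forward

# Capture the graph on one text line ...
forward = capture_graph((48, 320))
print(forward((48, 320)))        # works: (12, 80)

# ... but the next line on the page has a different width.
try:
    forward((48, 417))
except ValueError as e:
    print("export fails:", e)
```

Exporters that support symbolic/dynamic dimensions avoid this particular failure, but (per the comment above) data-dependent behaviour in the recognition models still trips up the available capture approaches.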
dantetemplar commented
> ONNX doesn't play well with variable shape tensors as are needed for our text recognition models. Nor do any of the other graph compilation/capture approaches for pytorch (torchscript, coreml, ...) so there isn't currently a technically feasible way to run lightweight deployments.
Thanks for the answer, I understood regarding ONNX. Do you know any alternatives?
mittagessen commented
On 24/06/17 10:18AM, dantetemplar wrote:
> Thanks for the answer, I understood regarding ONNX. Do you know any alternatives?
All of those deployment frameworks suffer from this to some extent, as far as I know. In theory you can often construct the models manually from their primitives, e.g. in CoreML, but this is tedious, and the primitives often behave differently from native pytorch layers or are missing layer types entirely, so it is rarely a simple conversion. Compilers that rely on introspection usually blow up on variable-size tensors.