This project deploys an im2txt model in a Docker container.
To build an inference server into the container, we use the following stack:
- [nginx][nginx] is a lightweight layer that handles incoming HTTP requests and manages the I/O in and out of the container efficiently.
- [gunicorn][gunicorn] is a WSGI pre-forking worker server that runs multiple copies of your application and load balances between them.
- [flask][flask] is a simple web framework used in the inference app that you write. It lets you respond to calls on the `/ping` and `/invocations` endpoints without having to write much code (see the sketch after this list).
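To make the division of labor concrete, below is a minimal sketch of what such a Flask app might look like. Only the `/ping` and `/invocations` routes come from the description above; the `caption_image` stub is a hypothetical placeholder for the actual im2txt inference call.

```python
import flask

app = flask.Flask(__name__)


def caption_image(image_bytes):
    """Hypothetical placeholder for the actual im2txt inference call."""
    return "a placeholder caption"


@app.route("/ping", methods=["GET"])
def ping():
    # Health check: returning 200 signals that the container is up
    # and ready to serve requests.
    return flask.Response(response="\n", status=200, mimetype="application/json")


@app.route("/invocations", methods=["POST"])
def invocations():
    # Read the raw image bytes posted by the client and run inference.
    image_bytes = flask.request.get_data()
    caption = caption_image(image_bytes)
    return flask.Response(response=caption, status=200, mimetype="text/plain")
```

Under this stack, gunicorn rather than Flask's development server runs the app, e.g. `gunicorn --bind unix:/tmp/gunicorn.sock app:app` (the socket path here is illustrative), with nginx proxying incoming HTTP requests to that socket.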