This directory contains a Flask web server that hosts the TensorFlow Serving client. It receives REST requests for predictions whose payloads are serialized as JSON.
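As an illustration of what such a JSON-serialized prediction request might look like, the sketch below builds and serializes a payload. The field names (`model_name`, `image_b64`) and values are assumptions for illustration only; the actual request schema is defined by the Flask app.

```python
import json

# Hypothetical prediction request payload; the field names are assumptions,
# not the app's actual API.
payload = {
    "model_name": "inceptionv3",  # assumed field
    "image_b64": "aGVsbG8=",      # base64-encoded image bytes (dummy value)
}

# Serialize to JSON, as the REST client would before POSTing to the server.
body = json.dumps(payload)
print(body)
```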
`FLASK_ENV` can be one of `production` or `development`, and works in conjunction with the environment variable `FLASK_MODE`.

- `production` implies structured JSON logs and `INFO` verbosity for most loggers, with the assumption that the output will be fed to ELK.
- `development` implies developer-friendly line-by-line logging with verbosity set to `DEBUG`.
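The split between structured JSON logs and line-by-line developer logs could be wired up as in the sketch below. This uses only the stdlib `logging` module; the actual configuration done by the app or `run.sh` may differ.

```python
import json
import logging
import os

# Minimal structured-log formatter for the production/ELK case.
class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

# Select format and verbosity from FLASK_ENV, mirroring the behavior
# described above (an illustrative sketch, not the app's actual code).
env = os.environ.get("FLASK_ENV", "development")
handler = logging.StreamHandler()
if env == "production":
    handler.setFormatter(JsonFormatter())  # structured JSON lines for ELK
    level = logging.INFO
else:
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    level = logging.DEBUG

logging.basicConfig(level=level, handlers=[handler])
```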
`FLASK_MODE` can be one of `multiprocess` or `multithreaded`.

- `multiprocess` with `FLASK_ENV` set to `production` implies we run gunicorn with a preconfigured number of worker processes, each worker process handling requests asynchronously with eventlet.
- `multithreaded` with `FLASK_ENV` set to `production` implies we run a single worker process with multiple threads and the application performing production logging.
- `multithreaded` with `FLASK_ENV` set to `development` implies we run a single worker process with multiple threads and the application performing developer-friendly logging.
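The "preconfigured number of worker processes" is not specified here. A common gunicorn rule of thumb is `2 * CPUs + 1`; the sketch below computes that count, purely as an assumption about how `run.sh` might size the worker pool.

```python
import multiprocessing

def default_worker_count(cpus=None):
    """Common gunicorn heuristic: 2 * CPU count + 1 workers.

    This is an illustrative assumption; run.sh may configure the
    worker count differently.
    """
    cpus = cpus if cpus is not None else multiprocessing.cpu_count()
    return 2 * cpus + 1

print(default_worker_count())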
The following command runs the Flask application in development mode: a single multi-threaded worker process where each thread synchronously serves a single prediction request.
```sh
FLASK_ENV=development ./tf_serving_flask_app/run.sh -s /tmp/models/inceptionv3.spec
```
If you enable debug support, the server will reload itself on changes to source code rooted under `tf_serving_flask_app`.
```sh
FLASK_ENV=development FLASK_DEBUG=1 ./tf_serving_flask_app/run.sh -s /tmp/models/inceptionv3.spec
```
To profile requests, set `FLASK_PROFILE`:

```sh
FLASK_ENV=development FLASK_PROFILE=1 ./tf_serving_flask_app/run.sh -s /tmp/models/inceptionv3.spec
```

The screenshot below shows the overhead of the top 30 frames of a prediction request.
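A similar "top 30 frames" view can be reproduced standalone with the stdlib `cProfile` and `pstats` modules, as sketched below. How `FLASK_PROFILE` actually enables profiling inside the app is an assumption not covered here; `handle_request` is a hypothetical stand-in for serving a prediction.

```python
import cProfile
import io
import pstats

def handle_request():
    # Hypothetical stand-in for the work done while serving a prediction.
    return sum(i * i for i in range(10_000))

# Profile one simulated request.
profiler = cProfile.Profile()
profiler.enable()
handle_request()
profiler.disable()

# Report the 30 most expensive entries, mirroring the "top 30 frames" view.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(30)
report = buf.getvalue()
print(report[:200])
```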
Please note that `FLASK_DEBUG` has no effect when running in production mode.
To run in production mode:

```sh
# Default production run (multiprocess)
FLASK_ENV=production ./tf_serving_flask_app/run.sh -s /tmp/models/inceptionv3.spec

# Production run with profiling enabled
FLASK_ENV=production FLASK_PROFILE=1 ./tf_serving_flask_app/run.sh -s /tmp/models/inceptionv3.spec

# Production run with a single multithreaded worker
FLASK_ENV=production FLASK_MODE=multithreaded ./tf_serving_flask_app/run.sh -s /tmp/models/inceptionv3.spec
```
Please note that the docker build must be invoked from the root of the grace git repository, with the build context encapsulating both the Flask application and the source directory containing the generated spec protocol buffer files.
```sh
docker build -t <dockerhub-user>/tf-serving-rest-client -f tf_serving_flask_app/Dockerfile .
```
After building, push the image to Docker Hub with:

```sh
docker login
docker push <dockerhub-user>/tf-serving-rest-client
```