# How to run
- Download the archive with the model and extract its contents into the `model` folder in the root of the repository
- Run `docker compose up`
The service consists of four containers: `model-preparation`, `inference`, `database`, and `api`.
## Model preparation

This container converts the model to the ONNX format.
## Inference

This container deploys the newly created ONNX model to the Triton Inference Server. It only starts after the `model-preparation` container has finished successfully.
## Database

This container hosts a PostgreSQL server that stores the user IDs.
## API

This container exposes the `/predict` endpoint to the outside world, so we can submit a POST request with a JSON body containing `user_id` and `text` fields. It only starts after the `inference` and `database` containers are reported healthy.
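The startup ordering described above might be expressed in `docker-compose.yml` roughly as follows. This is a sketch, not the repository's actual file: the build contexts, image tags, volume paths, and healthcheck commands are all assumptions.

```yaml
services:
  model-preparation:
    build: ./model-preparation          # assumed build context
    volumes:
      - ./model:/model                  # extracted model archive mounted here

  inference:
    image: nvcr.io/nvidia/tritonserver:23.10-py3   # assumed Triton image/tag
    depends_on:
      model-preparation:
        condition: service_completed_successfully  # wait for conversion to finish
    volumes:
      - ./model:/models
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/v2/health/ready"]
      interval: 5s

  database:
    image: postgres:15                  # assumed PostgreSQL version
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s

  api:
    build: ./api                        # assumed build context
    ports:
      - "5000:5000"
    depends_on:
      inference:
        condition: service_healthy
      database:
        condition: service_healthy
```

The `condition` keys (`service_completed_successfully`, `service_healthy`) are what implement "only starts after" semantics in Compose.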
Example:

```shell
curl -X POST http://localhost:5000/predict \
  -H 'Content-Type: application/json' \
  -d '{"user_id": 1, "text": "57 year old man with pancreatitis, alcohol withdrawal, tachypnea"}'
```
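The same request can be made from Python. Below is a minimal sketch using only the standard library; the `predict` and `build_payload` helper names are illustrative, and the shape of the response JSON is an assumption, since the endpoint's output is not documented here.

```python
import json
import urllib.request

def build_payload(user_id: int, text: str) -> bytes:
    """Encode the JSON body the /predict endpoint expects."""
    return json.dumps({"user_id": user_id, "text": text}).encode("utf-8")

def predict(user_id: int, text: str,
            url: str = "http://localhost:5000/predict") -> dict:
    """POST to /predict and return the parsed JSON response (assumed shape)."""
    req = urllib.request.Request(
        url,
        data=build_payload(user_id, text),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

This mirrors the curl command above: same URL, same `Content-Type` header, same two-field JSON body.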