This repository contains code for running a machine learning model server using Docker and Python. The server serves predictions based on a trained logistic regression model. It also includes tests, a client server for manual testing, and additional files needed at runtime.
-
To run the server with the machine learning model, follow these steps:
- Build the Docker image using the provided Dockerfile
docker build -t ml-model-server .
- Run the Docker container to start the server on port 8080.
docker run -p 8080:8080 ml-model-server
- Build the Docker image using the provided Dockerfile
-
The tests are located in the
tests
folder and are executed via the CI/CD pipeline set up in thetest.yml
file located in theworkflows
folder (using GitHub Actions). To add new tests, create a new file in thetests
folder and add the relevant test cases. Currently, the filetests/test_server.py
is created for testing the server. -
To train a new model, create a folder named
data
and add a JSON file (data_1.json
) with the required data. (Note: The file format may be updated in the future.) -
The
client/client.py
file contains a client server that can be used for manual testing of the server. Modify theurl
,emotion
, andfeedback
variables to match your server's endpoint and the desired input values. -
The
models
folder contains additional information and files needed at runtime. Currently, the following files are included:
logreg_model.joblib
: A logistic regression model trained with the data.col_names.pk1
: Column names of the original dataset.
- The
src/model_dev.py
file demonstrates the model development process. It includes the following steps:
- Load the data using the
data_etl()
function from thesrc/util.py
file. - Convert categorical columns to one-hot encoded columns.
- Separate the input features and target variable.
- Create and train a logistic regression model.
- Predict the activity for a given emotion and feedback.
- Save the trained model and the column names for future use.
- Utility functions are stored in the
src/util.py
file. These functions include:
categorical_to_numeric_df()
: Converts categorical values to a numeric range.visualize()
: Visualizes the spread of user interactions.data_etl()
: Loads and preprocesses the data.
- The
src/model_evaluation.py
file is under development and contains code to evaluate the performance of different models.
Feel free to explore the repository and adapt it to your needs.