README

Description

This repository hosts a micro-service that predicts programming-language labels from text.

Context

This project was built as an example of deploying a machine learning model to production.

Maintainers

Prerequisites

  • Python 3.7
  • Conda

Installation

git clone https://github.com/rizerkrof/classification_codingLanguage.git
cd classification_codingLanguage
conda env create -n classification_codingLanguage -f environment.yml
conda activate classification_codingLanguage

Run the Flask micro-service application

python3 ./predict/predict/app.py

Call the application

The route is http://localhost:5000/predict. It is a POST route that expects a JSON body with two fields:

  • textsToPredict : a list of strings. Each string is a text you want a prediction for.
  • top_k : the maximum number of predictions to return.

Example

We want to predict the labels associated with “please predict ruby” and “now php”.

Start the application, then run the following command in another terminal:

curl -X POST -H "Content-Type: application/json" -d '{"textsToPredict": ["please predict ruby", "now php"], "top_k":2}' http://localhost:5000/predict

The expected response is:

["ruby-on-rails", "php"]
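The same call can be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_predict_request` and `predict` are hypothetical and not part of this repository, and the URL assumes the service is listening on port 5000 as in the curl command above.

```python
import json
from urllib import request

# Assumption: the service listens on port 5000, as in the curl example.
PREDICT_URL = "http://localhost:5000/predict"


def build_predict_request(texts, top_k, url=PREDICT_URL):
    """Build (but do not send) the POST request for the /predict route."""
    # The body carries the two fields the route expects.
    body = json.dumps({"textsToPredict": list(texts), "top_k": top_k}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(texts, top_k, url=PREDICT_URL, timeout=10):
    """Send the request and return the decoded JSON response."""
    req = build_predict_request(texts, top_k, url)
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the application running, `predict(["please predict ruby", "now php"], top_k=2)` should return the same response as the curl command.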

Run unit tests

At the root of the project:

Predict tests

python3 -m unittest discover --top-level-directory=. --start-directory=./predict/tests

Preprocessing tests

python3 -m unittest discover --top-level-directory=. --start-directory=./preprocessing/tests

Training tests

python3 -m unittest discover --top-level-directory=. --start-directory=./train/tests
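To run the three suites in one go, a small helper script can wrap the commands above. This is only a sketch: the names `TEST_DIRS`, `discover_command` and `run_all` are hypothetical and not part of this repository, and the script is assumed to be launched from the root of the project inside the activated conda environment.

```python
import subprocess
import sys

# The three test directories listed in this README, relative to the project root.
TEST_DIRS = ["./predict/tests", "./preprocessing/tests", "./train/tests"]


def discover_command(start_dir, top_level_dir="."):
    """Return the argument list for one `unittest discover` run."""
    return [
        sys.executable, "-m", "unittest", "discover",
        f"--top-level-directory={top_level_dir}",
        f"--start-directory={start_dir}",
    ]


def run_all(test_dirs=TEST_DIRS):
    """Run every suite in turn and return the directories whose run failed."""
    failed = []
    for start_dir in test_dirs:
        result = subprocess.run(discover_command(start_dir))
        if result.returncode != 0:
            failed.append(start_dir)
    return failed
```

Calling `run_all()` returns an empty list when all three suites pass.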