labelstudio_ml_backend_simple_text_classifier

Label Studio classifier v1. It can serve as a pretrained model service for the ML backend.


Quickstart

Build and start Machine Learning backend on http://localhost:9090

docker build -t ml_backend_simple_text_classifier .

docker-compose up

Or run the prebuilt image from Docker Hub:

docker run -d -it -p 9090:9090 -v mldata:/app/ napoler/labelstudio_ml_backend_simple_text_classifier:latest

Build and test

docker build -t condatest .

Test run

Run a quick smoke test inside the container:

docker run --rm -it napoler/labelstudio_ml_backend_simple_text_classifier:latest label-studio-ml start /app

Run it with the port published on all interfaces:

docker run --rm -it -p 0.0.0.0:9090:9090 napoler/labelstudio_ml_backend_simple_text_classifier:latest label-studio-ml start /app

Or start the backend directly on the host:

label-studio-ml start /app --host=0.0.0.0 --port 9091

Check if it works:

$ curl http://localhost:9090/health
{"status":"UP"}

Then connect the running backend to Label Studio:

label-studio start --init new_project --ml-backends http://localhost:9090 --template image_classification

Writing your own model

  1. Place your scripts for model training & inference inside the root directory. Follow the API guidelines described below. You can put everything in a single file, or create two separate ones, say my_training_module.py and my_inference_module.py

  2. List your Python dependencies in requirements.txt

  3. Open wsgi.py and configure the init_model_server arguments:

    from my_training_module import training_script
    from my_inference_module import InferenceModel
    
    init_model_server(
        create_model_func=InferenceModel,
        train_script=training_script,
        ...
    )
  4. Make sure you have docker & docker-compose installed on your system, then run

    docker-compose up --build
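
This step expects a docker-compose.yml in the project root. If you need a starting point, here is a minimal sketch (the service name, published port, and volume below are assumptions based on the docker commands above, not the file shipped with this repository):

version: "3"
services:
  ml_backend:
    build: .
    image: ml_backend_simple_text_classifier
    ports:
      - "9090:9090"
    volumes:
      - mldata:/app/
volumes:
  mldata: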

API guidelines

Inference module

To create a module for inference, declare the following class:

from htx.base_model import BaseModel

# use BaseModel inheritance provided by pyheartex SDK 
class MyModel(BaseModel):
    
    # Describe input types (Label Studio object tags names)
    INPUT_TYPES = ('Image',)

    # Describe output types (Label Studio control tags names)
    OUTPUT_TYPES = ('Choices',)

    def load(self, resources, **kwargs):
        """Here you load the model into the memory. resources is a dict returned by training script"""
        self.model_path = resources["model_path"]
        self.labels = resources["labels"]

    def predict(self, tasks, **kwargs):
        """Here you create list of model results with Label Studio's prediction format, task by task"""
        predictions = []
        for task in tasks:
            # run inference here and build task_prediction in Label Studio's format...
            predictions.append(task_prediction)
        return predictions
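
Each item appended to predictions must follow Label Studio's prediction format. A minimal sketch for a Choices output could look like this (the from_name/to_name values are assumptions and must match the tag names in your labeling config):

task_prediction = {
    'result': [{
        'from_name': 'choice',  # name of your <Choices> control tag (assumed)
        'to_name': 'image',     # name of your <Image> object tag (assumed)
        'type': 'choices',
        'value': {'choices': ['Cat']},
    }],
    'score': 0.95,  # overall confidence of this prediction
}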

Training module

Training can run in a separate environment. The only convention is that the data iterator and the working directory are passed as input arguments to the training function, which returns JSON-serializable resources consumed later by the load() function of the inference module.

def train(input_iterator, working_dir, **kwargs):
    """Here you gather input examples and output labels and train your model"""
    resources = {"model_path": "some/model/path", "labels": ["aaa", "bbb", "ccc"]}
    return resources
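
For a concrete starting point, here is a minimal sketch of such a training function for a simple text classifier, assuming the input iterator yields (text, label) pairs and that scikit-learn and joblib are listed in requirements.txt (the iterator shape and library choice are assumptions, not part of the pyheartex API):

import os

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def train(input_iterator, working_dir, **kwargs):
    """Collect (text, label) pairs and fit a TF-IDF + logistic regression model."""
    texts, labels = [], []
    for text, label in input_iterator:  # assumed iterator shape
        texts.append(text)
        labels.append(label)

    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(texts, labels)

    # Persist the fitted model inside the working directory
    model_path = os.path.join(working_dir, 'model.joblib')
    joblib.dump(model, model_path)

    # Return only JSON-serializable resources; load() reads them back
    return {'model_path': model_path, 'labels': sorted(set(labels))}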