MLBazaar/MLPrimitives

Add MLPipeline as a primitive

csala opened this issue · 0 comments

Create the mlblocks.MLPipeline primitive so that a whole pipeline can be used as a single primitive inside another pipeline.
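To make the idea concrete, here is a minimal sketch of what such a wrapper could look like. It is only an illustration: `PipelinePrimitive` and `MeanPipeline` are hypothetical names, and the sketch assumes the wrapped pipeline exposes `fit(X, y)` and `predict(X)` and that the primitive exposes `fit` and `produce`.

```python
class PipelinePrimitive:
    """Hypothetical adapter exposing a whole pipeline as one fit/produce step."""

    def __init__(self, pipeline):
        # Assumption: `pipeline` is any object with fit(X, y) and predict(X).
        self._pipeline = pipeline

    def fit(self, X, y=None):
        # Fitting the primitive fits every step of the wrapped pipeline.
        self._pipeline.fit(X, y)

    def produce(self, X):
        # Producing delegates to the wrapped pipeline's prediction.
        return self._pipeline.predict(X)


class MeanPipeline:
    """Stand-in for a real pipeline: predicts the mean of the training targets."""

    def fit(self, X, y):
        self._mean = sum(y) / len(y)

    def predict(self, X):
        return [self._mean for _ in X]


step = PipelinePrimitive(MeanPipeline())
step.fit([[0], [1], [2]], [1.0, 2.0, 3.0])
print(step.produce([[5], [6]]))  # [2.0, 2.0]
```

The outer pipeline only ever sees `fit` and `produce`, so it does not need to know that the step is itself a pipeline.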

The primitive arguments will be the mlblocks.MLPipeline arguments, which means that the sub-pipeline can be specified in any of the following ways:

  • A pipeline name
  • A path to a JSON file
  • A pipeline dictionary
  • Individual arguments
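One way the four forms above could be normalized into a single dictionary is sketched below. This is not MLBlocks code: `resolve_pipeline_spec` is a hypothetical helper, looking up a pipeline by name is left out, and the rule that individual arguments override loaded values is an assumption made for the sketch.

```python
import json
import os


def resolve_pipeline_spec(pipeline=None, **kwargs):
    """Hypothetical helper: turn any of the four forms into one dict."""
    if pipeline is None:
        # Only individual arguments were given.
        spec = {}
    elif isinstance(pipeline, dict):
        # A pipeline dictionary is used as given (copied to avoid mutation).
        spec = dict(pipeline)
    elif isinstance(pipeline, str) and os.path.isfile(pipeline):
        # A path to an existing JSON file is loaded from disk.
        with open(pipeline) as json_file:
            spec = json.load(json_file)
    elif isinstance(pipeline, str):
        # Any other string is treated as a pipeline name; resolving the
        # name to an actual pipeline is outside the scope of this sketch.
        spec = {'name': pipeline}
    else:
        raise TypeError('Unsupported pipeline specification: {!r}'.format(pipeline))

    # Assumption: individual arguments override whatever was loaded.
    spec.update({key: value for key, value in kwargs.items() if value is not None})
    return spec


print(resolve_pipeline_spec('my-preprocessing-pipeline'))
print(resolve_pipeline_spec(primitives=['sklearn.preprocessing.StandardScaler']))
```

Checking for an existing file before falling back to a name means a string is never interpreted as both.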

Some usage examples:

Python pipeline specification - specify the sub-pipeline by its name

from mlblocks import MLPipeline

primitives = [
    'mlblocks.MLPipeline',
    'xgboost.XGBClassifier'
]
init_params = {
    'mlblocks.MLPipeline#1': {
        'pipeline': 'my-preprocessing-pipeline'
    },
}
pipeline = MLPipeline(primitives, init_params=init_params)
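The `'mlblocks.MLPipeline#1'` key follows the `name#N` pattern visible in these examples, where N counts how many times the same primitive has appeared so far. A small sketch of that numbering, assuming the counter starts at 1 for each distinct primitive (`block_names` is an illustrative helper, not part of MLBlocks):

```python
from collections import Counter


def block_names(primitives):
    """Return one 'name#N' key per primitive, numbering repeats from 1."""
    counts = Counter()
    names = []
    for primitive in primitives:
        counts[primitive] += 1
        names.append('{}#{}'.format(primitive, counts[primitive]))
    return names


print(block_names(['mlblocks.MLPipeline', 'xgboost.XGBClassifier']))
# ['mlblocks.MLPipeline#1', 'xgboost.XGBClassifier#1']
```

With three `mlblocks.MLPipeline` entries in a row, the same helper would yield the `#1`, `#2` and `#3` suffixes used as `init_params` keys.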

Pipeline dictionary - specify the sub-pipelines by name, by a list of primitives, and by JSON path

pipeline_dict = {
    'primitives': [
        'mlblocks.MLPipeline',
        'mlblocks.MLPipeline',
        'mlblocks.MLPipeline',
    ],
    'init_params': {
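        # First sub-pipeline: referenced by its name.
        'mlblocks.MLPipeline#1': {
            'pipeline': 'a-preprocessing-pipeline',
        },
        # Second sub-pipeline: built from an explicit list of primitives.
        'mlblocks.MLPipeline#2': {
            'primitives': [
                'mlprimitives.feature_extraction.CategoricalEncoder',
                'sklearn.preprocessing.StandardScaler',
            ],
        },
        # Third sub-pipeline: loaded from a JSON file path.
        'mlblocks.MLPipeline#3': {
            'primitives': [
                'path/to/my/postprocessing/pipeline.json',
            ],
        },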
    },
}