photonai: A Python repository from AayushGrover

PHOTONAI is a high-level Python API for designing and optimizing machine learning pipelines.

We have created a system in which you can easily select and combine both pre-processing and learning algorithms from state-of-the-art machine learning toolboxes, and arrange them in simple or parallel pipeline data streams.

In addition, you can parametrize your training and testing workflow by choosing cross-validation schemes, performance metrics and hyperparameter optimization metrics from a list of pre-registered options.

Importantly, you can integrate custom solutions not only into your data processing pipeline but also into any part of the model training and evaluation process, including custom hyperparameter optimization strategies.

For a detailed description, visit our website and read the documentation, or read an extended introduction on arXiv.

Getting Started

To use PHOTONAI, you only need to have your favourite Python IDE ready. Then install the latest stable version via pip:

pip install photonai
# Or try out the latest features if you don't rely on a stable version, using:
pip install --upgrade git+https://github.com/wwu-mmll/photonai.git@develop

You can set up a full-stack machine learning pipeline in a few lines of code:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import KFold

from photonai.base import Hyperpipe, PipelineElement, OutputSettings
from photonai.optimization import FloatRange, Categorical, IntegerRange

# DESIGN YOUR PIPELINE
my_pipe = Hyperpipe('basic_svm_pipe',  # the name of your pipeline
                    # which optimizer PHOTON shall use
                    optimizer='sk_opt',
                    optimizer_params={'n_configurations': 10},
                    # the performance metrics of your interest
                    metrics=['accuracy', 'precision', 'recall', 'balanced_accuracy'],
                    # after hyperparameter optimization, this metric declares the winner config
                    best_config_metric='accuracy',
                    # repeat hyperparameter optimization three times
                    outer_cv=KFold(n_splits=3),
                    # test each configuration five times
                    inner_cv=KFold(n_splits=5),
                    verbosity=1,
                    output_settings=OutputSettings(project_folder='./tmp/'))


# first normalize all features
my_pipe.add(PipelineElement('StandardScaler'))

# then reduce dimensionality using PCA
my_pipe += PipelineElement('PCA',
                           hyperparameters={'n_components': IntegerRange(5, 20)},
                           test_disabled=True)

# engage and optimize the good old SVM for Classification
my_pipe += PipelineElement('SVC',
                           hyperparameters={'kernel': Categorical(['rbf', 'linear']),
                                            'C': FloatRange(0.5, 2)},
                           gamma='scale')

# train pipeline
X, y = load_breast_cancer(return_X_y=True)
my_pipe.fit(X, y)

Features

Easy access to established ML implementations

We pre-registered diverse preprocessing and learning algorithms from state-of-the-art toolboxes, e.g. scikit-learn, Keras and imbalanced-learn, which you can choose from to rapidly build custom pipelines.

Hyperparameter Optimization

With PHOTONAI you can seamlessly switch between diverse hyperparameter optimization strategies, such as (random) grid search or Bayesian optimization ([scikit-optimize](https://www.photon-ai.com/documentation/content-guide/skopt), smac3).
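To make the difference between grid search and random grid search concrete, here is a minimal sketch that uses only the Python standard library. It is not PHOTONAI's implementation: the discretized search space and the function names `grid_search` and `random_grid_search` are illustrative assumptions, loosely modeled on the SVC hyperparameters from the Getting Started example.

```python
import itertools
import random

# Hypothetical, discretized search space (an assumption for illustration).
search_space = {
    "kernel": ["rbf", "linear"],
    "C": [0.5, 1.0, 1.5, 2.0],
}


def grid_search(space):
    """Yield every combination of the listed hyperparameter values."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def random_grid_search(space, n_configurations, seed=0):
    """Return a random subset of the full grid, without repeats."""
    grid = list(grid_search(space))
    rng = random.Random(seed)
    return rng.sample(grid, min(n_configurations, len(grid)))


all_configs = list(grid_search(search_space))
sampled_configs = random_grid_search(search_space, n_configurations=3)
print(len(all_configs), "grid configurations,", len(sampled_configs), "sampled")
```

Grid search evaluates all eight combinations, while random grid search caps the budget at a fixed number of configurations. Bayesian strategies go one step further and use the scores of earlier configurations to decide which one to try next.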

Extended ML Pipeline

You can build custom sequences of processing and learning algorithms with a simple syntax. PHOTONAI offers extended pipeline functionality such as parallel sequences, custom callbacks in between pipeline elements, AND and OR operations, as well as the possibility to flexibly position data augmentation, class balancing or learning algorithms anywhere in the pipeline.

Model Sharing

PHOTONAI provides a standardized format for sharing and loading optimized pipelines across platforms with only one line of code.

Automation

While you concentrate on selecting appropriate processing steps, learning algorithms, hyperparameters and training parameters, PHOTONAI automates the nested cross-validated optimization and evaluation loop for any custom pipeline.
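For readers unfamiliar with nested cross-validation, the following is a rough, standard-library-only sketch of the loop structure being automated. It is not PHOTONAI's code: the toy 1-D data, the k-nearest-neighbour vote, and the helper names `k_fold`, `knn_accuracy` and `nested_cv` are all assumptions made for illustration. The split counts mirror the `outer_cv=KFold(n_splits=3)` and `inner_cv=KFold(n_splits=5)` settings from the Getting Started example.

```python
from statistics import mean


def k_fold(n_samples, n_splits):
    """Yield (train, test) index lists for contiguous, unshuffled folds."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        stop = start + fold_size + (1 if fold < remainder else 0)
        yield indices[:start] + indices[stop:], indices[start:stop]
        start = stop


def knn_accuracy(k, X, y, train, test):
    """Accuracy of a 1-D k-nearest-neighbour vote using `train`, scored on `test`."""
    hits = 0
    for i in test:
        neighbours = sorted(train, key=lambda j: abs(X[j] - X[i]))[:k]
        votes = sum(y[j] for j in neighbours)
        prediction = int(2 * votes > k)  # majority vote over binary labels
        hits += prediction == y[i]
    return hits / len(test)


def nested_cv(X, y, candidates, outer_splits=3, inner_splits=5):
    """Return one (best hyperparameter, test accuracy) pair per outer fold."""
    results = []
    for outer_train, outer_test in k_fold(len(X), outer_splits):
        # Inner loop: score each candidate on validation folds drawn
        # from the outer training set only.
        inner_scores = {}
        for k in candidates:
            fold_scores = []
            for train_pos, val_pos in k_fold(len(outer_train), inner_splits):
                inner_train = [outer_train[p] for p in train_pos]
                inner_val = [outer_train[p] for p in val_pos]
                fold_scores.append(knn_accuracy(k, X, y, inner_train, inner_val))
            inner_scores[k] = mean(fold_scores)
        best_k = max(inner_scores, key=inner_scores.get)
        # Outer loop: evaluate the winning candidate on the untouched test fold.
        results.append((best_k, knn_accuracy(best_k, X, y, outer_train, outer_test)))
    return results


# Toy data (an assumption): a permutation of 0..29, labelled by a threshold.
X = [(i * 7) % 30 for i in range(30)]
y = [int(x >= 15) for x in X]

results = nested_cv(X, y, candidates=[1, 3, 5])
for fold, (best_k, score) in enumerate(results, start=1):
    print(f"outer fold {fold}: best k = {best_k}, test accuracy = {score:.2f}")
```

The key point is that the outer test fold never influences hyperparameter selection, so the outer scores give a less optimistic estimate of how the tuned pipeline generalizes.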

Results Visualization

PHOTONAI comes with extensive logging of all information in the training, testing and hyperparameter optimization process. In addition, optimum performances and the hyperparameter optimization progress are visualized in the PHOTONAI Explorer.

Examples

How to Handle Imbalanced Classes

Learn how to handle class imbalance in your custom pipeline, based on the popular imbalanced-learn library. See example code.

Competing Learning Algorithms

If you are wondering which learning algorithm fits your data best, let the hyperparameter optimization strategy compare different learning algorithms using the OR-Element. See example code.

Save and share the optimized model

Learn how to share your PHOTONAI model with the world, receive external validation and let other people use it in two lines of code. See example code.
