patryk.laskowski@ibm.com


1a. Python PMML supporting libraries

But it supports only PMML 4.1,
and our models are PMML 4.3.

❌ REJECTED
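
A version mismatch like the one above can be detected before a library is even tried. The sketch below reads the `version` attribute from the root `PMML` element using only the Python standard library. The helper names `pmml_version` and `is_supported` and the sample document are hypothetical and exist only for illustration; they are not part of any library evaluated here.

```python
import xml.etree.ElementTree as ET


def pmml_version(pmml_text):
    """Return the `version` attribute of the root PMML element."""
    root = ET.fromstring(pmml_text)
    # The root tag is namespaced, e.g. "{http://www.dmg.org/PMML-4_3}PMML",
    # so strip the namespace before comparing.
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name != "PMML":
        raise ValueError(f"not a PMML document: root element is {local_name!r}")
    version = root.get("version")
    if version is None:
        raise ValueError("PMML root element has no version attribute")
    return version


def is_supported(pmml_text, supported=("4.1",)):
    """Check the document version against the versions a library accepts.

    The default mirrors the rejected library above (PMML 4.1 only).
    """
    return pmml_version(pmml_text) in supported


# Hypothetical minimal document, for illustration only.
SAMPLE_PMML = (
    '<PMML version="4.3" xmlns="http://www.dmg.org/PMML-4_3">'
    "<Header/></PMML>"
)

print(pmml_version(SAMPLE_PMML))
print(is_supported(SAMPLE_PMML))
```

For the sample document the check against a 4.1-only library fails, which matches the reason for the rejection above.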

GitHub page.
But it is hard to set up.

❌ REJECTED

But there is no free license.

❌ REJECTED

But it parses PMML files and converts them to sklearn KMeans models.

❌ REJECTED

But it supports only a few models (Decision Trees, Random Forests (ensemble method), Linear Models (specifically: Linear Regression, Ridge, Lasso, ElasticNet), Gaussian Naive Bayes).
Almost good...

❌ REJECTED

Which is simple, supports almost all models on all PMML versions and works!

✅ ACCEPTED


1b. R PMML supporting libraries

GitHub Supported models
Supports a wide range of model types (Anomaly Detection, Clustering, K Nearest Neighbors, Linear Models, etc.)

But this is a PMML reader for the R language.

This package contains functions to export various machine learning and statistical models to PMML, as well as generate data transformations in PMML format.

❌ REJECTED

R package for converting R models to PMML. Requires Java.

❌ REJECTED


1c. Java PMML supporting libraries

Java Evaluator API for Predictive Model Markup Language (PMML)

Supported versions: PMML versions 3.0, 3.1, 3.2, 4.0, 4.1, 4.2, 4.3 and 4.4.
Wide range of models supported (Association rules, Cluster model, General regression, Naive Bayes, k-Nearest neighbors, Neural network, etc.)

✅ ACCEPTED



2. Python PyPmml [main solution]

What is PyPmml?

PyPMML is a Python PMML scoring library; it is in fact the Python API for PMML4S.

What is PMML4S?

PMML4S is a PMML (Predictive Model Markup Language) scoring library for Scala. It provides both Scala and Java Evaluator API for PMML. PMML4S is a lightweight, clean and efficient implementation based on the PMML specification from 2.0 through to the latest 4.4.

Example PyPmml code.

The code below loads a PMML decision tree model and makes a prediction. The field names are Polish: Wzrost means height and Waga means weight.

>>> from pypmml import Model

>>> model = Model.load('Waga.xml')
>>> model.predict({'Wzrost': 180})
{'$RI-Waga': 24, '$R-Waga': 65.32}
>>> model.inputNames
['Wzrost']
>>> model.predict([180])
[65.32, 24]
>>> model.outputNames
['$R-Waga', '$RI-Waga']

Input data can also be passed as a pandas DataFrame.

>>> import pandas as pd

>>> data = pd.DataFrame({
...     'Wzrost' : [180, 170, 160]})

>>> model.predict(data)
     $R-Waga  $RI-Waga
0  65.320000        24
1  57.190909        40
2  51.010526        17
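
When scoring many records it can help to check the input fields before calling the model. The sketch below is one possible wrapper, not part of PyPMML: it relies only on the `inputNames` attribute and the dict form of `predict` shown above. The function `score_records` and the `StubModel` class are hypothetical; the stub stands in for a loaded model so the example runs without PyPMML or Java, and its arithmetic is made up.

```python
def score_records(model, records):
    """Score a list of dicts, checking the required input fields first."""
    required = list(model.inputNames)
    results = []
    for i, record in enumerate(records):
        missing = [name for name in required if name not in record]
        if missing:
            raise KeyError(f"record {i} is missing input field(s): {missing}")
        # Pass only the fields that the model declares as inputs.
        results.append(model.predict({name: record[name] for name in required}))
    return results


class StubModel:
    """Hypothetical stand-in for a model returned by Model.load(...)."""

    inputNames = ['Wzrost']

    def predict(self, record):
        # Made-up arithmetic, only to give the stub a deterministic output.
        return {'$R-Waga': record['Wzrost'] * 0.5}


rows = [{'Wzrost': 180, 'comment': 'extra field is ignored'}, {'Wzrost': 160}]
print(score_records(StubModel(), rows))
```

With a real model, `StubModel()` would be replaced by the object returned from `Model.load('Waga.xml')`.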

3. Support for model types

Note

In this project the PMML models will be produced in SAS.
The table below compares all model types defined by PMML with the models that SAS can export in PMML format and with the models supported by the PyPMML Python library and by JPMML-Evaluator.


| Idx | All PMML models | SAS Enterprise Miner [PMML Producer] | SAS Enterprise Miner 14.3 [PMML Consumer/Producer] | PyPMML [Python Library] | JPMML-Evaluator [Java Evaluator] |
|---|---|---|---|---|---|
| 1 | AnomalyDetectionModel | - | - | ✅ Anomaly Detection Models | - |
| 2 | AssociationModel | ✅ Association Rules | - | ✅ Association Rules | ✅ Association rules |
| 3 | BayesianNetworkModel | - | - | ❌ Bayesian Network | - |
| 4 | BaselineModel | - | ✅ Baseline models [???] | ❌ Baseline Models | - |
| 5 | ClusteringModel | ✅ Clustering | ✅ Clustering models | ✅ Cluster Models | ✅ Cluster model |
| 6 | GaussianProcessModel | - | - | ❌ Gaussian Process | - |
| 7 | GeneralRegressionModel | - | - | ✅ General Regression | ✅ General regression |
| 8 | MiningModel | - | - | - | |
| 9 | NaiveBayesModel | - | ✅ Naïve Bayes [PMML Consumer] | ✅ Naive Bayes | ✅ Naive Bayes |
| 10 | NearestNeighborModel | - | - | ✅ k-Nearest Neighbors | ✅ k-Nearest neighbors |
| 11 | NeuralNetwork | ✅ Neural Networks | ✅ Neural Networks | ✅ Neural Network | ✅ Neural Networks |
| 12 | RegressionModel | ✅ Linear Regression, ✅ Logistic Regression | ✅ Regression | ✅ Regression | ✅ Regression |
| 13 | RuleSetModel | - | - | ✅ Ruleset | ✅ Rule set |
| 14 | SequenceModel | - | - | ❌ Sequences | - |
| 15 | Scorecard | - | ✅ Scorecard [PMML Consumer] | ✅ Scorecard | ✅ Scorecard |
| 16 | SupportVectorMachineModel | - | ✅ Vector Machine [PMML Consumer] | ✅ Vector Machine | ✅ Support Vector Machine |
| 17 | TextModel | - | - | ❌ Text Models | - |
| 18 | TimeSeriesModel | - | - | ❌ Time Series | ✅ Time Series |
| 19 | TreeModel | ✅ Decision Trees | ✅ Trees | ✅ Trees | ✅ Tree Model |
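
The PyPMML column of the table above can be turned into a quick pre-check for an incoming file. The sketch below uses only the standard library to find the model element inside a PMML document and looks it up in two sets copied from that column. The names `PYPMML_SUPPORTED`, `PYPMML_UNSUPPORTED`, `pypmml_status` and the sample document are hypothetical; the sets reflect this table only, not an authoritative list from the PyPMML project.

```python
import xml.etree.ElementTree as ET

# Model element names marked with a check mark in the PyPMML column above.
PYPMML_SUPPORTED = {
    "AnomalyDetectionModel", "AssociationModel", "ClusteringModel",
    "GeneralRegressionModel", "NaiveBayesModel", "NearestNeighborModel",
    "NeuralNetwork", "RegressionModel", "RuleSetModel", "Scorecard",
    "SupportVectorMachineModel", "TreeModel",
}
# Model element names marked with a cross in the PyPMML column above.
PYPMML_UNSUPPORTED = {
    "BayesianNetworkModel", "BaselineModel", "GaussianProcessModel",
    "SequenceModel", "TextModel", "TimeSeriesModel",
}
# MiningModel has no mark in the table, so it is reported separately.
KNOWN_MODELS = PYPMML_SUPPORTED | PYPMML_UNSUPPORTED | {"MiningModel"}


def pypmml_status(pmml_text):
    """Map each top-level model element in the document to its table status."""
    root = ET.fromstring(pmml_text)
    status = {}
    for child in root:
        name = child.tag.rsplit("}", 1)[-1]  # drop the XML namespace
        if name not in KNOWN_MODELS:
            continue  # Header, DataDictionary and similar elements
        if name in PYPMML_SUPPORTED:
            status[name] = "supported"
        elif name in PYPMML_UNSUPPORTED:
            status[name] = "unsupported"
        else:
            status[name] = "not stated in the table"
    return status


# Hypothetical skeleton document, for illustration only.
SAMPLE_TREE_PMML = (
    '<PMML version="4.3" xmlns="http://www.dmg.org/PMML-4_3">'
    "<Header/><DataDictionary/><TreeModel/></PMML>"
)

print(pypmml_status(SAMPLE_TREE_PMML))
```

Keeping the lookup in plain sets makes it easy to update when the table changes.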

3a. All PMML models

The PMML documentation lists these models as supported by the PMML format.

[Image: PMML model types listed in the documentation]


3b. SAS Enterprise Miner

According to this website, SAS Enterprise Miner can produce these models:

[Image: models that SAS Enterprise Miner can produce]


3c. SAS Enterprise Miner 14.3 PMML Support

Website. Look for "Supported PMML Models".

SAS PROC PSCORE supports the following types of PMML models:

[Image: PMML model types supported by SAS PROC PSCORE]

This means that SAS can, in general, handle either import or export of the models above.


3d. Python PyPmml

Supports the following models:

[Image: models supported by PyPMML]


3e. Java jpmml-evaluator

Can evaluate the following models:

[Image: models supported by jpmml-evaluator]


Summary

PyPMML is a good solution that supports a wide range of PMML versions and all desired models except Baseline Models. It looks like a great choice for this project!

