Feedzai OpenML Provider for R

This repository contains implementations of the Feedzai OpenML API that add support for machine learning models written in the R programming language, using Rserve.

Modules

Generic R

The openml-generic-r module contains a provider that lets developers load R code that conforms to a simple API. This is the most powerful approach, because models can hold state, but it is also the more cumbersome one.

The provider can be pulled from Maven Central:

<dependency>
  <groupId>com.feedzai</groupId>
  <artifactId>openml-generic-r</artifactId>
  <!-- See project tags for latest version -->
  <version>0.4.0</version>
</dependency>

Caret

The implementation in the openml-caret module adds support for models built with Caret.

This module can be pulled from Maven Central:

<dependency>
  <groupId>com.feedzai</groupId>
  <artifactId>openml-caret</artifactId>
  <!-- See project tags for latest version -->
  <version>0.4.0</version>
</dependency>

Building

This is a Maven project, which you can build using:

mvn clean install
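If R is not yet set up on the build machine, the build can still be run with tests skipped. The sketch below assumes that the test suite needs a local R installation (which the prerequisites for running tests suggest) and uses Maven's standard `-DskipTests` flag. The function name `build_openml_r` is hypothetical and not part of the project.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the project): run the full build when
# Rscript is on the PATH, otherwise build with tests skipped.
build_openml_r() {
    if command -v Rscript >/dev/null 2>&1; then
        mvn clean install
    else
        echo "Rscript not found: building with tests skipped"
        mvn clean install -DskipTests
    fi
}

# Usage: build_openml_r
```

Skipping tests only produces the artifacts; it says nothing about whether the providers work against a real R installation.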

Prerequisites for running tests

To use these providers, you need R (from the R Project) installed in your environment. After installing R, install the R packages that the provider uses; the easiest way is to install them from CRAN.

Note that this section only describes the known prerequisites that are common to any model generated in R. Before importing a model you need to ensure that the required packages for that model are also installed.

Finally you must install Rserve.
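Before importing a model, it can help to confirm which R packages are already present. A minimal shell sketch, assuming `Rscript` is on the PATH whenever R is installed; the function name `check_r_packages` is hypothetical, and the package names passed to it are only examples.

```shell
#!/bin/sh
# Hypothetical helper: reports whether each named R package is installed.
# Returns 2 when Rscript is unavailable, 1 when any package is missing,
# and 0 when all packages are found.
check_r_packages() {
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "R not found: install R before checking packages"
        return 2
    fi
    missing=0
    for pkg in "$@"; do
        # requireNamespace() returns FALSE when the package cannot be loaded.
        if Rscript -e "quit(status = if (requireNamespace('$pkg', quietly = TRUE)) 0 else 1)" >/dev/null 2>&1; then
            echo "installed: $pkg"
        else
            echo "missing: $pkg"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: check_r_packages caret Rserve
```

Any package that a specific model depends on can be added to the argument list.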

Example on CentOS 7:

Execute the following bash commands:

# repo that has R
yum -y install epel-release;

# needed for R dependencies
yum -y install libcurl-devel openssl-devel gsl-devel libwebp-devel librsvg2-devel R;

# start R
R

Execute the following R instructions:

# Install caret
install.packages("caret", dependencies=TRUE, repos = "http://cran.radicaldevelop.com/")

# Install all classification model implementations
# https://topepo.github.io/caret/available-models.html
# https://github.com/tobigithub/caret-machine-learning/wiki/caret-ml-setup
library(caret)
modNames <- unique(modelLookup()[modelLookup()$forClass, c(1)])
install.packages(modNames, dependencies=TRUE, repos = "http://cran.radicaldevelop.com/")

# Install Rserve (needed for Pulse <-> R communication)
install.packages("Rserve", dependencies=TRUE, repos = "http://cran.radicaldevelop.com/")
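Once Rserve is installed, it has to be running before a client can connect to it. A minimal sketch, assuming that the providers talk to a running Rserve instance and that the Rserve package installed the `R CMD Rserve` launcher (the usual case on Linux). Rserve listens on port 6311 by default. The function name `start_rserve` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: start Rserve as a background daemon if R is available.
start_rserve() {
    if ! command -v R >/dev/null 2>&1; then
        echo "R not found: cannot start Rserve"
        return 1
    fi
    # R CMD Rserve launches the server; --no-save is passed through to R
    # so that no workspace is written on exit.
    R CMD Rserve --no-save
}

# Usage: start_rserve
```

If the launcher is not available in your setup, calling `Rserve()` from an R session after `library(Rserve)` is the other common way to start the server.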

Docker

Feedzai provides a Docker image for testing, available on Docker Hub, which this repository's continuous integration uses. See the commands in the Travis CI configuration for how to use it.