Interpretable Machine Learning with rsparkling

Primary language: R. License: Apache License 2.0.

Presented at the 2019 Symposium on Data Science and Statistics

The rsparkling R package extends sparklyr (an R interface for Apache Spark) with an R front-end for Sparkling Water, H2O’s Spark package. It serves as a connector between sparklyr and H2O’s high-performance, distributed machine learning algorithms, making them available on Spark from R.
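As a rough illustration of that connector role, the sketch below copies a small data set into Spark with sparklyr, hands it to H2O through rsparkling, and fits a linear model whose coefficients can be read directly. It is not taken from the talk. It assumes a local Spark installation, a Sparkling Water version matching the installed Spark and h2o packages, and the rsparkling API from around the time of the presentation (`h2o_context()` and `as_h2o_frame()`); newer rsparkling releases replace these with `H2OContext.getOrCreate()` and `hc$asH2OFrame()`. The choice of `mtcars` and of the predictor columns is purely illustrative.

```r
library(sparklyr)
library(rsparkling)
library(h2o)

# Assumption: Spark is installed locally and is compatible with the
# Sparkling Water version that rsparkling is configured to use.
sc <- spark_connect(master = "local")

# Copy an R data frame into Spark; the result is a Spark table reference.
cars_tbl <- copy_to(sc, mtcars, name = "mtcars", overwrite = TRUE)

# Start H2O inside the Spark application and convert the Spark table
# to an H2OFrame (2019-era rsparkling API; see the note above).
h2o_context(sc)
cars_hf <- as_h2o_frame(sc, cars_tbl)

# Fit a Gaussian GLM with H2O; a linear model keeps the example interpretable.
model <- h2o.glm(
  x = c("wt", "hp", "cyl"),   # illustrative predictors
  y = "mpg",
  training_frame = cars_hf,
  family = "gaussian"
)

# Inspect the fitted coefficients.
print(h2o.coef(model))

spark_disconnect(sc)
```

The data stays in Spark and H2O throughout; only the fitted coefficients are pulled back into the R session.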

This presentation illustrates how to use the rsparkling package to combine innovations from several subdisciplines of machine learning research to train explainable, fair, trustworthy, and accurate predictive modeling systems. Together, these techniques support a human-centered approach to machine learning that is suitable for business- and life-critical decision support.

A code example showcasing the methodology from this talk is available here.