cdsw-simple-serving

Modeling Lifecycle with ACME Occupancy Detection and Cloudera

Primary language: Scala. License: Apache License 2.0 (Apache-2.0).

Data science is more than just modeling. The complete data science lifecycle also includes data engineering and model deployment. This project offers a simplified yet credible example of all three elements, as implemented using Apache Spark, the Cloudera Data Science Workbench, and JPMML / OpenScoring.

In this project, the ACME corporation is productionizing a connected-house platform. Part of this service requires predicting the occupancy of a room given sensor readings.

This example project includes simplified examples of:

  • Data Engineering
    • Ingest
    • Cleaning
  • Data Science
    • Modeling
    • Tuning and evaluation
  • Model Serving
    • Model management
    • Testing
    • REST API
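
To make the three elements concrete, here is a minimal sketch in plain Scala (standard library only, Scala 2.13 or later) of how they might fit together for a single sensor reading. It is not the project's code: the project itself uses Apache Spark and JPMML / OpenScoring, while this sketch assumes a hypothetical two-column CSV input (`light,co2`), hypothetical names (`RawReading`, `Reading`, `OccupancySketch`), and made-up logistic coefficients standing in for a trained model.

```scala
// Hypothetical sketch: all names, fields, and coefficients are illustrative
// assumptions, not taken from the project.

// A raw reading as ingested; a field is None when the cell is missing or unparsable.
final case class RawReading(light: Option[Double], co2: Option[Double])

// A cleaned reading with every field present.
final case class Reading(light: Double, co2: Double)

object OccupancySketch {
  // Data engineering, ingest: parse one CSV line of the assumed form "light,co2".
  def ingest(line: String): RawReading = {
    val cells = line.split(",", -1).map(_.trim)
    def cell(i: Int): Option[Double] =
      if (i < cells.length) cells(i).toDoubleOption else None
    RawReading(cell(0), cell(1))
  }

  // Data engineering, cleaning: keep only complete, non-negative readings.
  def clean(raw: RawReading): Option[Reading] =
    for {
      l <- raw.light
      c <- raw.co2
      if l >= 0 && c >= 0
    } yield Reading(l, c)

  // Data science stand-in: a logistic function with made-up coefficients.
  // A real model would be fitted and evaluated on labeled data.
  def occupancyProbability(r: Reading): Double = {
    val z = -6.0 + 0.01 * r.light + 0.004 * r.co2
    1.0 / (1.0 + math.exp(-z))
  }

  // Model serving stand-in: the function a REST endpoint might wrap.
  // Returns None when the input cannot be cleaned into a valid reading.
  def score(line: String, threshold: Double = 0.5): Option[Boolean] =
    clean(ingest(line)).map(r => occupancyProbability(r) >= threshold)

  def main(args: Array[String]): Unit = {
    val lines = Seq("450.0,900.0", "0.0,420.0", ",500.0")
    lines.foreach(l => println(s"$l -> ${score(l)}"))
  }
}
```

Returning `Option[Boolean]` keeps "no valid reading" distinct from "room not occupied", which is one reasonable way for a serving layer to report bad input without guessing.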

Requirements

Get Started

To continue, review the documentation for each of the three modules. Each module's documentation explains what the module shows and how to run it.
