tedsds

Turbofan Engine Degradation Simulation Data Set example in Apache Spark

Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).

Uses the dataset from [1] to create a demonstration of a machine learning setup for a predictive maintenance scenario for turbofan engines.
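As an illustration of what such a setup usually starts from, the sketch below derives training labels from run-to-failure records: the remaining useful life (RUL) of each row is the engine's last recorded cycle minus the current cycle. It is not taken from this repository's notebooks. It assumes whitespace-separated rows that begin with a unit id and a cycle number, followed by operational settings and sensor readings; the names `parse_row`, `label_rows`, and `horizon`, the binary "fails soon" label, and the sample values are all hypothetical.

```python
from collections import defaultdict


def parse_row(line):
    """Split one whitespace-separated record into (unit, cycle, features).

    Assumed layout: unit id, cycle number, then settings and sensor values.
    """
    fields = line.split()
    return int(fields[0]), int(fields[1]), [float(x) for x in fields[2:]]


def label_rows(lines, horizon=30):
    """Attach RUL and a binary 'fails within horizon cycles' label to each row.

    Assumes every engine in the input is recorded until failure, so the
    largest cycle seen for a unit is taken as its failure time.
    """
    rows = [parse_row(line) for line in lines if line.strip()]

    # Last observed cycle per engine, treated as the failure cycle.
    last_cycle = defaultdict(int)
    for unit, cycle, _ in rows:
        last_cycle[unit] = max(last_cycle[unit], cycle)

    labeled = []
    for unit, cycle, features in rows:
        rul = last_cycle[unit] - cycle
        labeled.append({
            "unit": unit,
            "cycle": cycle,
            "features": features,
            "rul": rul,
            "fails_soon": int(rul <= horizon),
        })
    return labeled


# Made-up sample rows, shortened to four feature columns for readability.
sample = [
    "1 1 0.5 0.1 100.0 518.67",
    "1 2 0.4 0.2 100.0 518.90",
    "1 3 0.6 0.1 100.0 519.20",
    "2 1 0.5 0.3 100.0 518.50",
    "2 2 0.5 0.2 100.0 518.70",
]
labeled = label_rows(sample, horizon=1)
for row in labeled:
    print(row["unit"], row["cycle"], row["rul"], row["fails_soon"])
```

The `rul` value can serve as a regression target and `fails_soon` as a classification target; which of the two fits better depends on how maintenance decisions are made, and the `horizon` value here is arbitrary.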

References:

  1. A. Saxena, K. Goebel, D. Simon, and N. Eklund, "Damage Propagation Modeling for Aircraft Engine Run-to-Failure Simulation", in the Proceedings of the 1st International Conference on Prognostics and Health Management (PHM08), Denver CO, Oct. 2008, retrieved Feb. 2016
  2. NASA Ames Prognostics data repository, retrieved Feb. 2016, http://ti.arc.nasa.gov/tech/dash/pcoe/prognostic-data-repository/
  3. O. F. Eker, F. Camci, and I. K. Jennions, "Major Challenges in Prognostics: Study on Benchmarking Prognostics Datasets", retrieved Feb. 2016
  4. L. Zhang, "Big Data Analytics for eMaintenance: Modeling of high-dimensional data streams", Licentiate thesis, Luleå University of Technology (Luleå tekniska universitet), Luleå, 2015, 46 p., retrieved Feb. 2016
  5. Microsoft Cortana example with the same dataset, retrieved Feb. 2016
  6. H2o.io example with the same dataset, with presentation, retrieved Feb. 2016
  7. S. Ryza, U. Laserson, S. Owen, and J. Wills, "Advanced Analytics with Spark: Patterns for Learning from Data at Scale", with examples
  8. A. P. Bradley, "The use of the area under the ROC curve in the evaluation of machine learning algorithms"
  9. P. Domingos, "A Few Useful Things to Know about Machine Learning"

Spark libraries

  1. https://github.com/databricks/spark-csv

SBT plugins

  1. https://github.com/databricks/sbt-spark-package
  2. https://github.com/sbt/sbt-git