/navyfcu-ml

Notebooks and data for a Machine Learning course.

Primary language: HTML. License: MIT.

Generalized Machine Learning

This repository contains notebooks, data, and slides for the survey of generalized machine learning and distributed computing training held from September 14, 2018 to September 28, 2018. During this three-day course, we will cover the following topics:

Day 1:

  • ML Review: Generalized ML and Spatial Learning, Bias/Variance Tradeoff, Model Selection Triple
  • Regularized Regression: LASSO vs Ridge; ElasticNet and more
  • Clustering: Partitive vs Agglomerative Clustering; clustering evaluation methods, visualization
  • Classification I: Instance and Inductive Models (kNN, Decision Trees, Ensembles of Trees)

Day 2:

  • Classification II: Parametric Models: SVMs, Bayesian Models, Logistic Regression
  • Dimensionality Reduction and Manifolds: PCA, SVD, t-SNE, Isomap
  • Neural Networks I: Multi-Layer Perceptrons
  • Neural Networks II: Deep Learning and TensorFlow

Day 3:

  • Introduction to Spark: RDDs and Architecture
  • Programming Spark: Interactive Analysis and Distributed Jobs
  • Using Spark for data analysis: Spark SQL and Spark DataFrames
  • Spark for distributed ML: Spark MLlib