Completed assignments and coding challenges from the Lambda School Data Science program.

Primary language: Jupyter Notebook. License: MIT.

Lambda School Data Science with Machine Learning

Environment

Most of the work from the first half of the curriculum is done in IPython notebooks run in Google's Colaboratory environment. Saving and working with these notebooks in Google Drive requires no local installation.

For assignments and coding challenges that require local installations, instructions are provided with the specific week or assignment. Most programming is done in Python 3. I used Anaconda to create a separate virtual environment for each new assignment in order to keep my root installation clean.
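Since each assignment may run in a different environment, it can help to confirm which interpreter a notebook or script is actually using before installing anything. The snippet below is a minimal sketch, not part of the course material: the function name `describe_environment` is made up for illustration, and it assumes that an activated conda environment exposes its name through the `CONDA_DEFAULT_ENV` environment variable.

```python
import os
import sys


def describe_environment():
    """Return basic facts about the interpreter running this code.

    Hypothetical helper for illustration; not part of the course material.
    """
    return {
        # Major.minor.patch of the running interpreter.
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
        "is_python3": sys.version_info.major == 3,
        # Directory that holds the active environment's files.
        "prefix": sys.prefix,
        # Assumption: set when a conda environment is activated; None otherwise.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this in the first cell of a notebook shows whether the kernel belongs to the environment created for that assignment or to the root installation.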

Articles Published while at Lambda School

Organization

Lambda School's instruction is divided into 5-day sprints. Each sprint has an overall topic and contains one or more "modules", which are more specific subtopics. On each of the first four days, there is a short code challenge meant to introduce the material, as well as a more in-depth assignment. Occasionally, a day has some other assignment or reading in place of the code challenge.

Each sprint is capped off with a comprehensive Sprint Challenge that covers the breadth of the material, but not in as much depth as the weekly assignments. The material of the Sprint Challenge corresponds directly to the "learning objectives" of each module. The code challenges and assignments for a particular week can be found in the directory for that week.

Animations and Interactivity

GitHub's notebook preview does not display animations that appear in the notebook outputs as JavaScript widgets. To view these and certain other interactive elements, use nbviewer.

http://nbviewer.jupyter.org/
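For a notebook hosted on GitHub, the nbviewer link can be built from the repository path rather than pasted into the nbviewer home page. The following is a minimal sketch that assumes the commonly used `/github/<user>/<repo>/blob/<branch>/<path>` URL layout. The helper name `nbviewer_url`, the `example-user` account, the `master` default branch, and the notebook path are all placeholders, not names taken from this repository.

```python
from urllib.parse import quote

# Assumed base for GitHub-hosted notebooks on nbviewer.
NBVIEWER_BASE = "https://nbviewer.jupyter.org/github"


def nbviewer_url(user, repo, path, branch="master"):
    """Build an nbviewer link for a notebook stored in a GitHub repository.

    Hypothetical helper for illustration only.
    """
    # Percent-encode spaces and similar characters; slashes are left intact.
    safe_path = quote(path.lstrip("/"))
    return f"{NBVIEWER_BASE}/{user}/{repo}/blob/{branch}/{safe_path}"


if __name__ == "__main__":
    # Placeholder account and notebook path.
    print(nbviewer_url("example-user", "LambdaSchoolDataScience",
                       "Week 01/Code Challenge.ipynb"))
```

Opening the printed link in a browser should show the notebook with its JavaScript outputs, provided the repository is public and the path and branch match what is on GitHub.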

Syllabus

Mathematical Foundations

  • Functions and Optima
  • Linear Algebra

Data and Visualization

  • Data Preparation: An Overview
  • Data Visualization

High Dimensionality

  • High Dimensionality
  • Dimensionality Reduction

Presenting

  • Presenting for the Public: LaTeX and d3
  • Building an ML Portfolio

Probability and Statistics

  • Quantitative Data Analysis
  • Graphical Data Analysis
  • Statistical Techniques

Modeling

  • Linear Regression
  • Logistic Regression

The Machine Learning Framework

  • Model Tuning
  • Supervised Learning

Unsupervised Learning

  • Clustering
  • Association Rule Learning
  • Collaborative Filtering

Neural Networks

  • Neural Networks

Computer Vision

  • Computer Vision
  • Deep CNNs

Natural Language Processing

  • Natural Language Processing - Introduction
  • Comparing Documents or Words
  • Sentiment Analysis
  • Accessing and Building Corpuses

Spark

  • Spark

Productizing ML

  • SQL
  • Flask
  • Microsoft Azure ML Studio

Reinforcement Learning

  • Docker
  • Reinforcement Learning with OpenAI Gym
  • Object Detection

Data Structures and Algorithms

  • Data Structures
  • Algorithms Overview

Graphs

  • Introduction to Graphs and Bokeh
  • Connected Components
  • Search

Intro to C and Operating Systems

  • Introduction to C
  • Operating Systems