DS 1.1: Data Analysis & Visualization

Course Description

In this course, students learn the foundational skills of data science: collecting, cleaning, analyzing, and visualizing data with modern tools and libraries. Students build a strong grounding in statistical concepts, apply statistical techniques, and practice the science and art of data exploration and visualization in order to tell stories and persuade decision makers with data-driven insights.

Prerequisites:

CS 1.2

Learning Outcomes

By the end of this course, students will be able to...

  1. Conduct data manipulation and visualization
  2. Decide when to reject, or fail to reject, a null hypothesis
  3. Apply descriptive statistics, probability, and other forms of data analysis techniques
  4. Describe and implement a plan for finding and dealing with problems in a dataset such as null values and outliers
  5. Perform statistical analysis on data collections using a variety of methods

Schedule

Course Dates: Tuesday, January 21 – Thursday, March 5, 2020 (7 weeks)

Class Times: Tuesday and Thursday at 3:30–5:20pm (14 class sessions)

Class  Date               Topics
1      Tue, January 21    Introduction to Data Science
2      Thu, January 23    Simple Data Manipulation
3      Tue, January 28    Data Manipulation & Visualization
4      Thu, January 30    How to Combine DataFrames
5      Tue, February 4    Applied Descriptive Statistics
6      Thu, February 6    Applying Probability to DataFrames
7      Tue, February 11   [NPS Project Data Wrangling Check-in]
8      Thu, February 13   PDFs, CDFs, and Normal Distributions
9      Tue, February 18   Hypothesis Testing & Acceptable Error
10     Thu, February 20   Confidence Intervals, Outliers, and Statistical Analysis
11     Tue, February 25   Time Series Data & Applications
12     Thu, February 27   [Lesson 12]
13     Tue, March 3       Final Exam
14     Thu, March 5       Presentations

Assignment Schedule

Assignment                            Date Assigned      Due Date           Submission Form
Midterm Project - NPS Data Analysis   Thu, January 30    Tue, February 11   Submit Assignment
Homework 1 - Histogram                Thu, February 13   Thu, February 20   Submit Assignment

Class Assignments

  • Implement a dataset-processing task using NumPy only, and then again using pandas
  • Write a function that calculates the conditional probability for two arbitrary attributes and an arbitrary condition
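
As a rough illustration of the second assignment, here is a minimal sketch of a conditional-probability function. It is not the course's reference solution. It assumes each record is a plain Python dict and estimates P(event | condition) as a relative frequency, using only the standard library so that it runs without NumPy or pandas. The function name `conditional_probability`, the `passengers` records, and the attribute names `sex` and `survived` are all hypothetical.

```python
from typing import Any, Callable, Iterable, Mapping

Row = Mapping[str, Any]


def conditional_probability(
    rows: Iterable[Row],
    event: Callable[[Row], bool],
    condition: Callable[[Row], bool],
) -> float:
    """Estimate P(event | condition) as a relative frequency over rows."""
    # Keep only the rows where the condition holds.
    given = [row for row in rows if condition(row)]
    if not given:
        raise ValueError("condition matches no rows; probability is undefined")
    # Among those rows, count how many also satisfy the event.
    hits = sum(1 for row in given if event(row))
    return hits / len(given)


# Hypothetical toy records; the attribute names are made up for illustration.
passengers = [
    {"sex": "female", "survived": 1},
    {"sex": "female", "survived": 1},
    {"sex": "female", "survived": 0},
    {"sex": "male", "survived": 0},
    {"sex": "male", "survived": 1},
]

p = conditional_probability(
    passengers,
    event=lambda r: r["survived"] == 1,
    condition=lambda r: r["sex"] == "female",
)
print(round(p, 3))  # prints 0.667
```

Passing the event and the condition as predicates keeps both attributes and the condition arbitrary. With pandas, the same idea would typically be expressed with boolean masks over DataFrame columns.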

Tutorials

Students will complete the following guided tutorials in this course:

Projects

Students will complete the following self-guided projects in this course:

Evaluation

To pass this course you must meet the following requirements:

  • Complete all in-class activities and one homework assignment
  • Finish all required tutorials and two projects
  • Pass the final exam (summative assessment). The final exam covers the following topics:
    • The null hypothesis and the steps for deciding whether to reject it
    • Statistical terms and their meanings, such as Z-distribution, CDF, SF, ...
    • Histograms and density estimation
    • Outlier detection
    • Correlation
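
To make a few of these exam terms concrete, the sketch below relates the Z-distribution, the CDF, and the SF (survival function, 1 - CDF) to a two-sided test decision. It is an illustration under stated assumptions, not exam material: it uses the standard library's `statistics.NormalDist` (Python 3.8+), and the observed z-score of 2.5, the significance level of 0.05, and the helper names `sf` and `two_sided_p_value` are hypothetical.

```python
from statistics import NormalDist

# The Z-distribution is the normal distribution with mean 0 and std dev 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def sf(z: float) -> float:
    """Survival function: P(Z > z) = 1 - CDF(z)."""
    return 1.0 - std_normal.cdf(z)


def two_sided_p_value(z: float) -> float:
    """Probability of a z-score at least this extreme in either tail."""
    return 2.0 * sf(abs(z))


z_observed = 2.5  # hypothetical test statistic
alpha = 0.05      # hypothetical significance level (acceptable error)

p_value = two_sided_p_value(z_observed)
reject_null = p_value < alpha

print(f"CDF(z) = {std_normal.cdf(z_observed):.4f}")
print(f"SF(z)  = {sf(z_observed):.4f}")
print(f"p-value = {p_value:.4f}, reject null: {reject_null}")
```

When the p-value is below the chosen significance level, the null hypothesis is rejected; otherwise we fail to reject it.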

Make School Course Policies