DS 1.1: Data Analysis & Visualization

Course Description

In this course, students learn the foundational skills of data science, including data collection, scrubbing, analysis, and visualization with modern tools and libraries. Students gain a strong grounding in statistical concepts, apply statistical techniques, and master the science and art of data exploration and visualization to tell stories and persuade decision makers with data-driven insights.

Prerequisites:

CS 1.2

Learning Outcomes

By the end of this course, students will be able to...

  1. Conduct data manipulation and visualization
  2. Understand when to reject or fail to reject a null hypothesis
  3. Apply descriptive statistics, probability, and other forms of data analysis techniques
  4. Describe and implement a plan for finding and dealing with problems in a dataset such as null values and outliers
  5. Perform statistical analysis on data collections using a variety of methods
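
As a rough illustration of outcome 4, the sketch below finds missing values and outliers in a small pandas Series. The data is made up, the variable names are hypothetical, and filling with the median plus the 1.5 * IQR rule is only one common plan among several, not necessarily the one taught in class.

```python
import pandas as pd

# Made-up readings for illustration; None marks a missing value.
readings = pd.Series([12.0, 14.0, None, 13.0, 15.0, 98.0, 14.0, None, 13.0])

# 1. Find the problem: count the missing entries.
n_missing = int(readings.isna().sum())

# 2. Deal with it: fill gaps with the median, which is robust to extreme values.
filled = readings.fillna(readings.median())

# 3. Flag outliers with the 1.5 * IQR rule (an assumed, commonly used threshold).
q1, q3 = filled.quantile(0.25), filled.quantile(0.75)
iqr = q3 - q1
is_outlier = (filled < q1 - 1.5 * iqr) | (filled > q3 + 1.5 * iqr)
outliers = filled[is_outlier].tolist()

print(n_missing, outliers)
```

Whether a flagged value should be dropped, capped, or kept depends on the dataset and the question being asked.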

Schedule

Course Dates: Tuesday, March 31 – Thursday, May 14, 2020 (7 weeks)

Class Times: Tuesday and Thursday at 9:30am to 12:15pm (14 class sessions)

Class  Date           Topics
1      Tue, March 31  Introduction to Data Science
2      Thu, April 2   Simple Data Manipulation
3      Tue, April 7   Data Manipulation & Visualization
4      Thu, April 9   How to Combine DataFrames
5      Tue, April 14  Applied Descriptive Statistics
6      Thu, April 16  Applied Probability with DataFrames
7      Tue, April 21  [NPS Project Data Wrangling Check-in]
8      Thu, April 23  PDFs, CDFs, and Normal Distributions
9      Tue, April 28  Hypothesis Testing & Acceptable Error
10     Thu, April 30  Confidence Intervals, Outliers, and Statistical Analysis
11     Tue, May 5     Time Series Data & Applications
12     Thu, May 7     [Lesson 12]
13     Tue, May 12    Final Exam
14     Thu, May 14    Presentations

Assignment Schedule

Assignment                           Date Assigned  Due Date       Submission Form
Midterm Project - NPS Data Analysis  Tue, April 7   Tue, April 21  Submit Assignment
Homework 1 - Histogram               Thu, April 23  Thu, April 30  Submit Assignment

Class Assignments

  • Implement a dataset-processing task using only NumPy, then again using pandas
  • Write a function that calculates the conditional probability for two arbitrary attributes and an arbitrary condition
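
One possible shape for the conditional probability assignment is sketched below. The function name `conditional_probability`, its parameters, and the tiny `passengers` dataset are all hypothetical; the sketch simply estimates P(target = value | condition on another attribute) from row counts in a pandas DataFrame.

```python
import pandas as pd


def conditional_probability(df, target, target_value, given, condition):
    """Estimate P(target == target_value | condition(given)) from row counts."""
    # Keep only the rows where the conditioning attribute satisfies the condition.
    mask = df[given].apply(condition)
    subset = df[mask]
    if subset.empty:
        raise ValueError("no rows satisfy the condition")
    # Fraction of the remaining rows where the target has the requested value.
    return (subset[target] == target_value).mean()


# Tiny made-up dataset, for illustration only.
passengers = pd.DataFrame({
    "age": [22, 38, 26, 35, 54, 2, 27, 14],
    "survived": [0, 1, 1, 1, 0, 0, 1, 1],
})

# P(survived = 1 | age < 30)
p = conditional_probability(passengers, "survived", 1, "age", lambda a: a < 30)
print(p)
```

Passing the condition as a callable keeps the function independent of any particular column or threshold, which is one way to read "arbitrary condition" in the assignment.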

Tutorials

Students will complete the following guided tutorials in this course:

Projects

Students will complete the following self-guided projects in this course:

Evaluation

To pass this course you must meet the following requirements:

  • Complete all in-class activities and one homework assignment
  • Finish all required tutorials and two projects
  • Pass the final exam (summative assessment). The final exam covers the following topics:
    • The null hypothesis and the steps to reject or fail to reject it
    • Statistical terms and their meanings, such as Z-distribution, CDF, SF, ...
    • Histogram, density estimations
    • Outlier detection
    • Correlation
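
To make a few of the exam terms concrete, here is a minimal sketch using only Python's standard library. It assumes a two-sided one-sample z-test with a known population standard deviation; the helper names (`cdf`, `sf`, `z_test_two_sided`) and the numbers are made up for illustration. SciPy's `scipy.stats.norm` provides equivalent `cdf` and `sf` methods.

```python
from statistics import NormalDist

# The Z-distribution is the normal distribution with mean 0 and standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def cdf(z):
    """CDF: probability that a standard normal value is <= z."""
    return std_normal.cdf(z)


def sf(z):
    """SF (survival function): probability that the value is > z, i.e. 1 - CDF."""
    return 1.0 - std_normal.cdf(z)


def z_test_two_sided(sample_mean, pop_mean, pop_std, n, alpha=0.05):
    """Return (z, p_value, reject) for H0: the true mean equals pop_mean."""
    z = (sample_mean - pop_mean) / (pop_std / n ** 0.5)
    p_value = 2.0 * sf(abs(z))
    # Reject H0 when the p-value is below alpha; otherwise fail to reject it.
    return z, p_value, p_value < alpha


# Made-up numbers: 100 observations averaging 103 against a claimed mean of 100.
z, p_value, reject = z_test_two_sided(sample_mean=103.0, pop_mean=100.0,
                                      pop_std=15.0, n=100)
print(z, p_value, reject)
```

Here the test statistic is z = 2.0, so the p-value falls just under 0.05 and the null hypothesis is rejected at that significance level.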

Make School Course Policies