ds-class-intro

Welcome to the intro-level DS class, where we will learn Python basics and how to use Python for exploratory data analysis. I hope you enjoy the class and learn something from it.

Note: If you get the following error while cloning from or pushing to GitHub, set up a personal access token (PAT) by following these instructions.

Error message: remote: Support for password authentication was removed on August 13, 2021. Please use a personal access token instead. remote: Please see https://github.blog/2020-12-15-token-authentication-requirements-for-git-operations/ for more information. fatal: unable to access 'https://github.com/emma-data-works/ds-class-intro.git/': The requested URL returned error: 403

0. Get started

If you can, try to go through the following reading and set up your local environment before class:

1. Python basics

You can run Python in several ways: use a Jupyter notebook for interactive exploration, start the interpreter by typing `python` in a terminal (a `>>>` prompt appears), or run a script from the command line with `python <your_script>.py`. We will use notebooks in this class because they are easy to follow with Markdown and easy to interact with.
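
As a rough illustration of the three options above, the same small piece of code can be pasted into a notebook cell, typed at the `>>>` prompt, or saved and run as a script. The file name `hello.py` and the function `greeting` are made up for this sketch and are not part of the class material.

```python
# hello.py (hypothetical file name, used only for this illustration)

def greeting(name):
    """Return a short greeting for the given name."""
    return f"Hello, {name}!"


# In a notebook cell or at the >>> prompt you can call greeting("class") directly.
# The block below runs only when the file is executed as a script,
# for example with: python hello.py
if __name__ == "__main__":
    print(greeting("class"))
```

In a notebook or the interpreter, the value of `greeting("class")` is displayed right away, while a script needs an explicit `print` to show it.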

class01:

0. Environment set up (material in section 0)
1. Assign values to variables and simple arithmetic
2. `print` and simple string manipulation
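
A small sketch of what topics 1 and 2 look like in practice. The variable names and values are invented for illustration and do not come from the class exercises.

```python
# Assign values to variables
apples = 7
price_per_apple = 0.5

# Simple arithmetic
total_cost = apples * price_per_apple   # multiplication
pairs = apples // 2                     # floor division: whole pairs of apples
left_over = apples % 2                  # remainder after making pairs

# print and simple string manipulation
course = "ds-class-intro"
title = course.replace("-", " ").upper()
print(f"{title}: {apples} apples cost {total_cost}")
print("pairs:", pairs, "left over:", left_over)
```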

class02:

3. Value comparison and conditions using `if-elif-else`
4. Collections: list, tuple, set, and dictionary
*  Git - Committing, Pushing, and Pull Requests
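
For topics 3 and 4, a minimal sketch of `if-elif-else` and the four built-in collections might look like this. The function `grade`, its thresholds, and the sample data are assumptions made for the example.

```python
def grade(score):
    """Map a numeric score to a letter using if-elif-else (thresholds are made up)."""
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    else:
        return "C or below"


scores = [95, 82, 67]             # list: ordered and mutable
point = (3, 4)                    # tuple: ordered and immutable
unique = {1, 2, 2, 3}             # set: duplicates are dropped
ages = {"ana": 31, "bo": 25}      # dictionary: maps keys to values

scores.append(88)                 # lists can grow in place
ages["cy"] = 40                   # add a new key-value pair

print(grade(scores[0]), grade(scores[1]), grade(scores[2]))
print(len(unique), point[0], ages["cy"])
```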

Homework_01 (Exercises 0, 3, 4) is due next class. Please refer to the homework submission instructions for how to open a pull request for your submission.

class03:

5. Iteration: loops and comprehensions
*  HW01 review [delayed]
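
To give a flavor of topic 5, here is a hedged sketch showing the same filtering task written as a `for` loop and as a list comprehension, plus a dictionary comprehension. The sample data is invented for the example.

```python
numbers = [1, 2, 3, 4, 5, 6]

# for loop: collect the squares of the even numbers
squares_loop = []
for n in numbers:
    if n % 2 == 0:
        squares_loop.append(n ** 2)

# list comprehension: the same result in one expression
squares_comp = [n ** 2 for n in numbers if n % 2 == 0]

# dictionary comprehension: map each word to its length
lengths = {word: len(word) for word in ["list", "tuple", "set"]}

print(squares_loop, squares_comp, lengths)
```

Comprehensions are usually preferred for short transformations like this one, while an explicit loop tends to read better once the body needs several steps.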

class04:

6. Writing functions
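
A minimal sketch of topic 6: a function with a docstring, a default argument, and a return value. The function `mean` and its `default` parameter are illustrative names, not taken from the homework.

```python
def mean(values, default=0.0):
    """Return the arithmetic mean of values, or default if values is empty."""
    if not values:
        return default
    return sum(values) / len(values)


print(mean([1, 2, 3]))            # positional argument
print(mean([], default=None))     # keyword argument overrides the default
```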

Homework_02 (Exercises 5, 6) is assigned. It is due next Wednesday, 8/11, but we'll start the discussion and review on 8/8. A sample answer is posted for reference.

class05:

7. Reading and writing files
8. Intro to code complexity and performance (part1)
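
The sketch below touches both topics 7 and 8 under some assumptions: it writes a few lines to a temporary file (the name `notes.txt` is made up), reads them back, and then contrasts membership tests on a list and on a set.

```python
import os
import tempfile

lines = ["alpha", "beta", "gamma"]

# Write and read a text file inside a temporary directory,
# so the example leaves nothing behind on disk.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "notes.txt")

    with open(path, "w", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")

    with open(path, "r", encoding="utf-8") as f:
        read_back = [row.rstrip("\n") for row in f]

# Complexity: "in" scans a list item by item (O(n)),
# while a set uses hashing (O(1) on average).
words_list = read_back
words_set = set(read_back)
found_in_list = "gamma" in words_list
found_in_set = "gamma" in words_set

print(read_back, found_in_list, found_in_set)
```

Both membership tests give the same answer. The difference only becomes noticeable with large collections, where the set lookup stays fast as the data grows.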

A sample answer for Exercise 7 is posted for reference.

class06:

9. Coding challenge example using HackerRank and LeetCode
10. [Material Provided] Object-oriented programming
11. A/B testing discussion
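
For topic 10, a very small class gives an idea of what object-oriented code looks like. The `Student` class and its methods are hypothetical and are not taken from the provided material.

```python
class Student:
    """A student with a name and a list of scores (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.scores = []

    def add_score(self, score):
        """Record one more score for this student."""
        self.scores.append(score)

    def average(self):
        """Return the mean score, or 0.0 if no scores were recorded."""
        if not self.scores:
            return 0.0
        return sum(self.scores) / len(self.scores)


student = Student("ana")
student.add_score(90)
student.add_score(80)
print(student.name, student.average())
```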

class07: Beginning of pandas for data analysis

1. Data exploration: Intro to `pandas` 
2. Data wrangling basics
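
As a hedged preview of the `pandas` topics, the sketch below builds a tiny DataFrame, inspects it, fills a missing value, and aggregates by group. The column names and numbers are toy data invented for the example, and it assumes `pandas` is installed in your environment.

```python
import pandas as pd

# Toy data: one sales figure is missing on purpose
df = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "sales": [100.0, None, 80.0, 120.0],
})

# Data exploration: look at the first rows and count missing values
print(df.head())
missing = int(df["sales"].isna().sum())

# Data wrangling: replace the missing value, then total sales per store
df["sales"] = df["sales"].fillna(0.0)
totals = df.groupby("store")["sales"].sum()

print(missing)
print(totals)
```

Filling with zero is only one possible choice. Whether to fill, drop, or flag missing values depends on what the data represents.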

class08:

2. Data wrangling basics
3. Using `pandas` for exploratory data analysis

Homework_03 (all exercises in the pandas section) is assigned. It is due before the final class, and a sample answer will be posted before the final class.

class09:

3. Using `pandas` for exploratory data analysis
4. Plotting in Python

class10: (the dataset has been uploaded for preview)

5. [Optional, depending on time] Advanced EDA topics
6. Mock take-home case study

Ref:

https://github.com/tdpetrou/Minimally-Sufficient-Pandas

https://github.com/cmawer/pycon-2017-eda-tutorial/blob/master/EDA-cheat-sheet.md

https://github.com/Tian-Su/Walmart_MI_ML_interview_campus