DSCI 511: Python Programming for Data Science

Program design and data manipulation with Python. Overview of data structures, iteration, flow control, program design, and using libraries for data exploration and analysis.

Course Learning Outcomes

By the end of the course, students are expected to:

  1. Translate fundamental programming concepts, such as loops and conditionals, into Python code.
  2. Understand the key data structures in Python.
  3. Understand how to write functions in Python and assess if they are correct via unit testing.
  4. Know when and how to abstract code (e.g., into functions, or classes) to make it more modular and robust.
  5. Produce human-readable code that incorporates best practices of programming, documentation, and coding style.
  6. Use NumPy to perform common data wrangling and computational tasks in Python.
  7. Use Pandas to create and manipulate data structures like Series and DataFrames.
  8. Wrangle different types of data in Pandas including numeric data, strings, and datetimes.

Specific learning objectives can be found in the Lecture Learning Objectives document.
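
As a rough illustration of outcomes 1, 3, and 5 above, the sketch below combines a loop and a conditional inside a documented function and checks it with a few assert-based unit tests. The function name `count_evens` and its test cases are hypothetical and chosen only for illustration; they are not taken from the course materials.

```python
def count_evens(numbers):
    """Count the even integers in an iterable.

    Parameters
    ----------
    numbers : iterable of int
        The values to inspect.

    Returns
    -------
    int
        How many of the values are divisible by 2.
    """
    count = 0
    for n in numbers:       # loop over each value
        if n % 2 == 0:      # conditional: keep only even values
            count += 1
    return count


def test_count_evens():
    """Minimal unit tests written as plain assert statements."""
    assert count_evens([1, 2, 3, 4]) == 2
    assert count_evens([]) == 0          # empty input
    assert count_evens([1, 3, 5]) == 0   # no even values
    assert count_evens([-2, 0]) == 2     # negatives and zero count as even


# Hypothetical usage: run the tests directly; they pass silently on success.
test_count_evens()
```

Plain `assert` statements are used here to keep the example dependency-free; a test runner such as pytest could collect the same `test_count_evens` function unchanged.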

Lectures

The table below shows the general lecture outline; see the Lecture Learning Objectives document for lecture-specific learning objectives.

| Lecture | Topic | Optional Pre-readings | Practice exercises |
|---|---|---|---|
| 1 | Basics | WTP: Section 3 - Section 7 | |
| 2 | Loops & Functions | WTP: Section 8 - Section 13; PEP 257: Docstrings; NumPy docstring examples | |
| 3 | Unit Tests & Classes | Python documentation: 9. Classes; Think Python: "Classes and objects", "Classes and functions", "Classes and methods" | |
| 4 | Style Guides, Scripts, Imports | PEP 8: Style Guide; Getting Started with Python in VS Code up to "Run Hello World"; Python documentation: 5. The import system | |
| 5 | Introduction to NumPy | PDSH: Introduction to NumPy; NumPy documentation: Quickstart tutorial | |
| 6 | Introduction to Pandas | PDSH: Data Manipulation with Pandas up to "Operating on Data in Pandas"; Pandas documentation: 10 minutes to pandas, up to "Selection" | |
| 7 | Basic Data Wrangling with Pandas | PDSH: Data Manipulation with Pandas; Pandas documentation: 10 minutes to pandas | |
| 8 | Advanced Data Wrangling with Pandas | PDSH: Data Manipulation with Pandas; Pandas documentation: 10 minutes to pandas | |

Labs and Quizzes

You are responsible for the following deliverables, which will determine your course grade:

| Assessment | Weight | Due Date | Location |
|---|---|---|---|
| Lab Assignment 1 | 15% | Sunday, Sept 13 at 18:00 | Submit to GitHub & Canvas |
| Lab Assignment 2 | 15% | Saturday, Sept 19 at 18:00 | Submit to GitHub & Canvas |
| Quiz 1 | 20% | Tuesday, Sept 22 at 14:00 | Online |
| Lab Assignment 3 | 15% | Saturday, Sept 26 at 18:00 | Submit to GitHub & Canvas |
| Lab Assignment 4 | 15% | Saturday, Oct 3 at 18:00 | Submit to GitHub & Canvas |
| Quiz 2 | 20% | Tuesday, Oct 6 at 10:00 | Online |

Quizzes are held in weeks 3 and 5. They are open book and typically 30 minutes long, with a focus on short-answer questions. More information on quizzes will be provided closer to their dates.

Optional Additional Reference/Learning Materials