Merage School of Business MSBA Python Workshop, July 2020

Overview

MSBA Introduction to Python

This workshop is a collaboration between the Orange County R Users Group (OCRUG) and the Master of Science in Business Analytics (MSBA) program at the UCI Paul Merage School of Business.

The workshop presents a basic introduction to Python, covering the fundamentals of Python programming and practical data science skills using the pandas library.

Date: July 30 & 31, 2020

Time: 5 PM to 8 PM, Pacific

Location: Online (Zoom)

Course Logistics

  • 3 hours each day, 2 days total
  • Divided into 40 min sessions
    • 15 min instruction
    • 15 min practice in breakout rooms
    • Teaching assistants will be assigned to breakout rooms
    • 10 min review & questions
  • 4 sessions per day (160 min for sessions + 10 min break + 10 min wrap-up), 8 sessions total across both days

Schedule

The two-day workshop will be presented as 4 sessions each day, where each session is roughly divided into 15 minutes of instruction, 15 minutes of practice and exercises, and 10 minutes of review (~40 min per session).

Thursday July 30: Python Fundamentals

The focus will be on Python as a language, drawing from the Python docs: https://docs.python.org/3/

  • Session 1: Using Jupyter Notebooks
    • What is a notebook, and why is it useful?
    • Jupyter interface
    • Working with cells (creating, executing, cell types, etc)
    • Tips and best practices
  • Session 2: Review of Python Fundamentals
    • Importance of spacing
    • Expressions and variables
    • Math operations
    • Data types (numbers, strings, boolean)
    • Lists
  • Session 3: Control Flows
    • Conditional statements
    • Loops
  • Session 4: Functions
    • What are they and why are they important
    • Function syntax
    • How to write your own functions
    • Tips and best practices
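
The topics in Sessions 2 through 4 can be sketched together in a few lines. This is a minimal illustration, not the workshop's actual material: the variable names, the `grade` function, and the sample values are all hypothetical.

```python
# Session 2: expressions, variables, and basic data types
# (all names and values here are invented for illustration)
price = 19.99          # float
quantity = 3           # int
label = "widget"       # str
in_stock = True        # bool
total = price * quantity   # a math operation stored in a variable

# Session 2: a list holds an ordered collection of values
scores = [72, 88, 95, 61]


# Session 4: a function groups reusable logic behind a name
def grade(score, passing=70):
    """Return "pass" if score meets the passing threshold, else "fail"."""
    # Session 3: a conditional statement; indentation defines the block
    if score >= passing:
        return "pass"
    return "fail"


# Session 3: a loop visits each list element in turn
results = []
for s in scores:
    results.append(grade(s))

print(results)
```

The spacing matters here: the indented lines under `def`, `if`, and `for` are what Python treats as the body of each statement.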

Friday July 31: Applied Data Science Fundamentals with pandas

The focus will be on pandas as the entry point to data-science-specific tasks, drawing from the getting started tutorials: https://pandas.pydata.org/docs/getting_started/intro_tutorials/index.html

  • Session 5: Introduction to pandas
    • Why tabular data tables are useful for data science (compare to Excel)
    • Series and DataFrames
    • How to create Series and DataFrames
    • How to read Series and DataFrames from files
  • Session 6: Subsetting DataFrames
    • Selecting columns
    • Filtering rows
    • The various ways of indexing DataFrames (by labels, slices, conditional expressions), loc and iloc
  • Session 7: Reshaping and Merging DataFrames
    • Wide vs. long formats and converting between the two: pivot and melt
    • Grouped summaries, groupby
    • Concatenating tables by column and row: concat
    • Joining data tables: merge
  • Session 8: Data Visualization with pandas
    • Basic plotting from pandas: plot, scatter, box, hist, etc
    • Examples of more complex plots, coloring and grouping by variables
    • Tuning plot parameters (sizes, colors, layouts)
    • Saving plots (e.g. to use in presentations, etc)
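
The pandas operations named in Sessions 5 through 7 can be sketched on a small table. This is a minimal illustration under stated assumptions: the `sales` and `managers` tables and every value in them are invented, and plotting is left out of the block because it additionally requires matplotlib.

```python
import pandas as pd

# Session 5: build a DataFrame from a dict (hypothetical sales data)
sales = pd.DataFrame({
    "region": ["West", "East", "West", "East"],
    "quarter": ["Q1", "Q1", "Q2", "Q2"],
    "revenue": [100, 80, 120, 90],
})

# Session 6: select a column, filter rows, index by label and by position
revenue = sales["revenue"]                    # a single column is a Series
west = sales[sales["region"] == "West"]       # conditional (boolean) filter
first_by_label = sales.loc[0, "revenue"]      # loc: row label, column label
first_by_position = sales.iloc[0, 2]          # iloc: row and column positions

# Session 7: long -> wide with pivot, then wide -> long with melt
wide = sales.pivot(index="region", columns="quarter", values="revenue")
long = wide.reset_index().melt(
    id_vars="region", var_name="quarter", value_name="revenue"
)

# Session 7: grouped summary with groupby
totals = sales.groupby("region")["revenue"].sum()

# Session 7: stack tables by row with concat
more = pd.DataFrame({"region": ["North"], "quarter": ["Q1"], "revenue": [50]})
combined = pd.concat([sales, more], ignore_index=True)

# Session 7: join tables on a shared key with merge
managers = pd.DataFrame({"region": ["West", "East"], "manager": ["Ana", "Ben"]})
joined = sales.merge(managers, on="region", how="left")

print(wide)
print(totals)
```

For Session 8, with matplotlib installed, a call such as `sales.plot.bar(x="region", y="revenue")` returns a matplotlib Axes, and `.get_figure().savefig("revenue.png")` on that result writes the chart to a file; the file name is only an example.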