
Lab materials for the University of Washington course Data Science for Biologists (Winter 2019, BIOL 419/519)

Data Science for Biologists

License: MIT

Summary

This repository contains all lab materials for the University of Washington course Data Science for Biologists (Winter 2019, BIOL 419/519). Please feel free to use them for any purpose.

Course design and lecture material (not included here) by Bingni Brunton and Kameron Harris. Lab materials by Eleanor Lutz, with helpful suggestions from Bing and Kam.

This 2019 course used default package versions downloaded with Anaconda 5.3.0: Pandas 0.23.4, Matplotlib 3.0.2, Numpy 1.15.4, and Scikit-Learn 0.20.2.
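
Because the notebooks were written against these specific versions, it can help to compare them with whatever is installed locally before running a lab. The following is a minimal sketch, not part of the course materials; the helper name `compare_versions` is hypothetical, and it only reports differences rather than enforcing them.

```python
import importlib

# Versions the course used (Anaconda 5.3.0 defaults, as listed above).
COURSE_VERSIONS = {
    "pandas": "0.23.4",
    "matplotlib": "3.0.2",
    "numpy": "1.15.4",
    "sklearn": "0.20.2",  # "sklearn" is the import name for scikit-learn
}


def compare_versions(expected=COURSE_VERSIONS):
    """Return {module_name: (installed_version_or_None, course_version)}.

    Hypothetical helper for illustration; a package that cannot be
    imported is reported with an installed version of None.
    """
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
            installed = getattr(module, "__version__", None)
        except ImportError:
            installed = None
        report[name] = (installed, wanted)
    return report


if __name__ == "__main__":
    for name, (installed, wanted) in compare_versions().items():
        print(f"{name}: installed={installed}, course used {wanted}")
```

Newer versions will often run the labs unchanged, but a mismatch is a reasonable first thing to check if a notebook cell raises an unexpected error.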

Helpful resources referenced throughout the course

Lab content

These labs were designed for students with no prior programming experience. Many sections of the code are intentionally written inefficiently because, at that point in the course, students had not yet learned more advanced concepts (for loops, libraries, etc.). A brief description of the skills and topics covered in each lab is included below. Each lab is provided as a Jupyter Notebook, both with and without answers, and as PDF files (in the folders PDF_Lab_Keys and PDF_Labs).

Lab 1

  • Python data types
  • Conditional logic in Python
  • Looping over data in Python
  • Introduction to libraries

Lab 2

  • Numpy arrays
  • Importing data from a file into a Numpy array
  • Examining and plotting data in a Numpy array

Lab 3

  • Matrix algebra by hand (10-minute class exercise, not included)
  • Matrix algebra in Python
  • Functions
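
As a rough illustration of how the Lab 3 topics fit together, the sketch below wraps a small piece of matrix algebra in a function using Numpy. It is not taken from the lab notebook; the function name `solve_linear_system` and the example matrix are assumptions chosen for illustration.

```python
import numpy as np


def solve_linear_system(A, b):
    """Solve A x = b for x (illustrative helper, not from the lab)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.linalg.solve(A, b)


# Example system (made-up numbers):
#   2x +  y = 3
#    x + 3y = 5
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

x = solve_linear_system(A, b)

# Matrix-vector multiplication with the @ operator should recover b.
reconstructed = A @ x
print("solution:", x)
print("A @ x:", reconstructed)
```

Multiplying `A` by the solution and comparing the result with `b` is the same check one would do by hand in the class exercise.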

Lab 4

  • Reading in data using the Pandas library
  • Review of linear regression
  • Plotting in three dimensions

Lab 5

  • Inspecting and cleaning data in Pandas
  • Working with figure objects in Matplotlib
  • Joining two Pandas dataframes
  • Plots with multiple subplots
  • Plotting scatterpoints colored by group
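
The cleaning and joining steps listed for Lab 5 might look something like the sketch below. The two small dataframes, their column names, and the choice of an inner join are all made up for illustration; the lab itself uses its own data files.

```python
import pandas as pd

# Hypothetical measurements, with one missing value to clean up.
measurements = pd.DataFrame({
    "sample_id": ["s1", "s2", "s3", "s4"],
    "wing_length_mm": [2.9, 3.1, None, 3.4],
})

# Hypothetical metadata describing which group each sample belongs to.
metadata = pd.DataFrame({
    "sample_id": ["s1", "s2", "s3", "s5"],
    "group": ["control", "treated", "treated", "control"],
})

# Inspect, then drop rows where the measurement is missing.
print(measurements.isnull().sum())
cleaned = measurements.dropna(subset=["wing_length_mm"])

# Join on the shared key; an inner join keeps only samples found in both.
merged = pd.merge(cleaned, metadata, on="sample_id", how="inner")
print(merged)
```

An inner join silently drops unmatched rows (here `s4` and `s5`), so comparing row counts before and after the merge is a useful habit.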

Lab 6

  • Review of importing and inspecting data
  • Splitting a dataset into training and test sets
  • Training a machine learning classifier using scikit-learn
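
For readers who want a sense of the Lab 6 workflow before opening the notebook, here is a minimal sketch. It assumes the iris dataset bundled with scikit-learn and a k-nearest-neighbors classifier; the dataset and classifier actually used in the lab may differ.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: iris stands in for the lab's dataset.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the samples as a test set; stratify keeps the
# class proportions similar in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Assumption: k-nearest neighbors is used purely as an example classifier.
classifier = KNeighborsClassifier(n_neighbors=5)
classifier.fit(X_train, y_train)

# Accuracy is measured on data the classifier never saw during training.
accuracy = classifier.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Fixing `random_state` makes the split reproducible, which is convenient when comparing answers across students.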

Lab 7

  • K-means clustering using scikit-learn
  • Custom Matplotlib legends
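
The two Lab 7 topics combine naturally: cluster the data, then build a legend by hand so each cluster gets a labeled entry. The sketch below uses synthetic points and hypothetical color choices; it is one possible approach, not the lab's own solution.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.lines import Line2D
from sklearn.cluster import KMeans

# Synthetic data: three well-separated blobs (made up for illustration).
rng = np.random.RandomState(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

# Fit k-means and get a cluster label for every point.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(points)

# Color each point by its assigned cluster.
colors = ["tab:blue", "tab:orange", "tab:green"]
fig, ax = plt.subplots()
ax.scatter(points[:, 0], points[:, 1], c=[colors[k] for k in labels], s=15)

# Custom legend: one hand-built marker per cluster.
handles = [
    Line2D([], [], marker="o", linestyle="", color=colors[k], label=f"Cluster {k}")
    for k in range(3)
]
legend = ax.legend(handles=handles, title="K-means cluster")
```

In a Jupyter Notebook the `matplotlib.use("Agg")` line can be dropped, and the figure displayed with `plt.show()` or saved with `fig.savefig(...)`.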