pandas groupby-apply practices + data reshaping + intro to databases
This simple repository contains the code and data used in Week 08 of the 2023 edition of DS105A.
If you want to replicate the analysis in this notebook, you will need to:
- Clone this repository to your computer.
- Add it to your VS Code workspace.
- Go to the IMDb Non-Commercial Datasets page, download all the tsv.gz files from there, and place them under the data/raw/ folder. This folder is gitignored because we don't want to push large data files to GitHub!
- Run: pip install -r requirements.txt
- Open the notebook and run the cells!
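
Before running the notebook, it can help to confirm that the downloaded files landed in the right folder. The snippet below is a minimal sketch, not part of this repository: the helper names `list_raw_files` and `read_header` are hypothetical, and it assumes the files are tab-separated with a header row on the first line and that you run it from the repository root.

```python
import gzip
from pathlib import Path


def list_raw_files(raw_dir="data/raw"):
    """Return the sorted .tsv.gz files found directly under raw_dir."""
    # Hypothetical helper: the default path mirrors the data/raw/ folder above.
    return sorted(Path(raw_dir).glob("*.tsv.gz"))


def read_header(path):
    """Return the column names from the first line of a gzipped TSV file."""
    # Opening in text mode decompresses on the fly, so only one line is read.
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        return handle.readline().rstrip("\r\n").split("\t")


if __name__ == "__main__":
    files = list_raw_files()
    if not files:
        print("No .tsv.gz files found under data/raw/")
    for path in files:
        # Print each file name alongside its column names.
        print(path.name, read_header(path))
```

If nothing is listed, check that the files were saved under data/raw/ and still have the .tsv.gz extension (some browsers decompress downloads automatically).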