My talk for the CS Tools Tips and Tricks Seminar: https://www.cs.mcgill.ca/events/251/
Prerequisite: https://docs.python.org/3/tutorial/
Install Jupyter: https://jupyter.org/install
Install JupyterLab: https://github.com/jupyterlab/jupyterlab#installation
Overview: This is an introduction to using Jupyter Notebooks (and JupyterLab) together with Pandas for data analysis. We will also explore some tips and tricks for fitting these tools into a research workflow.
- To introduce you to some tools you might want to know if you're looking for a job that requires you to work with data.
- To provide some tools that make you a more productive researcher when making illustrations, tables and artifacts, and to explain why it makes sense to do this in Jupyter.

- Introduction to Jupyter Notebooks
  - What is Jupyter?
  - Some code examples in Jupyter
- Quick introduction to Pandas
  - What is Pandas?
  - Some Pandas examples
- Some thoughts on when to use Jupyter and Pandas
- The actual CS Tools Tips and Tricks: see Tips.ipynb
- Demo of JupyterLab on Calcul Québec infrastructure
- An example data analysis
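As a taste of the Pandas portion of the outline, here is a minimal sketch of a very small analysis. The data is made up for illustration (it is not one of the datasets used in the talk), and the column names `lab` and `runtime_s` are hypothetical.

```python
import pandas as pd

# Made-up data for illustration only; column names are hypothetical.
df = pd.DataFrame(
    {
        "lab": ["A", "A", "B", "B", "B"],
        "runtime_s": [1.2, 0.8, 2.5, 2.0, 1.5],
    }
)

# Summary statistic for a numeric column.
overall_mean = df["runtime_s"].mean()

# Group by a categorical column and aggregate each group.
per_lab = df.groupby("lab")["runtime_s"].mean()

# Boolean-mask filtering: keep the rows slower than the overall mean.
slow_runs = df[df["runtime_s"] > overall_mean]

print(per_lab)
print(slow_runs)
```

In a notebook, leaving `per_lab` or `slow_runs` as the last expression in a cell renders it as a formatted table, so the `print` calls are only needed when running this as a script.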
Some things I haven't found a good solution to yet (suggestions welcome):
- Version controlling notebooks
- Writing tests in notebooks
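One common partial mitigation for noisy notebook diffs (a suggestion, not something covered in the talk) is to strip outputs and execution counts before committing. Below is a minimal sketch using only the standard library. It assumes the nbformat 4 JSON layout, and `strip_outputs` is a hypothetical helper name.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs removed."""
    # Round-tripping through JSON is a cheap deep copy for JSON-only data.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Tiny in-memory notebook standing in for a real .ipynb file.
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["1 + 1"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 3,
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                }
            ],
        },
    ],
}

cleaned = strip_outputs(example)
print(json.dumps(cleaned["cells"][1], indent=1))
```

For a real file, load it with `json.load`, pass it through the function, and write it back with `json.dump` before committing. Existing tools such as nbstripout handle this more thoroughly.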

- "I don't like notebooks" by Joel Grus (Allen Institute for Artificial Intelligence): https://www.youtube.com/watch?v=7jiPeIFXb6U
- JupyterLab page (click "try on binder" for a demo): https://jupyterlab.readthedocs.io/en/stable/
- JupyterHub: https://wiki.calculquebec.ca/w/JupyterHub
- Google Colab: https://colab.research.google.com
- FiveThirtyEight Datasets: https://github.com/fivethirtyeight/data
- ReviewNB: https://www.reviewnb.com/

- Removed note about .where()
- Added sizing reference with .tight_layout()
- Talked about flattening columns with .ravel(): https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ravel.html
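A minimal sketch of the column-flattening idea from the last entry, assuming it refers to flattening the two-level column index that Pandas produces after a multi-function aggregation. The data and the column names `team` and `score` are made up for illustration.

```python
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({"team": ["a", "a", "b"], "score": [1, 2, 3]})

# Aggregating one column with several functions yields two-level
# (MultiIndex) columns: ("score", "mean") and ("score", "max").
summary = df.groupby("team").agg({"score": ["mean", "max"]})

# Iterating over the raveled column index gives one tuple per column;
# joining each tuple produces flat, single-level names.
summary.columns = ["_".join(col) for col in summary.columns.ravel()]

print(summary)
```

Flat names such as `score_mean` are easier to reference in later cells and survive a round trip through CSV without extra header handling.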