
An introduction to data processing & visualization with numpy, pandas, matplotlib, and plotly express


Python tools for data science

A 10-lecture introductory course on data processing & visualization with numpy, pandas, matplotlib, and plotly.

The lectures are Jupyter notebooks, presented as slides using the RISE extension. Each lecture is accompanied by exercises with solutions.

Lecture list

Number  Folder                    Content
01      01_jupyter                Introduction to Jupyter Notebook and python recap
02      02_numpy                  Introduction to numpy
03      03_pandas_intro           Introduction to pandas
04      04_plotting_base          Basic visualizations with matplotlib
05      05_pandas_processing      Basic processing and cleaning with pandas
06      06_pandas_advanced        More processing with pandas: groupby, pivot_table
07      07_plotting_advanced      More visualizations with plotly express
08      08_example                Two examples: video-game sales & global terrorism
09      09_pandas_data_wrangling  Data wrangling with pandas: melt, pivot, concatenate, join
10      10_pandas_timeseries      Time series: processing and visualization


A recent installation of Python (>=3.6) with numpy, scipy, pandas, matplotlib, plotly, jupyter and rise is required.

  • To install all the packages, type in a terminal:
$ pip install numpy scipy pandas matplotlib jupyter
$ pip install rise
$ pip install plotly
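
To confirm that the installation worked, a short check along the following lines may help. It is only a sketch: the helper name `missing_packages` is hypothetical, and it assumes each package's import name matches its pip name, which should hold for the packages listed here.

```python
import importlib.util
import sys

# Packages named in the requirements above (import names assumed to match pip names).
REQUIRED = ["numpy", "scipy", "pandas", "matplotlib", "plotly", "jupyter", "rise"]


def missing_packages(names):
    """Return the names that cannot be located as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The course asks for Python >= 3.6.
    if sys.version_info < (3, 6):
        print("Python 3.6 or newer is required, found", sys.version.split()[0])

    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Save it under any file name and run it with the same Python interpreter that `pip` installed into. Any package reported as missing can be installed with the commands above.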

Quick start

  1. Open a terminal in the folder where you have downloaded the slides
  2. Start jupyter:
$ jupyter notebook
  3. Open the slides with jupyter.
  4. Click on the rise icon to enter presentation mode.