
Introduction to summary statistics and data exploration

MIT License

Summary Statistics and Data Exploration

In this quick tutorial, we explore four different datasets with pandas, computing summary statistics and forming hypotheses about the data. We pair pandas with statsmodels, whose formula interface lets you specify models in R-style syntax from Python.
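As a rough idea of what this workflow looks like, here is a minimal sketch. It assumes the four datasets are Anscombe's quartet (suggested by the notebook's filename, not stated in this README) and types the values in by hand; the variable names are illustrative and may not match the notebook.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Assumption: the four datasets are Anscombe's quartet, entered inline here.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# Stack the four datasets into one long-format DataFrame.
df = pd.concat(
    [pd.DataFrame({"dataset": name, "x": x, "y": y}) for name, (x, y) in quartet.items()],
    ignore_index=True,
)

# Summary statistics per dataset with pandas.
summary = df.groupby("dataset")[["x", "y"]].agg(["mean", "var"])
print(summary.round(2))

# One least-squares fit per dataset, using the R-style formula "y ~ x".
fits = {name: smf.ols("y ~ x", data=group).fit() for name, group in df.groupby("dataset")}
for name, fit in fits.items():
    print(name, fit.params.round(2).to_dict())
```

The summary table and the fitted coefficients come out nearly identical across the four datasets, which is why plotting the data is a useful next step before forming hypotheses.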

The tutorial is an IPython notebook. If you have IPython installed, you can run it locally with the following command:

~$ ipython notebook anscombe.ipynb

If you don't have IPython, you can install it as follows (the quotes keep shells such as zsh from interpreting the square brackets):

~$ pip install "ipython[all]"

Alternatively, you can view a static, non-interactive rendering of the tutorial on nbviewer.