Wellington, Wednesday 26 November 2014
Contact: Nicolas Fauchereau
--
For this tutorial, I recommend installing the Anaconda Python distribution. It is a completely free, enterprise-ready Python distribution for large-scale data processing, predictive analytics, and scientific computing. It includes the Python interpreter itself and the Python standard library, as well as a set of packages exposing data structures and methods for data manipulation, scientific computing and visualization. In particular, it provides NumPy, SciPy, pandas, Matplotlib, scikit-learn and statsmodels, i.e. all the main packages we will be using during the tutorial. The full list of packages is available at:
http://docs.continuum.io/anaconda/pkgs.html
The Anaconda python distribution must be downloaded from:
For your platform.
Once you have installed Anaconda, you can update to the latest compatible versions of all the pre-installed packages by running, in a terminal:
$ conda update conda
Then:
$ conda update anaconda
You also need to install pip to install packages from the Python Package Index.
$ conda install pip
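To confirm that both tools are reachable from your terminal before going further, a minimal sketch along these lines may help. It assumes Python 3.3 or later (for shutil.which), and find_tools is a hypothetical helper name used for illustration, not part of conda or pip.

```python
import shutil


def find_tools(names=("conda", "pip")):
    """Map each command name to its full path, or None if it is not on PATH.

    Hypothetical helper for illustration only.
    """
    return {name: shutil.which(name) for name in names}


# Report where each tool was found, if at all.
for tool, path in find_tools().items():
    print("{}: {}".format(tool, path if path else "not found on PATH"))
```

If either tool is reported as not found, the Anaconda installation directory is probably not on your PATH.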
While we might not have the time to cover them in depth during the tutorial, I would recommend that you have a look at a few extra libraries.
Basemap is a graphics library for plotting (static, publication quality) geographical maps (see http://matplotlib.org/basemap/). Basemap is available directly in Anaconda using the conda package manager; install it with:
$ conda install basemap
Bokeh is a new interactive plotting library developed by the team behind Anaconda; it is thus installable with conda (if not already installed):
$ conda install bokeh
seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics. You should be able to install it with conda as well:
$ conda install seaborn
mpld3 aims at bringing matplotlib to the browser. It has been developed by Jake VanderPlas. It is installable via pip:
$ pip install mpld3
bearcart has been developed by Rob Story and provides an interface to the rickshaw JavaScript library. It is also installable via pip:
$ pip install bearcart
folium has also been developed by Rob Story to provide an interface to the leaflet.js JavaScript mapping library. Install with:
$ pip install folium
The material of the tutorial is in the form of IPython notebooks. In a nutshell an IPython notebook is a web-based (i.e. running in the browser) interactive computational environment where you can combine Python code execution, text, mathematics, plots and rich media into a single document, which makes it an ideal medium for teaching and exploring.
After uncompressing the archive of the repo (or after cloning it with git), navigate to the corresponding directory (containing the *.ipynb files) and type:
$ ipython notebook
That should bring up the IPython notebook dashboard, and you should be ready to go!
You should see in particular a test.ipynb notebook: please run it to make sure all the necessary libraries have been installed correctly. If you followed the instructions above (i.e. installed the Anaconda Python distribution), it should be fine; this test notebook is mostly intended for those who have a custom Python installation.
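If you want a quick check outside the notebook, this kind of test can be sketched as follows. This is not the content of test.ipynb, only an illustration: check_imports and REQUIRED are hypothetical names, and the list covers just the main packages named in this document.

```python
import importlib

# Import names, not package names: scikit-learn is imported as "sklearn".
REQUIRED = ["numpy", "scipy", "pandas", "matplotlib", "sklearn", "statsmodels"]


def check_imports(modules):
    """Return {module name: version string, or None if the import failed}.

    Hypothetical helper for illustration only.
    """
    report = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except Exception:
            # A broken install can raise more than ImportError,
            # so any failure at import time is treated as "missing".
            report[name] = None
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report


# Print one line per package: its version, or MISSING if it could not be imported.
for name, version in check_imports(REQUIRED).items():
    print("{:<12} {}".format(name, version if version else "MISSING"))
```

Any package reported as MISSING can usually be installed with conda install or pip install, as described above.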