Data Visualization, SMM635 ― README

Instructor

  • Dr. Simone Santoni ― simone.santoni.1@city.ac.uk
  • Office hour: every Thursday from 17:30 to 18:30 (students are required to book a slot and share their questions via email in advance).
  • Teaching assistant: Dr. Feng Zhou ― Feng.Zhou@city.ac.uk

Module Overview

In the digital era, we are bombarded by huge streams of information. Sometimes these 'inputs' are not interesting enough to breach the defensive barriers that protect our attention, so we simply pass over them. Other times, what we are exposed to seems valuable, but we do not see any neat, meaningful pattern in it (that is the bottom line when it comes to crunching massive datasets). As the popular idiom "one picture is worth a thousand words" suggests, all that messengers and audiences need to 'click' is often just a simple, effective visualization.

This module is a journey into the art and science of creating memorable charts that grab the attention of the audience and successfully convey insights and narratives. Pleasant journeys require good companions: 'infographics' (i.e., the field that investigates the representation of information, data, and knowledge) will offer the theoretical platform for the module; Python will make things happen (at least on our screens). Consistent with the teaching philosophy of the module leader, each lecture integrates 'theory' and 'practice'.

Materials & Readings

Mandatory materials/readings will be circulated via Moodle in advance (every Friday, i.e., one week before the class). For this module you are not required to purchase any expensive book; however, it is essential that you carefully go through:

  • lecture notes (to be uploaded onto Moodle/GitHub weekly; they include slideshows + videos);
  • case studies (to be uploaded onto Moodle/GitHub weekly);
  • Python scripts/Jupyter notebooks (to be uploaded onto Moodle/GitHub or week* directories weekly);
  • journal articles (to be uploaded onto Moodle/GitHub weekly).

Discretionary readings/materials students may want to refer to are:

Learning Objectives

At the end of the module, students should be able to:

  • generate and evaluate visual forms for appropriateness, context, and meaning;

  • design and execute statistical charts with a particular emphasis on massive datasets;

  • design and execute complex visualizations involving time-dependent and geospatial attributes;

  • design and execute interactive visualizations;

  • leverage the visualization capabilities of the Python libraries Matplotlib and Bokeh.

Assessment

As per the module specification, students will be assessed on the basis of coursework submissions, all of which are the outcome of group-level efforts (yes, you understood correctly: for this module there is no final examination and you are not supposed to deliver any assignment on your own). Specifically, there are three types of coursework, namely:

  • a 'mid-term project' (MTP)
  • a 'final course project' (FCP)
  • two case studies (CS)

All types of assessment will be evaluated along four criteria: i) appropriate use of notions and frameworks discussed in class; ii) effectiveness of the proposed answer or solution; iii) originality/creativity of the proposed answer or solution; iv) organization and clarity of submitted materials. All criteria carry equal weight in the mark.

Mid-term project

For the MTP, students are required to solve a complex visualization problem. The details of the MTP will be available by week 3, when the project will be released. Submissions will be assessed on a 0-100% scale. Groups that fail the MTP can resubmit a revised version of the project; if the revision is sufficient, students receive a 50% mark. The deadline for the project is November 11 (week 6). Selected groups will be invited to present the outcome of their work to fellow students in week 6. Invited groups can also receive up to 3 bonus points on the basis of the quality of their presentations.

Final course project

With the FCP, groups get their hands 'dirty' as they help a real-world client face some data visualization challenges. Details about the client and the challenge will be available in week 7. Final course projects will be evaluated on a rolling basis and should be submitted by mid-December (the course office will confirm the exact deadline shortly).

Case studies

Case studies provide students with the opportunity to learn how to integrate the 'business' and the 'data viz' perspectives in order to deal effectively with real-world problems. In terms of process, groups of students will receive i) a detailed description of a business issue, and ii) relevant data; then, they will work for one week to produce their own solution. Solutions will be discussed in class. Each presenting group will be paired with a discussant group whose role is to challenge the ideas, tools, and recommendations brought to the table. I expect six pairs of presenting and discussant groups; this means that a group presenting its solution in week 8 will serve as discussant in week 10 (and vice versa). Both presenting and discussant groups will be assessed.

Discretionary coursework

Problem sets will be launched weekly. Individual students may want to tackle these problem sets and present their solutions to the class. A maximum of three students per session will be selected on the basis of the novelty and effectiveness of the proposed solution. One bonus point (+1 FM) will be assigned.

Schedule of the Module

Week  Topic
1     Introduction to SMM635
      Laboratory session on Python for data viz (Matplotlib)
2     Elements of infographics
      - taste, aesthetics, and perceptions
      - visual forms
      - colors
      - exemplars of visualization
      Laboratory session on Python for data viz (Matplotlib)
3     Exploratory statistical charts
      - frequencies
      - univariate distributions
      - bivariate distributions
      - 3D distributions
      Laboratory session on exploratory statistical charts (Matplotlib)
      Mid-term project release
4     Time-dependent data
      - timelines
      - sequences of events
      - narrative
      Laboratory session on time-dependent data (Matplotlib)
5     Visualizing statistical estimates and fits
      - uncertainty in estimates
      - plotting causal effects estimated via regression
      Laboratory session on statistical estimates and fits
6     Mid-term project: students' presentations
      Laboratory session on visualizing statistical estimates (Matplotlib)
7     Geospatial maps
      Laboratory session on geospatial maps (Fiona + Pyshp + Rasterio + Pyproj + Shapely + Geopandas)
      Final course project release
8     Case study #1
9     Interactive visualizations for the web
      Laboratory session on interactive visualizations (Bokeh)
10    Case study #2

Guest speakers

Throughout the Term, SMM635 will host two types of guest speakers: ambassadors (former students of the BA program) and practitioners from several industries.

Prerequisites

The SMM692 (Python Pre-Course) module defines the knowledge students should possess in order to attend SMM635 (Data Visualization) proficiently.

Software requirements

For this module, you are expected to run Python 3.7 or higher on your machine. How do you get Python working on your machine? There are several ways to do that. A fast, smooth option is to install Anaconda, an open-source distribution of Python that includes: i) 250+ popular data science packages; ii) the conda package manager, which makes it quick and easy to install, run, and upgrade complex data science and machine learning environments.

Here is the workflow:

  1. Use your preferred browser to open the link pointing to the Anaconda repository;

  2. Select the installer that suits your machine (32- or 64-bit) and operating system (Windows, macOS, Linux). Mac users may want to download the graphical installer rather than the command-line installer (which students may feel less comfortable with);

  3. Retrieve the installer (perhaps in your download folder);

  4. Run the installer;

  5. Log out of your current session (it does not matter whether you use Windows, macOS, or Linux);

  6. Log in to a new session;

  7. Run 'Anaconda Navigator', a convenient place to launch the IPython shell or other user interfaces to interact with IPython.
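
Once the installation is complete, it may be worth confirming that the interpreter you launched meets the Python 3.7 requirement. The snippet below is a minimal sketch to paste into the IPython shell or a notebook cell; the helper name `meets_requirement` is illustrative and is not part of the course materials.

```python
import sys

# Minimum Python version required by the module (3.7 or higher).
MIN_VERSION = (3, 7)


def meets_requirement(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Report the running interpreter and whether it satisfies the requirement.
print("Running Python", ".".join(str(part) for part in sys.version_info[:3]))
print("Meets the 3.7+ requirement:", meets_requirement())
```

If the second line prints `False`, the shell is probably using an older interpreter than the one shipped with Anaconda.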

On top of Anaconda/Python, students should install the packages Matplotlib, Bokeh, Fiona, Pyshp, Rasterio, and Shapely (I recommend doing that with conda).
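
To see which of these packages are still missing from your environment, something along the following lines could help. It is a sketch under a few assumptions: the helper `missing_modules` and the `REQUIRED` mapping are illustrative names, and the check only asks whether Python can locate each package, not whether it imports and runs correctly. Note that Pyshp is imported under the name `shapefile`.

```python
import importlib.util

# Package name (as listed above) -> name used in `import` statements.
# Pyshp is the one case where the two differ: it is imported as `shapefile`.
REQUIRED = {
    "Matplotlib": "matplotlib",
    "Bokeh": "bokeh",
    "Fiona": "fiona",
    "Pyshp": "shapefile",
    "Rasterio": "rasterio",
    "Shapely": "shapely",
}


def missing_modules(required=REQUIRED):
    """Return the package names that cannot be located in this environment."""
    missing = []
    for package, import_name in required.items():
        # find_spec returns None when a top-level module cannot be found;
        # it does not execute the package, so the check is quick.
        if importlib.util.find_spec(import_name) is None:
            missing.append(package)
    return missing


not_found = missing_modules()
print("Missing packages:", ", ".join(not_found) if not_found else "none")
```

Any package reported as missing can then be installed with conda, as recommended above.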

Version history

  • Created: 28/09/2020, 07:39:49
  • Last change: no revisions so far