Contains helpful material and resources for Math 495R, sections 4 and 6, taught in Winter 2017.

Primary language: Jupyter Notebook

Data Visualization

Thursday Portfolio Reviews Google Sheet

Portfolio Projects

  • Redesign an ACME lab
  • Complete a visual data analysis project and write up on Medium

March 21: Guest Lecture: Doug Thomas

Meet in the HFAC room A460

March 14: Presentations

Read Tufte's The Cognitive Style of PowerPoint

Basically, handouts are better than slides 90% of the time.

Homework: keep working on your projects

March 7: Data Narratives

Introduction, Conflict, Resolution

Science Isn't Broken

2016 Election Coverage

Stories provide context. They are a mental model that helps us remember information. Facts are hard to memorize. Stories stick.

February 28: Color

Start by either speed drawing or blind contour drawing an image.

Watch this color video

Show the person next to you what you are working on. Talk about the color.

Sign up for individual meeting times before the end of the semester.

Great podcast about color

Homework: Create a GitHub account and make a repo called Portfolio. Start pushing your iterations to this repo.

February 14: Workday

Practice makes perfect.

Since I was unavailable last Thursday, today is a work-in-class day. That way, I can come around and meet with those I wasn't able to meet with. My apologies.

February 7: Learning to See

Training Your Eyes to See

Anyone who can't communicate their ideas is on the same plane as someone who has no ideas. -- Pericles

ANNOUNCEMENT: There are 4 Tufte books and another book called Dataclysm in the 150 computer lab. You are free to use them to get ideas and learn more about data vis. Please keep them in the room and don't write in them. I recommend starting with The Visual Display of Quantitative Information.

Activities:

  • Watch How to Doodle and doodle in class.
  • Learn how to see by drawing without looking at your paper. This is an exercise called Blind Contour Drawing.
  • Learn what is important in a sketch by drawing it in under a minute. I call this one-minute sketching.

Designers have a saying: get it right in black and white. If it doesn't make sense in black and white, it will make less sense with color and style.

Homework: Have a finished redesign of an ACME lab to show me on Thursday. This is a redesign of the solutions file and plots that you turned in to complete the lab, NOT a redesign of the written PDF of the lab. I recommend using a Jupyter notebook and including some comments. I expect you to spend at least an hour on this project, but do not spend more than three hours on it. It is meant to show you how to use basic visual variables in matplotlib to improve a visualization.
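As a rough starting point, here is a minimal sketch of mapping data to visual variables in matplotlib. The numbers, the `weight` and `group` columns, and the color and marker choices are all made up for illustration; they are not taken from any ACME lab. Position encodes `x` and `y`, marker size encodes `weight`, and hue plus shape encode `group`.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line inside a notebook
import matplotlib.pyplot as plt

# Made-up values, purely for illustration
x = [1, 2, 3, 4, 5]
y = [2.0, 3.5, 3.0, 5.5, 6.0]
weight = [10, 40, 20, 80, 60]      # shown as marker size
group = ["a", "a", "b", "b", "b"]  # shown as hue and shape

# One (color, marker) pair per group; an arbitrary choice
style = {"a": ("tab:blue", "o"), "b": ("tab:orange", "s")}

fig, ax = plt.subplots(figsize=(5, 3.5))
for name, (color, marker) in style.items():
    idx = [i for i, g in enumerate(group) if g == name]
    ax.scatter(
        [x[i] for i in idx],
        [y[i] for i in idx],
        s=[weight[i] * 3 for i in idx],  # scale weights to visible marker areas
        color=color,
        marker=marker,
        alpha=0.8,
        label=f"group {name}",
    )

# Remove chart clutter that carries no data
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend(frameon=False)
fig.tight_layout()
```

Try the same plot with every marker in one gray first. If the comparison you care about is not readable in that version, adding hue will not fix it.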

January 31: Exploration vs Communication

The greatest value of a picture is when it forces us to notice what we never expected to see. -- John Tukey

Talk about the difference between exploring a data set and communicating an insight in a data set. What are the differences in the visualization methods? What should you focus on in each? What skills or abilities do you need for each? What are the pros and cons of each?

  • Tufte Cholera Example
  • Tufte Challenger PowerPoint

Examples of Medium data vis blog posts

Be careful about using exploratory analysis to confirm your own beliefs. Some good blog posts by Andrew Gelman at Columbia:

Homework: Sign up for a time to meet with me on Thursday. Bring either some visualizations for your ACME lab or stuff for your Data Report project.

January 24: Visual Variables

ANNOUNCEMENT: Formal class will no longer be held on Thursdays so that it does not conflict with the soft skills seminar. All are encouraged to attend the soft skills seminar. Instead of meeting as a class on Thursdays, I will hold individual reviews at a time of your choosing on Thursday. Go to this link to sign up for a 10-minute slot on Thursday. If none of these times work for you, contact me to schedule another time.

With a partner, critique the following visualizations:

Visual Variables

  • position
  • size
  • hue
  • saturation
  • value
  • shape

More about Hue, Saturation, and Value (HSV) here.
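To get a feel for how the three color variables behave independently, here is a small sketch using Python's standard `colorsys` module. The helper `hsv_to_hex` and the base hue of 0.6 are illustrative choices, not something from the class materials. Each list holds one component fixed and varies another.

```python
import colorsys

def hsv_to_hex(h, s, v):
    """Convert HSV components (each in the range 0 to 1) to a '#rrggbb' string."""
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))

base_hue = 0.6  # a blue; arbitrary choice

# Same hue and value, increasing saturation: pale to vivid
varying_saturation = [hsv_to_hex(base_hue, s, 1.0) for s in (0.25, 0.5, 1.0)]

# Same hue and saturation, increasing value: dark to bright
varying_value = [hsv_to_hex(base_hue, 1.0, v) for v in (0.25, 0.5, 1.0)]

# Full saturation and value, three hues a third of the wheel apart
varying_hue = [hsv_to_hex(h, 1.0, 1.0) for h in (0.0, 1 / 3, 2 / 3)]

print(varying_saturation)
print(varying_value)
print(varying_hue)
```

The resulting hex strings can be passed as colors to matplotlib or most other plotting tools.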

Drawing exercise

In a group, draw as many representations as you can of this data set: 25, 13. Yes, there are only two numbers. Take two minutes to come up with as many representations as you can. Draw them on paper. What did you learn? Discuss.

Homework: Schedule a Thursday portfolio review with me.

January 19: Concept Visualization

In a group, critique the initial visualizations of your data set. Focus your critiques on what the creator should do next with this visualization (choose better colors, make dots bigger, add more variables, etc.).

Concepts can be visualized just like data. Discuss in a group what the following visualizations do to represent a concept:

Explained Visually

Setosa Visualizations

Due next time: choose an ACME lab to redesign and show potential employers.

January 17: Know Your Data

Today we are talking about Dear Data and multidimensional visualizations.

FiveThirtyEight Dear Data

Collecting your own data and visualizing it teaches you a lot about how to represent data effectively. You learn that people make assumptions when they want to collect data to measure something. Your visualizations should reflect the choices and assumptions made during the data collection process.

When you are ready to communicate insight you have found in a data set, always start with paper and pen or pencil. Draw what you want the visualization to look like before you build it.

Due next time: Have a visualization to present.

January 12: Data, Not Visualization

We talked about the Gapminder data vis.

Data is more interesting than the visualization. Find interesting data first, then create a visualization.

Tools you could use:

  • Tableau (free for students)
  • matplotlib
  • D3.js (if you know JavaScript)
  • Bokeh
  • Processing

I added some files to the repo to help you load your data set with pandas (Python).
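If you have not loaded data with pandas before, the sketch below shows the general shape of it. It is not a copy of the helper files in the repo; the column names and values are invented, and an in-memory string stands in for a CSV file so the example runs on its own. With a real file, pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Stand-in for a CSV file on disk (hypothetical columns and values)
csv_text = io.StringIO(
    "country,year,life_expectancy\n"
    "A,2000,70.1\n"
    "A,2010,72.4\n"
    "B,2000,65.0\n"
    "B,2010,\n"
)

df = pd.read_csv(csv_text)

# Quick checks to run before drawing anything
print(df.head())        # first rows: do the columns look right?
print(df.dtypes)        # were numbers parsed as numbers?
print(df.isna().sum())  # how many values are missing in each column?
```

Checking types and missing values first keeps surprises in the data from turning into surprises in the plot.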

Due next time: Get your hands on a data set you want to visualize.

January 10: Introduction

We learned how to critique a data visualization.

A Visual Introduction to Machine Learning

Some questions to ask when evaluating representations of data:

  • What are your first impressions?
  • Do you like the visualization?
  • Does it help you understand the data?
  • What is the creator trying to demonstrate?
  • What is being compared?
  • How many variables are present?
  • Is there anything unnecessary or redundant?
  • Where does the data come from?
  • Is the data presented in context?

How to learn data visualization

Other

Cool Wind

Feltron