The federal government recently shut down. Let's look at what its finances looked like during the shutdown.
This project is spread across two days, plus the usual next day for presentations and code review.
Lectures will be organized as follows.
- Day 1: The grammar of graphics and ggplot2
- Day 2: More data visualization theory and more about R
- Day 3: Code review
treasury.io provides a daily feed of deposits and withdrawals from different accounts within the federal treasury. You can download the full historical feed as a SQLite3 database.
wget http://api.treasury.io/cc7znvq/47d80ae900e04f2/http/treasury_data.db
Here's a codebook.
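Once the download finishes, it can help to see which tables the database contains before you start plotting. Here is a minimal Python sketch, not part of the course materials. It assumes the file was saved as treasury_data.db in the current directory, and the helper name summarize_tables is made up for illustration. It reads the table names from the database itself rather than assuming any particular schema.

```python
import sqlite3


def summarize_tables(db_path):
    """Return {table_name: row_count} for every table in a SQLite database."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in catalog of tables, indexes, and views.
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Table names cannot be bound as parameters, so quote them by hand.
            quoted = '"{}"'.format(name.replace('"', '""'))
            counts[name] = conn.execute(
                "SELECT COUNT(*) FROM {}".format(quoted)
            ).fetchone()[0]
        return counts
    finally:
        conn.close()


if __name__ == "__main__":
    # Assumption: treasury_data.db sits in the current working directory.
    for table, rows in summarize_tables("treasury_data.db").items():
        print(table, rows)
```

Note that sqlite3.connect creates an empty database if the path does not exist, so an empty result usually means the path is wrong rather than that the feed is empty.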
- Learn some data visualization theory
- Learn how to make plots in ggplot2
- Learn enough R to make plots in ggplot2
Select four visuals that someone else created.
- A good data visual
- A bad data visual
- A good non-data visual
- A bad non-data visual
For each visual, write a paragraph about why you chose it.
Save these in visual-critques.md, and submit them via pull request by day 2 of this project. Choose one of these four visuals to present to the class; we'll start day 2 with these presentations.
Time looked at how the federal government's finances have been doing for the past two weeks. Their analysis is pretty cool, but you can do better.
Explore the data, figure out something interesting about the government shutdown, and explain this in a blog post that includes some explanatory plots.
During lecture on day 1, we'll show you how to use the R library ggplot2, but we won't talk too much about how R works in general until day 2.
Base your code on boilerplate.r so that you don't have to know how R works.
Look at the four visuals that you selected before. Explain how well each one follows the guidelines and theory that we discussed in lecture during this project. Write a paragraph for each visual.
We want you to apply some data visualization theory and to make some plots with ggplot2. Once you feel that you have done this, you may move on to other things. Here are some ideas.
- Do a more involved analysis in Python, and make plots in matplotlib.
- R has a lot of standard statistical methods built in. If you already understand how these work, use R to build models of what is going on.
Watch Engineering Data Analysis before day 1 of the project.
- The Grammar of Graphics, by Leland Wilkinson
- ggplot2 documentation
- Summary of Timothy Samara's Design Elements
- Design Elements, by Timothy Samara
- R Spells for Data Wizards
- Intro to R videos
- R-bloggers