/tweepy-congress-collector

An example of collecting tweets from U.S. Congressmember Twitter accounts

Primary Language: Jupyter Notebook

Summary: Fetching, wrangling, analyzing, and visualizing Twitter and Congress data with Python 3.5, tweepy, pandas, and matplotlib.

Twitter and Congress Mashup

This repo contains data and several walkthroughs for fetching, wrangling, and visualizing the data, as a means to practice general Python programming as well as learn a bit of pandas and matplotlib.

About the data

The data comes from two sources: the Twitter API and crowdsourced Congress data.

Programming environment requirements

This code was written and tested using the Python 3.5.0 installation provided by Anaconda. I try to use as few non-standard libraries as possible, but in general, Anaconda creates an environment that has just about everything you'd need, including python-dateutil.

If you plan on trying to fetch the data for yourself and following my fetch-code to the letter, you'll need to install tweepy on your own.
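As a quick sanity check before starting, here is a minimal sketch that reports which of the libraries mentioned above are not importable in the current environment. The helper name missing_packages and the REQUIRED list are illustrative assumptions, not part of the repo; note that python-dateutil is imported under the name dateutil.

```python
import importlib.util

# Import names of the libraries this README mentions (an assumed list).
REQUIRED = ["pandas", "matplotlib", "dateutil", "tweepy"]


def missing_packages(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required packages are available.")
```

If tweepy shows up as missing, it only matters for the fetching walkthrough; the wrangling and visualization lessons work from the data already packaged in the repo.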

Lesson manifest

  • Fetching the data - how did the data in data/twitter show up in the repo? Not by magic, but by using the Twitter API and mashing it with crowdsourced Congress data. Note: you don't actually have to do these steps to get data; this repo comes packaged with all the fetched data so that you can focus on the wrangling and visualization.
  • Wrangling the Twitter profiles - The data structure of a Twitter user profile, as Twitter's API provides it, is pretty complicated. Complicated enough that it needs to be serialized as nested JSON, which makes it hard to throw all the data in data/twitter/profiles into a spreadsheet for easy comparison. So let's make our own data file by picking the interesting data points from each Twitter profile and saving them as a flat, easy-to-use CSV: data/wrangled/congress-twitter-profiles.csv
  • Wrangling the Twitter tweets - Same deal as above, but with a shorter walkthrough.
  • Analyzing the wrangled data with pandas - with the data in convenient-to-read CSV files, let's use pandas to do some data analysis.
  • Visualizing the wrangled data - When you've spent time thinking through the structure of the data and how to organize it, visualizations become very easy to produce.
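
To give a feel for the profile-wrangling step described above, here is a minimal sketch of picking a few top-level fields out of a nested profile and writing them as flat CSV. It is not the repo's actual wrangling code: the sample profile is made up, the chosen FIELDS are an assumption about which data points are "interesting", and the function names are hypothetical. Only the standard library is used.

```python
import csv
import io

# Hypothetical, trimmed-down profile. The keys mirror Twitter's user object,
# but the values are invented for illustration.
SAMPLE_PROFILE = {
    "id": 12345,
    "screen_name": "example_member",
    "name": "Example Member",
    "followers_count": 1000,
    "statuses_count": 250,
    "created_at": "Mon Jan 05 12:00:00 +0000 2009",
    # Nested structures like these are what make a spreadsheet awkward.
    "status": {"text": "Hello", "retweet_count": 3},
    "entities": {"url": {"urls": [{"expanded_url": "https://example.gov"}]}},
}

# Assumed selection of flat fields to keep.
FIELDS = ["id", "screen_name", "name", "followers_count",
          "statuses_count", "created_at"]


def flatten_profile(profile, fields=FIELDS):
    """Keep only the selected top-level fields, dropping nested objects."""
    return {field: profile.get(field) for field in fields}


def profiles_to_csv(profiles, fields=FIELDS):
    """Serialize a list of profile dicts to CSV text, one row per profile."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for profile in profiles:
        writer.writerow(flatten_profile(profile, fields))
    return buffer.getvalue()


csv_text = profiles_to_csv([SAMPLE_PROFILE])
print(csv_text)
```

In practice the profiles would be loaded with json.load from the files under data/twitter/profiles and the CSV text written to disk; the in-memory buffer here just keeps the sketch self-contained.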