Code Summary

The code for this project consists of two parts: one for analyzing historical data, and the other for analyzing real-time data.

Data File

When we create data that we want to analyze, we put it into the data file.

Historical Analysis

For this part, we have 5 directories:

  1. anaylyze&draw: After we get the data from Hive, we use the scripts in this directory for further analysis and data visualization.
  2. clean: The scripts in this directory derive extra information that is not in the original data and further clean the data.
  3. get_data_from_hive: This directory contains the Hive and shell scripts that extract the data we want from the original data.
  4. graph: This directory contains all the HTML files, which use D3.js to visualize the data. (Note that these files are not the same as the files in anaylyze&draw.)
  5. animation: This directory contains the Processing code that animates the NYC check-ins on 10/09/2012.

Real-time Analysis

For this part, we have 2 directories:

  1. heatmap: This directory contains the code that creates a check-in heatmap from the real-time data.
  2. real_time_frequency: This directory contains the code that computes the frequency of the real-time data in two-second intervals.
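
To illustrate the two-second frequency idea behind real_time_frequency, here is a minimal sketch in Python. It is not the code in that directory. The function name frequency_per_window, the input format (event times in seconds), and the sample values are all assumptions made for this example.

```python
from collections import Counter

WINDOW_SECONDS = 2  # window length taken from the directory description


def frequency_per_window(timestamps, window=WINDOW_SECONDS):
    """Count events in consecutive fixed-length windows.

    timestamps: iterable of event times in seconds (assumed input format).
    Returns a dict mapping each window's start time to its event count.
    """
    counts = Counter()
    for t in timestamps:
        # Floor each timestamp to the start of the window it falls in.
        start = int(t // window) * window
        counts[start] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Made-up sample timestamps, for illustration only.
    sample = [0.1, 0.9, 1.5, 2.0, 3.9, 6.2]
    print(frequency_per_window(sample))
```

Windows with no events are simply absent from the result; a caller that needs a continuous series would have to fill those gaps with zeros.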