Hadoop program to perform a word count of documentation
- Graph runtime against datasets of varying size
- Update WordCount to ignore punctuation and HTML/script markup
- Count the average use of every word per year
- Compute the max and min number of times a word appears across all addresses
- Compute the avg and std of word use in 4-year windows starting from 1985 // can use input | mergestuff(files)
- In the year following each window (e.g. '89), find words whose use was more than 2 stds above the window avg
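
A rough sketch of the tokenizing, windowing, and outlier steps above, written as plain Python rather than as an actual Hadoop job. Everything named here is an assumption made for illustration: the function names (`tokenize`, `counts_by_year`, `window_stats`, `outliers`), the `(year, text)` input shape, the use of population std, the reading of "greater than 2 stds" as "count > avg + 2 * std", and the choice to skip words that never appear inside the window.

```python
import re
import string
from collections import Counter
from statistics import mean, pstdev

# Assumption: "html script" means both <script>/<style> blocks and ordinary tags.
SCRIPT_RE = re.compile(r"<(script|style)\b.*?</\1\s*>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")
# Punctuation becomes a space, so "don't" splits into "don" and "t".
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def tokenize(text):
    """Lowercase words with script/style blocks, tags, and punctuation removed."""
    text = SCRIPT_RE.sub(" ", text)
    text = TAG_RE.sub(" ", text)
    return text.lower().translate(PUNCT_TABLE).split()


def counts_by_year(docs):
    """docs: iterable of (year, text) pairs. Returns {year: Counter of words}."""
    per_year = {}
    for year, text in docs:
        per_year.setdefault(year, Counter()).update(tokenize(text))
    return per_year


def window_stats(per_year, start, size=4):
    """Per-word (avg, std) of yearly counts over years start .. start+size-1."""
    years = range(start, start + size)
    words = set()
    for y in years:
        words.update(per_year.get(y, Counter()))
    stats = {}
    for w in words:
        # A missing year or missing word counts as 0 uses.
        vals = [per_year.get(y, Counter())[w] for y in years]
        stats[w] = (mean(vals), pstdev(vals))
    return stats


def outliers(per_year, start, size=4, k=2.0):
    """Words in the post-window year whose count exceeds avg + k * std."""
    stats = window_stats(per_year, start, size)
    post = per_year.get(start + size, Counter())
    return sorted(
        w for w, c in post.items()
        if w in stats and c > stats[w][0] + k * stats[w][1]
    )
```

With a 1985 start and the default window size, `outliers(per_year, 1985)` compares counts from 1989 against the 1985 to 1988 statistics. In a MapReduce setting, `tokenize` would roughly correspond to the mapper and the per-word statistics to the reducer, but that mapping is only a suggestion.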