These MapReduce functions use Common Crawl data to measure how widely congressional legislation is mentioned on the internet.
Program Tasks:
- Count the number of pages that mention a bill in any of its forms
- Record the domains of pages that mention a bill, in any of its forms, and output the 50 domains that mention the bill most often (with each domain's count of pages that mention the bill)
- Output the 50 most frequent words across all pages that mention a bill in any of its forms, excluding a set of 100 very common words
These functions are called from the file TotalAnalysis.
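
The three tasks above can be sketched as a single map step and a summing reduce step. This is a minimal in-memory illustration in Python, not the project's actual code: the names `BILL_FORMS`, `STOP_WORDS`, `map_page`, `reduce_counts`, `top_n`, and `analyze` are hypothetical, the bill aliases are placeholders, and the short stop-word set stands in for the real list of 100 common words. It also assumes a "mention" means a case-insensitive substring match and that pages arrive as `(url, text)` pairs.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical placeholders: the real bill aliases and the 100-word
# stop list are not given in this README.
BILL_FORMS = ["h.r. 1234", "hr 1234", "example act"]
STOP_WORDS = {"the", "of", "and", "to", "a", "in"}


def map_page(url, text):
    """Emit ((kind, name), 1) pairs for a page that mentions the bill."""
    lowered = text.lower()
    if not any(form in lowered for form in BILL_FORMS):
        return  # pages that never mention the bill emit nothing
    yield ("page", "total"), 1                        # task 1: page count
    yield ("domain", urlparse(url).netloc.lower()), 1  # task 2: domain count
    for word in re.findall(r"[a-z']+", lowered):       # task 3: word counts
        if word not in STOP_WORDS:
            yield ("word", word), 1


def reduce_counts(pairs):
    """Sum the values emitted for each key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def top_n(totals, kind, n=50):
    """Return the n largest (name, count) entries of one kind."""
    ranked = Counter({name: c for (k, name), c in totals.items() if k == kind})
    return ranked.most_common(n)


def analyze(pages, n=50):
    """Run the map and reduce steps over an iterable of (url, text) pages."""
    pairs = (pair for url, text in pages for pair in map_page(url, text))
    totals = reduce_counts(pairs)
    return {
        "page_count": totals[("page", "total")],
        "top_domains": top_n(totals, "domain", n),
        "top_words": top_n(totals, "word", n),
    }
```

Calling `analyze` on a list of `(url, text)` pairs returns the page count, the top domains, and the top words in one dictionary. In a real MapReduce job the pairs from `map_page` would be shuffled by key across workers before a reducer like `reduce_counts` sums them; the sketch keeps everything in one process for clarity.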