## Highlights
- Input: a seed URL. Output: word stats/rank computed from the crawled and processed web pages
- Crawlers - BFS, `BlockingQueue`, multi-threaded
- URL filtering - Bloom Filter (TODO)
- Page filtering - SimHash (TODO)
- Information retrieval - Tag/Token counts
- Word stats/rank - Zipf's law
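
The crawl-and-count flow listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `CrawlSketch`, `Page`, `demoWeb`, `crawl`, and `rank` are hypothetical, the "web" is a small in-memory map instead of real HTTP fetches, and a plain concurrent set stands in for the Bloom filter that is still marked TODO. Worker threads pull URLs from a shared `BlockingQueue`, so visiting order is only approximately breadth-first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CrawlSketch {
    /** Hypothetical stand-in for a fetched page: its text and outgoing links. */
    static final class Page {
        final String text;
        final List<String> links;
        Page(String text, List<String> links) { this.text = text; this.links = links; }
    }

    /** Tiny in-memory "web" so the sketch runs without network access. */
    static Map<String, Page> demoWeb() {
        return Map.of(
            "a", new Page("the cat the dog", List.of("b", "c")),
            "b", new Page("the cat", List.of("a")),
            "c", new Page("the", List.of("b", "missing")));
    }

    /** Crawls from the seed with a FIFO frontier and returns token counts. */
    static Map<String, Integer> crawl(String seed, Map<String, Page> web, int threads) {
        BlockingQueue<String> frontier = new LinkedBlockingQueue<>();
        Set<String> seen = ConcurrentHashMap.newKeySet(); // exact set, not a Bloom filter
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        AtomicInteger pending = new AtomicInteger(1);      // URLs queued or in progress
        seen.add(seed);
        frontier.add(seed);

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    while (pending.get() > 0) {
                        String url = frontier.poll(20, TimeUnit.MILLISECONDS);
                        if (url == null) continue;         // queue empty for now, re-check pending
                        try {
                            Page page = web.get(url);
                            if (page == null) continue;    // unknown URL: nothing to process
                            for (String token : page.text.toLowerCase().split("\\W+")) {
                                if (!token.isEmpty()) counts.merge(token, 1, Integer::sum);
                            }
                            for (String link : page.links) {
                                if (seen.add(link)) {      // enqueue each URL at most once
                                    pending.incrementAndGet();
                                    frontier.add(link);
                                }
                            }
                        } finally {
                            pending.decrementAndGet();     // this URL is done
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counts;
    }

    /** Sorts words by descending count, breaking ties alphabetically. */
    static List<Map.Entry<String, Integer>> rank(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(counts.entrySet());
        ranked.sort((x, y) -> {
            int byCount = Integer.compare(y.getValue(), x.getValue());
            return byCount != 0 ? byCount : x.getKey().compareTo(y.getKey());
        });
        return ranked;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> ranked = rank(crawl("a", demoWeb(), 4));
        for (int i = 0; i < ranked.size(); i++) {
            int r = i + 1;
            int f = ranked.get(i).getValue();
            // Under Zipf's law, rank * frequency stays roughly constant on large corpora.
            System.out.println(r + "\t" + ranked.get(i).getKey() + "\t" + f + "\t" + (r * f));
        }
    }
}
```

The `pending` counter is one simple way to decide when workers may stop: it is incremented before a URL is enqueued and decremented after that URL is processed, so it only reaches zero once the frontier is truly exhausted. On a corpus this small the rank-times-frequency column is only illustrative; Zipf's law is an empirical regularity that shows up on large text collections.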