Hadoop tools for use with NetarchiveSuite

Originally forked from git://github.com/statsbiblioteket-hadoop-studygroup/Hadoop-Word-Count.git