hadoop-wordcount-eg

Hadoop WordCount example - Maven project

Hadoop with Filesystem as Input and Output

  1. Import the project into Eclipse using "Existing Maven Projects"
  2. Locate the class WordCountFileSystem
  3. Set inputPath to the directory that contains the files you want to analyze
  4. Run the class
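
For orientation, here is a minimal sketch of what a word count computes, written in plain Java without any Hadoop APIs. It is not the code of WordCountFileSystem; the class name WordCountSketch and its methods are hypothetical, and splitting on whitespace is an assumption about how words are tokenized. The per-line tokenizing plays the role of the map step and the per-word summing plays the role of the reduce step.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not part of this project.
public class WordCountSketch {

    // "Map" step: split a line on whitespace (an assumption).
    // "Reduce" step: add 1 to the running total of each word.
    public static void countLine(String line, Map<String, Integer> counts) {
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            counts.merge(tokens.nextToken(), 1, Integer::sum);
        }
    }

    // Counts the words of every regular file directly inside inputDir.
    // Files are read as UTF-8 text.
    public static Map<String, Integer> countDirectory(Path inputDir) throws IOException {
        List<Path> files;
        try (Stream<Path> entries = Files.list(inputDir)) {
            files = entries.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Path file : files) {
            for (String line : Files.readAllLines(file)) {
                countLine(line, counts);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // The input directory is taken from the first argument, defaulting to ".".
        Path inputDir = Paths.get(args.length > 0 ? args[0] : ".");
        countDirectory(inputDir).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Running the sketch against the same directory you set as inputPath gives a rough reference to compare with the job's output, as long as the job tokenizes words the same way.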

Hadoop with Filesystem as Input and Cassandra as Output

  1. Import the project into Eclipse using "Existing Maven Projects"
  2. Locate the class WordCountCassandraOutput
  3. Set INPUT_DIR_PATH to the directory that contains the files you want to analyze
  4. Download Cassandra and run it locally
  5. Create a Cassandra keyspace named cql3_worldcount
  6. Create a column family (table) in the keyspace from step 5 with the following statement: CREATE TABLE cql3_worldcount.output_words ( word text PRIMARY KEY, count_num text );
  7. Run the class
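
The steps above do not spell out the keyspace statement, so here is a small sketch that prints CQL you could paste into cqlsh. The class name CassandraSetupSketch is hypothetical, and the replication settings (SimpleStrategy with a replication factor of 1) are an assumption suited to a single local node. The keyspace and table names are copied as spelled in the steps. The INSERT only illustrates the shape of a row, with the count stored as text; it is not necessarily how WordCountCassandraOutput writes its results.

```java
// Hypothetical illustration only; not part of this project.
public class CassandraSetupSketch {

    static final String KEYSPACE = "cql3_worldcount";
    static final String TABLE = "output_words";

    // Assumption: a single local node, so SimpleStrategy with one replica.
    public static String createKeyspace() {
        return "CREATE KEYSPACE IF NOT EXISTS " + KEYSPACE
                + " WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};";
    }

    // Same columns as the CREATE TABLE statement in the steps above.
    public static String createTable() {
        return "CREATE TABLE IF NOT EXISTS " + KEYSPACE + "." + TABLE
                + " ( word text PRIMARY KEY, count_num text );";
    }

    // Example row. Single quotes inside the word are doubled, as CQL string
    // literals require. count_num is a text column, so the count is quoted.
    public static String insertRow(String word, int count) {
        String escaped = word.replace("'", "''");
        return "INSERT INTO " + KEYSPACE + "." + TABLE
                + " (word, count_num) VALUES ('" + escaped + "', '" + count + "');";
    }

    public static void main(String[] args) {
        System.out.println(createKeyspace());
        System.out.println(createTable());
        System.out.println(insertRow("hadoop", 3));
    }
}
```

After the job has run, a query such as SELECT * FROM cql3_worldcount.output_words; in cqlsh should show the stored words.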