Compile and Run Java MapReduce Programs on VirtualBox

  1. Start up your VM
  1. Write your driver source code using a text editor such as vi (or emacs):
      vi MaxTemperature.java
  1. Write your mapper and reducer source code:
      vi MaxTemperatureMapper.java
      vi MaxTemperatureReducer.java
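  1. Compile your Java code. First check the Java version and the Hadoop classpath, then compile the mapper and reducer. Compile the driver last, with :. appended to the classpath so that javac can find the mapper and reducer classes already compiled into the current directory:
      java -version
      yarn classpath
      javac -classpath `yarn classpath` -d . MaxTemperatureMapper.java
      javac -classpath `yarn classpath` -d . MaxTemperatureReducer.java
      javac -classpath `yarn classpath`:. -d . MaxTemperature.java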
  1. Create your jar file:
      jar -cvf maxTemp.jar *.class
  1. Create your input data file on the local file system:
      vi temperatureInputs.txt
  1. Put your input data file into HDFS:
      hdfs dfs -ls /
      hdfs dfs -ls /user
      hdfs dfs -ls /user/cloudera
      hdfs dfs -mkdir /user/cloudera/class1
      hdfs dfs -put temperatureInputs.txt /user/cloudera/class1
      hdfs dfs -cat /user/cloudera/class1/temperatureInputs.txt
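  1. Run your MapReduce program, passing the input file and an output directory (with the standard file output format, the output directory must not already exist or the job will fail):
      hadoop jar maxTemp.jar MaxTemperature /user/cloudera/class1/temperatureInputs.txt /user/cloudera/class1/output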
  1. Verify that the program ran and the results are correct:
      hdfs dfs -ls /user/cloudera/class1/output
      hdfs dfs -cat /user/cloudera/class1/output/part-r-00000
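
To check the results in the last step, it can help to compute the expected answer locally from the same input file. The following is a minimal sketch in plain Java (no Hadoop classes). It assumes that each line of temperatureInputs.txt holds a year and an integer temperature separated by whitespace. That format and the class name LocalMaxTemperatureCheck are assumptions made for illustration; the steps above do not prescribe them. The parsing plays the role of the map step, and keeping the per-year maximum plays the role of the reduce step.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Hadoop: computes the expected job output locally.
class LocalMaxTemperatureCheck {

    // ASSUMPTION: each input line is "<year> <temperature>", separated by whitespace,
    // with an integer temperature. Adjust the parsing to match your real file.
    static Map<String, Integer> maxByYear(List<String> lines) {
        // TreeMap keeps the years in sorted order, which makes comparison easier.
        Map<String, Integer> maxima = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            // "Map" step: turn the line into a (year, temperature) pair.
            String[] fields = trimmed.split("\\s+");
            String year = fields[0];
            int temperature = Integer.parseInt(fields[1]);
            // "Reduce" step: keep the largest temperature seen for each year.
            maxima.merge(year, temperature, Integer::max);
        }
        return maxima;
    }

    public static void main(String[] args) throws IOException {
        // Reads the local copy of the input file, e.g. temperatureInputs.txt.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (Map.Entry<String, Integer> entry : maxByYear(lines).entrySet()) {
            // Print the key and value separated by a tab, one pair per line.
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

Compile it with javac LocalMaxTemperatureCheck.java, run it with java LocalMaxTemperatureCheck temperatureInputs.txt, and compare the printed lines with the contents of part-r-00000. If your job writes its results with Hadoop's default text output format, each output line should likewise be a key and a value separated by a tab.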