This project is based on Hadoop and Hive.
If you have not set them up, please refer to the following instructions: Hadoop & Hive.
You can download our testing data from here.
The README.txt has a very detailed explanation of the properties of their data. We also describe the data in our writeup. Please replace the ::
in the given data set by -
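The delimiter replacement can be scripted with sed. This is only a minimal sketch, not necessarily how the data was originally prepared: the function name replace_delim is hypothetical, and the new delimiter is passed as an argument rather than assumed.

```shell
# Hypothetical helper: replace every "::" in a file with a new delimiter.
# Usage: replace_delim INPUT NEW_DELIM > OUTPUT
# The delimiter is placed as-is into a sed replacement, so it must not
# contain "|", "&", or "\" (those would need escaping first).
replace_delim() {
    sed "s|::|$2|g" "$1"
}
```

The function writes to standard output, so redirect it to a new file and keep the original download untouched.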
Change to the directory of the data file you just downloaded.
For me, it is:
$ cd ~/ml-1m
Start Hadoop, which is a prerequisite for running Hive.
Type the command:
$ hive -f extract.q
A directory "result" will appear. It stores the data we want to use. We have already provided this extracted data, called new_data.txt, in the source directory.
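After the extraction step, it can help to confirm that the "result" directory really contains data before going on. The helper below is a hypothetical sketch (the name check_result is ours, and it assumes the directory is created relative to where you ran hive); it only checks that at least one non-empty file exists.

```shell
# Hypothetical helper: succeed only if the directory exists and holds
# at least one non-empty regular file.
# Usage: check_result [DIR]   (DIR defaults to "result")
check_result() {
    dir="${1:-result}"
    [ -d "$dir" ] || { echo "missing directory: $dir" >&2; return 1; }
    # find lists non-empty regular files; grep -q . succeeds if any were listed
    find "$dir" -type f -size +0 | grep -q .
}
```

One possible use is: hive -f extract.q && check_result result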
Create a folder on HDFS to hold the data:
$ hadoop fs -mkdir /hadoop
Put the data on HDFS:
$ hadoop fs -copyFromLocal <path to the data> /hadoop
For me, it's:
$ hadoop fs -copyFromLocal ~/new_data.txt /hadoop
Run the jar:
$ hadoop jar ./Bayes.jar hw6.MultiMovieRecommender /hadoop/ /hadoop/temp /hadoop/output/
Check the output of the training:
$ hadoop fs -cat /hadoop/output/part*
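The training steps above can be collected into one function. This is a sketch under assumptions: the names run and train and the DRY_RUN variable are hypothetical, and the HDFS paths are the ones used in the steps above. With DRY_RUN=1 the commands are only printed, which lets you review them before touching HDFS.

```shell
# run() prints the command when DRY_RUN=1, otherwise executes it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: train LOCAL_DATA_FILE
# The && chain stops at the first failing step
# (for example, if /hadoop already exists on HDFS).
train() {
    data="$1"
    run hadoop fs -mkdir /hadoop &&
    run hadoop fs -copyFromLocal "$data" /hadoop &&
    run hadoop jar ./Bayes.jar hw6.MultiMovieRecommender /hadoop/ /hadoop/temp /hadoop/output/ &&
    # The pattern is quoted so that Hadoop, not the local shell, expands it.
    run hadoop fs -cat '/hadoop/output/part*'
}
```

For example, DRY_RUN=1 train ~/new_data.txt prints the four commands without running them.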
Make sure the output of the training data is in the directory /movie on HDFS.
Make sure that BayesHiveUDF.jar is in your current directory.
Run the command:
$ hive -f constructtrain.q
Run the command:
$ hive -f classification.q
To change the parameters, simply edit line 11 of classification.q.
The recommendation results are generated in the directory result/finalresult.
Our sample text result is in the source directory: test_result.txt
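The two prerequisites of the classification step can be checked automatically before the Hive scripts run. The wrapper below is a hypothetical sketch (the name classify is ours); it relies on hadoop fs -test -d, which exits with status 0 only when the path is an existing directory.

```shell
# Hypothetical wrapper around the classification steps.
# It stops early if a prerequisite is missing.
classify() {
    # BayesHiveUDF.jar must be in the current directory
    [ -f ./BayesHiveUDF.jar ] || { echo "BayesHiveUDF.jar not found in $(pwd)" >&2; return 1; }
    # The training output is expected under /movie on HDFS
    hadoop fs -test -d /movie || { echo "/movie not found on HDFS" >&2; return 1; }
    # Run the second script only if the first one succeeds
    hive -f constructtrain.q &&
    hive -f classification.q
}
```

Run classify from the directory that contains the two .q scripts and the jar.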
Thank you for using this project.
If you have any questions, feel free to contact us.
Email: cwang107@jhu.edu