BigDataMining-Analysis

movie recommendation

Primary Language: Python

Project for Big Data Mining and Analysis Course

Movie Rating Prediction Project

  • Dataset

    • Download from http://files.grouplens.org/datasets/movielens/ml-1m.zip.
    • The data set contains roughly 1 million ratings from about 6,000 users on about 4,000 movies.
    • We further sort the ratings by timestamp.
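
The loading and sorting step can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the `ratings.dat` file from the archive uses the `UserID::MovieID::Rating::Timestamp` line format, and the function name `load_ratings` is hypothetical.

```python
def load_ratings(path):
    """Parse a MovieLens-1M style ratings file, sorted by timestamp.

    Assumes each non-empty line looks like UserID::MovieID::Rating::Timestamp.
    Returns a list of (user, movie, rating, timestamp) tuples, oldest first.
    """
    ratings = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            user, movie, rating, timestamp = line.split("::")
            ratings.append((int(user), int(movie), float(rating), int(timestamp)))
    # Sort by the timestamp field so earlier ratings come first.
    ratings.sort(key=lambda row: row[3])
    return ratings
```

A call such as `load_ratings("ml-1m/ratings.dat")` would then return the ratings in chronological order; the path is an assumption about where the archive was unpacked.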
  • Recommendations:

    • Step 1:
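      • Baseline estimator: use the formula b_xi = μ + b_x + b_i given in the PDF, where μ is the overall mean rating, b_x is the rating deviation of user x, and b_i is the rating deviation of movie i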
    • Step 2:
      • Neighborhood estimator: use the neighborhood approach to predict the rating score
        • item-based similarity
        • user-based similarity
    • Incorporating Temporal Dynamics
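
Steps 1 and 2 above can be sketched in a few functions. This is an illustration under stated assumptions, not the project's implementation: the deviations b_x and b_i are taken as simple mean offsets from μ, similarity is cosine similarity on baseline residuals, and the names `fit_baseline`, `baseline`, `cosine`, and `predict_item_based` as well as the neighbourhood size `k` are hypothetical. Temporal dynamics are not covered.

```python
from collections import defaultdict
from math import sqrt

def fit_baseline(ratings):
    """ratings: tuples starting with (user, movie, rating). Returns (mu, b_user, b_item)."""
    ratings = list(ratings)
    mu = sum(r for _, _, r, *_ in ratings) / len(ratings)
    by_user, by_item = defaultdict(list), defaultdict(list)
    for u, i, r, *_ in ratings:
        by_user[u].append(r)
        by_item[i].append(r)
    # Assumption: each deviation is the user's (or movie's) mean rating minus mu.
    b_user = {u: sum(v) / len(v) - mu for u, v in by_user.items()}
    b_item = {i: sum(v) / len(v) - mu for i, v in by_item.items()}
    return mu, b_user, b_item

def baseline(mu, b_user, b_item, u, i):
    """b_xi = mu + b_x + b_i; unseen users or movies get a zero deviation."""
    return mu + b_user.get(u, 0.0) + b_item.get(i, 0.0)

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[key] * b[key] for key in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_item_based(ratings, u, i, k=2):
    """Baseline plus the similarity-weighted residuals of the k most similar movies u rated."""
    ratings = list(ratings)
    mu, b_user, b_item = fit_baseline(ratings)
    resid = defaultdict(dict)  # movie -> {user: rating minus baseline}
    for x, j, r, *_ in ratings:
        resid[j][x] = r - baseline(mu, b_user, b_item, x, j)
    b_ui = baseline(mu, b_user, b_item, u, i)
    target = resid.get(i, {})
    neighbours = []
    for j, vec in resid.items():
        if j != i and u in vec:
            s = cosine(target, vec)
            if s > 0:  # keep positively correlated movies only
                neighbours.append((s, vec[u]))
    top = sorted(neighbours, reverse=True)[:k]
    if not top:
        return b_ui  # no usable neighbours: fall back to the baseline
    return b_ui + sum(s * d for s, d in top) / sum(s for s, _ in top)
```

A user-based variant would follow the same pattern with the roles of users and movies swapped, comparing user residual vectors instead of movie residual vectors.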
  • K-means Clustering

    • Use the k-means algorithm to cluster the users based on the rating scores in the file ratings.dat.
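
The clustering step could look roughly like the sketch below. It is a plain Lloyd's-algorithm illustration under assumptions the source does not state: each user is represented by a dense vector of ratings with unrated movies filled with 0, distance is squared Euclidean, and the names `user_vectors` and `kmeans` and the parameters `iters` and `seed` are hypothetical.

```python
import random

def user_vectors(ratings):
    """Build one dense rating vector per user (assumption: unrated movies are 0)."""
    ratings = list(ratings)
    movies = sorted({m for _, m, *_ in ratings})
    column = {m: j for j, m in enumerate(movies)}
    vectors = {}
    for u, m, r, *_ in ratings:
        vectors.setdefault(u, [0.0] * len(movies))[column[m]] = r
    return vectors

def kmeans(points, k, iters=20, seed=0):
    """Cluster equal-length vectors into k groups. Returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids with k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

With around 6,000 users and 4,000 movies, a pure-Python loop like this is slow; it is meant to show the two alternating steps, and a vectorised library implementation would be the practical choice.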
  • SVD Dimensionality Reduction

    • Use the SVD algorithm to reduce the dimensionality
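
As a rough illustration of this step, the sketch below applies a truncated SVD with NumPy. The source does not say which matrix is reduced; the example assumes a dense user-by-movie rating matrix, and the names `truncated_svd` and `reduce_rows` and the target dimension `k` are hypothetical.

```python
import numpy as np

def truncated_svd(matrix, k):
    """Return the rank-k factors (U_k, s_k, Vt_k) of a dense matrix."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def reduce_rows(matrix, k):
    """Represent each row (e.g. each user) by k coordinates.

    U_k * s_k equals matrix @ Vt_k.T, i.e. the rows projected onto the
    top-k right singular vectors.
    """
    U_k, s_k, _ = truncated_svd(matrix, k)
    return U_k * s_k
```

The reduced rows can then stand in for the full rating vectors, for example as input to a clustering or similarity computation.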
  • Metrics

    • RMSE (root-mean-square error) between the predicted and actual ratings
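
For reference, RMSE can be computed as in the short sketch below; the function name `rmse` and its argument names are illustrative only.

```python
from math import sqrt

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences of ratings."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((p - a) ** 2 for p, a in zip(predicted, actual))
    return sqrt(sum(squared_errors) / len(predicted))
```

Lower values mean the predicted ratings are closer to the held-out true ratings; a perfect predictor scores 0.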
  • Project Implementation