Dataset: https://www.kaggle.com/datasets/arashnic/book-recommendation-dataset
Approach:
- I first filtered the data by considering only those users who have rated more than 200 books and only those books that have received more than 50 ratings.
- This step reduced noise and sparsity in the data by focusing on users and books that have a significant number of ratings.
- Next, I calculated similarity scores between the books using Euclidean distance as the similarity measure.
- This approach calculates the distance between the rating vectors of two books, which identifies books that users rated in a similar way.
- Finally, based on the similarity scores, I generated recommendations for a selected book.
- By taking the books with the highest similarity scores (that is, the smallest distances) to the selected book, I was able to recommend other books that users with similar tastes might also enjoy.
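
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`user_id`, `title`, `rating`), the helper names (`filter_ratings`, `build_book_matrix`, `recommend`), filling missing ratings with 0, and counting book ratings after the user filter are all assumptions. The toy data uses tiny thresholds so the example runs on a handful of rows; the approach described above uses 200 and 50.

```python
import numpy as np
import pandas as pd


def filter_ratings(ratings, min_user_ratings=200, min_book_ratings=50):
    """Keep users with more than min_user_ratings ratings, then
    books with more than min_book_ratings ratings among those users."""
    user_counts = ratings.groupby("user_id")["rating"].transform("count")
    active = ratings[user_counts > min_user_ratings]
    # Assumption: book counts are taken after the user filter.
    book_counts = active.groupby("title")["rating"].transform("count")
    return active[book_counts > min_book_ratings]


def build_book_matrix(ratings):
    """Rows are books, columns are users, values are ratings.
    Assumption: a missing rating is treated as 0."""
    return ratings.pivot_table(
        index="title", columns="user_id", values="rating", aggfunc="mean"
    ).fillna(0)


def recommend(book_matrix, title, k=5):
    """Return the k books closest to `title` by Euclidean distance."""
    target = book_matrix.loc[title].to_numpy()
    distances = np.linalg.norm(book_matrix.to_numpy() - target, axis=1)
    ranked = pd.Series(distances, index=book_matrix.index)
    # Drop the selected book itself (distance 0); smallest distance first.
    return ranked.drop(title).sort_values().head(k)


# Hypothetical toy data: three users who each rated three books.
toy = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "title": ["A", "B", "C"] * 3,
    "rating": [10, 9, 1, 8, 8, 2, 9, 10, 3],
})

filtered = filter_ratings(toy, min_user_ratings=2, min_book_ratings=2)
matrix = build_book_matrix(filtered)
nearest = recommend(matrix, "A", k=2)
print(nearest)
```

Because Euclidean distance is a distance rather than a similarity, the ranking is ascending: the book with the smallest distance to the selected book is the top recommendation.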