One of the key challenges in e-commerce is accurately calculating ratings for products based on post-purchase reviews. Solving this problem can lead to increased customer satisfaction on the e-commerce platform, product visibility for sellers, and a seamless shopping experience for buyers. Another challenge is sorting reviews for products accurately. The prominence of misleading reviews can directly impact product sales, resulting in financial losses and customer attrition. By addressing these two fundamental problems, e-commerce platforms and sellers can boost their sales while providing customers with a hassle-free shopping journey.
This dataset contains Amazon product data, including various metadata related to product categories. It focuses on the electronics category and includes user ratings and reviews for the most reviewed product.
Variables:
reviewerID: User ID
asin: Product ID
reviewerName: User name
helpful: Helpful rating score
reviewText: Review text
overall: Product rating
summary: Review summary
unixReviewTime: Review time (UNIX timestamp)
reviewTime: Review time (raw)
day_diff: Number of days since the review
helpful_yes: Number of users who found the review helpful
total_vote: Total number of votes for the review
Task 1: Calculate Weighted Average Rating Based on Recent Reviews and Compare with the Existing Average Rating
In this task, our goal is to evaluate the given ratings by assigning weights based on the review dates. We need to compare the initial average rating with the weighted average rating obtained.
Import the necessary libraries and read the dataset.
Convert the "reviewTime" column to datetime format and calculate the weighted average ratings for different time intervals.
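The time-based weighting can be sketched as follows. This is a minimal illustration on a toy DataFrame rather than the dataset itself. The interval boundaries (30, 90, and 180 days), the weights (0.28, 0.26, 0.24, 0.22), and the function name time_weighted_rating are assumptions made for the example; only the column names reviewTime, day_diff, and overall come from the variable list.

```python
import pandas as pd

# Toy data for illustration only; the real dataset would be read from its file.
df = pd.DataFrame({
    "reviewTime": ["2014-12-01", "2014-10-15", "2014-07-01", "2013-01-01"],
    "day_diff": [10, 60, 150, 700],
    "overall": [5.0, 4.0, 3.0, 1.0],
})

# Convert the raw review time to a proper datetime column.
df["reviewTime"] = pd.to_datetime(df["reviewTime"])


def time_weighted_rating(df, bounds=(30, 90, 180), weights=(0.28, 0.26, 0.24, 0.22)):
    """Average `overall` per recency bucket, then combine with decreasing weights.

    `bounds` and `weights` are assumed values, not taken from the dataset.
    Weights of empty buckets are dropped and the remaining ones renormalized.
    """
    edges = [float("-inf"), *bounds, float("inf")]
    total, used = 0.0, 0.0
    for lo, hi, w in zip(edges[:-1], edges[1:], weights):
        bucket = df.loc[(df["day_diff"] > lo) & (df["day_diff"] <= hi), "overall"]
        if not bucket.empty:
            total += w * bucket.mean()
            used += w
    return total / used if used else float("nan")


plain_average = df["overall"].mean()
weighted_average = time_weighted_rating(df)
print(f"plain: {plain_average:.2f}, time-weighted: {weighted_average:.2f}")
```

On this toy data the weighted average is higher than the plain mean, because the most recent review is also the most positive one. Comparing the two numbers in this way is the point of the task.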
To sort the reviews, first create the "helpful_no" variable by subtracting the "helpful_yes" count from the "total_vote" count.
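As a small sketch of this step, assuming the data is held in a pandas DataFrame (the toy values below are made up for illustration):

```python
import pandas as pd

# Toy vote counts for illustration only.
df = pd.DataFrame({
    "helpful_yes": [10, 0, 45],
    "total_vote": [12, 3, 45],
})

# Votes that did not mark the review as helpful.
df["helpful_no"] = df["total_vote"] - df["helpful_yes"]
print(df)
```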
Step 2: Calculate the "score_pos_neg_diff", "score_average_rating", and "wilson_lower_bound" Scores and Add Them to the Dataset
Calculate the scores and add them as new columns.
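The three scores could be implemented roughly as below. The text names the scores but does not define them, so this sketch assumes the usual definitions: the difference between helpful and unhelpful votes, the share of helpful votes, and the lower bound of the Wilson score interval for that share. The parameter names up and down and the default confidence of 0.95 are assumptions.

```python
import math
from statistics import NormalDist


def score_pos_neg_diff(up, down):
    """Difference between helpful and unhelpful votes."""
    return up - down


def score_average_rating(up, down):
    """Share of helpful votes; 0.0 when there are no votes."""
    total = up + down
    return up / total if total else 0.0


def wilson_lower_bound(up, down, confidence=0.95):
    """Lower bound of the Wilson score interval for the helpful-vote share."""
    n = up + down
    if n == 0:
        return 0.0
    # Two-sided z-value for the chosen confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    phat = up / n
    centre = phat + z * z / (2 * n)
    margin = z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)
    return (centre - margin) / (1 + z * z / n)


# Compare reviews with few and many votes.
for up, down in [(2, 0), (200, 0), (600, 400)]:
    print(up, down,
          score_pos_neg_diff(up, down),
          score_average_rating(up, down),
          round(wilson_lower_bound(up, down), 4))
```

A review with 2 helpful votes out of 2 and one with 200 out of 200 have the same helpful share, but the Wilson lower bound is much lower for the first because it rests on far less evidence. With a pandas DataFrame, each function could be applied row by row to "helpful_yes" and "helpful_no", for example with `df.apply(..., axis=1)`, to produce the new columns.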
Sort the dataset by the "wilson_lower_bound" score in descending order and select the top 20 reviews.
This will provide the top 20 reviews based on the Wilson Lower Bound score, which can be displayed on the product detail page.
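The sorting step might look like this in pandas. The DataFrame below is a toy stand-in with made-up review texts and scores; in practice the "wilson_lower_bound" column would already have been computed for the real dataset.

```python
import pandas as pd

# Toy scores for illustration only; real values would come from the scoring step.
df = pd.DataFrame({
    "reviewText": ["great card", "works fine", "stopped working", "ok"],
    "wilson_lower_bound": [0.957, 0.206, 0.913, 0.0],
})

# Highest lower bound first, keeping at most the top 20 rows.
top_reviews = df.sort_values("wilson_lower_bound", ascending=False).head(20)
print(top_reviews)
```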