Amazon_Vine_Analysis

The Amazon Vine program is a service that allows manufacturers and publishers to receive reviews for their products. We had access to approximately 50 datasets, each containing reviews for a specific product category, from clothing apparel to wireless products. We picked one of these, the video games dataset. We used PySpark to perform the ETL process: we extracted the dataset, transformed the data, connected to an AWS RDS instance, and loaded the transformed data into the database, which we accessed through pgAdmin. Next, we used Pandas to determine whether there is any bias toward favorable reviews from Vine members in the dataset. We summarized the analysis for Jennifer to submit to the SellBy stakeholders.



Resources used:

-Data source: Amazon review dataset
-Vine review dataset
-Colab notebook (access on request)

Results:

Vine reviews:

-Total number of Vine reviews is 94.
-Number of 5-star Vine reviews is 48.
-Percentage of 5-star Vine reviews is ~51.1% (48 / 94).

Non-Vine reviews:

-Total number of non-Vine reviews is 40471.
-Number of 5-star non-Vine reviews is 15663.
-Percentage of 5-star non-Vine reviews is ~38.7% (15663 / 40471).

All reviews:

-Total number of reviews (Vine and non-Vine) is 40565.
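The counts above can be reproduced with a short Pandas aggregation. The following is a minimal sketch, not the notebook's actual code: it assumes the reviews are in a DataFrame with a `vine` column holding `"Y"` or `"N"` and a numeric `star_rating` column, the helper name `five_star_summary` is hypothetical, and the toy rows are made up for illustration only.

```python
import pandas as pd


def five_star_summary(reviews: pd.DataFrame) -> pd.DataFrame:
    """Count total and 5-star reviews per Vine status and compute the percentage."""
    # Flag each review that received exactly five stars.
    flagged = reviews.assign(five_star=reviews["star_rating"] == 5)
    # One row per Vine status ("Y" / "N") with the two counts.
    summary = flagged.groupby("vine")["five_star"].agg(
        total_reviews="count",
        five_star_reviews="sum",
    )
    # Share of 5-star reviews in each group, as a percentage.
    summary["five_star_pct"] = (
        100 * summary["five_star_reviews"] / summary["total_reviews"]
    ).round(1)
    return summary


# Toy data for illustration only; this is NOT the real review dataset.
toy = pd.DataFrame(
    {
        "vine": ["Y", "Y", "N", "N", "N", "N"],
        "star_rating": [5, 3, 5, 4, 1, 2],
    }
)
print(five_star_summary(toy))
```

Running the same function on the full review table would give one row for Vine reviews and one for non-Vine reviews, which can be compared directly.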

Summary:

We used Pandas to determine if there is any bias in reviews that were written as part of the Vine program. For this analysis, we checked whether being a paid Vine review makes a difference in the percentage of 5-star reviews. About 51% of Vine reviews are 5-star, compared with about 39% of non-Vine reviews. This gap of roughly 12 percentage points suggests that Vine members are biased toward favorable reviews. Additional statistical analysis, such as comparing the mean, median, and mode of the star ratings for the two groups, would help support this result.
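One way to follow up on the additional analysis mentioned above is a two-proportion z-test on the reported counts. This test was not part of the original analysis; the sketch below is an illustration that uses only the Python standard library, and the function name `two_proportion_z_test` is hypothetical. It assumes the reviews are independent, and the Vine group is small (94 reviews), so the result should be read with caution.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    # Pooled proportion under the null hypothesis that both groups share one rate.
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts taken from the Results section: 5-star and total reviews per group.
z, p_value = two_proportion_z_test(48, 94, 15663, 40471)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value here would indicate that the difference in 5-star rates is unlikely to be due to chance alone, although it does not by itself explain why Vine reviews are more favorable.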