For this project we used the AWS ecosystem to manage the Amazon reviews dataset (~50 GB) and run sentiment analysis on the reviews.
Amazon EMR, Amazon Athena, Amazon S3, Apache Pig, Apache Spark.
Parquet, Avro, CSV, JSON.
PigLatin