
Mirror of the GitLab repository. A data mapper from Amazon reviews to tweets, built with Scala and Apache Spark.

Primary language: Scala. License: GNU Affero General Public License v3.0 (AGPL-3.0).

AR2T

AmazonReviews2Tweets (AR2T) extracts tweets related to Amazon reviews: it parses Amazon review datasets and fetches tweets about the reviewed products. The overall motivation is to integrate third-party data into recommender systems.

See src/main/resources/application.conf for configuration settings. I plan to release a Dockerized version in the future.

Technologies Used

  • Scala, along with some Java libraries where I couldn't find good Scala counterparts
  • Apache Spark
  • Twitter4s, an excellent Twitter API wrapper written in Scala
  • (upcoming) Postgres+JDBC

Twitter API Integration

As per Twitter4s's documentation, expose the following environment variables at runtime:

export TWITTER_CONSUMER_TOKEN_KEY='my-consumer-key'
export TWITTER_CONSUMER_TOKEN_SECRET='my-consumer-secret'
export TWITTER_ACCESS_TOKEN_KEY='my-access-key'
export TWITTER_ACCESS_TOKEN_SECRET='my-access-secret'
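
Since a missing credential only shows up once the Twitter client is created, it can help to check the environment before starting a job. The following is a minimal sketch of such a check using only the Scala standard library. The object name TwitterEnvCheck and the method missingVars are hypothetical and not part of AR2T or Twitter4s; only the four variable names come from the export lines above.

```scala
// Hypothetical helper (not part of AR2T or Twitter4s): checks that the
// Twitter credentials listed above are present in the environment.
object TwitterEnvCheck {
  // Variable names copied from the export lines in this README.
  val requiredVars: Seq[String] = Seq(
    "TWITTER_CONSUMER_TOKEN_KEY",
    "TWITTER_CONSUMER_TOKEN_SECRET",
    "TWITTER_ACCESS_TOKEN_KEY",
    "TWITTER_ACCESS_TOKEN_SECRET"
  )

  // Returns the names of required variables that are unset or blank in `env`.
  def missingVars(env: Map[String, String]): Seq[String] =
    requiredVars.filterNot(name => env.get(name).exists(_.trim.nonEmpty))

  def main(args: Array[String]): Unit = {
    val missing = missingVars(sys.env)
    if (missing.nonEmpty) {
      // Report only the variable names, never their values.
      Console.err.println(s"Missing Twitter credentials: ${missing.mkString(", ")}")
      sys.exit(1)
    }
    println("All Twitter credentials are set.")
  }
}
```

Passing the environment in as a plain Map keeps missingVars easy to test without touching the real process environment; main simply hands it sys.env.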