Python, Spark Submit, Airflow, AWS S3, AWS EMR

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

spark-submit-airflow-aws

Tech:

Python, PySpark, Airflow, AWS S3, AWS EMR

Movie review classifier

  1. Clean the input data.
  2. Use a pre-trained model to make predictions.
  3. Write the predictions to an HDFS output.
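
The three steps above can be sketched as a small PySpark job. This is a minimal illustration under stated assumptions, not the repository's actual script: the `review` column name, the CSV input format, the Parquet output format, the cleaning rules, and the idea that the pre-trained model is a saved Spark ML `PipelineModel` reading a `review_clean` column are all hypothetical, as are the names `clean_text` and `run_job`.

```python
import re


def clean_text(review: str) -> str:
    """Lowercase a review, strip HTML tags and punctuation, collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", review)
    letters_only = re.sub(r"[^a-z\s]", " ", no_tags.lower())
    return " ".join(letters_only.split())


def run_job(input_path: str, model_path: str, output_path: str) -> None:
    """Clean reviews, score them with a saved model, and write the predictions."""
    # Imported lazily so clean_text can be used without a Spark installation.
    from pyspark.ml import PipelineModel
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.appName("movie-review-classifier").getOrCreate()

    # Step 1: clean the input data (assumes a CSV file with a `review` column).
    reviews = spark.read.csv(input_path, header=True)
    clean_udf = udf(clean_text, StringType())
    cleaned = reviews.withColumn("review_clean", clean_udf(col("review")))

    # Step 2: apply the pre-trained model (assumed to be a saved PipelineModel
    # whose first stage reads the `review_clean` column).
    model = PipelineModel.load(model_path)
    predictions = model.transform(cleaned)

    # Step 3: write the predictions to the output location (Parquet is an assumption).
    (predictions.select("review", "prediction")
        .write.mode("overwrite")
        .parquet(output_path))

    spark.stop()
```

A script like this would typically end with a call to `run_job(...)` using paths passed on the command line, and would be launched with `spark-submit` on the cluster. The output path can be an HDFS location (or an S3 location on EMR), depending on how the job is configured.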