/divith-raju-pipeline-hadoop-pyspark

This project presents a comprehensive data pipeline designed to predict customer churn using historical customer data. By leveraging Hadoop and PySpark, this pipeline efficiently processes large datasets, performs feature engineering, and trains a machine learning model to identify customers at risk of leaving.

Primary LanguagePython

Watchers