This project uses Apache Spark and MongoDB to develop a predictive maintenance model for detecting equipment failures based on sensor data.
spark_app.py: The Spark application that builds the predictive model.
predictive_maintenance_spark_mlib.ipynb: Notebook for testing the model.
load_data.py: The script to load sensor data into MongoDB.
sensor_data.csv: The CSV file containing sensor data.
Dockerfile: Docker setup for running Spark with the MongoDB connector.
docker-compose.yml: Docker Compose file to run the Spark and MongoDB containers.
- Start MongoDB and Spark using Docker Compose:
    docker-compose up
- Load data into MongoDB:
    docker run -it --network spark_mongo_network -v $(pwd):/app python:3.8-slim python /app/load_data.py
- Run the Spark application. First open a shell in the Spark container, then submit the job from inside it:
    docker exec -it spark1 bash
    spark-submit /opt/bitnami/spark/app/spark_app.py
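
For readers who want a picture of what a Spark job of this kind involves, the sketch below reads a MongoDB collection and fits a classifier with Spark MLlib. It is an illustration under several assumptions, not the contents of spark_app.py: it assumes the MongoDB Spark Connector 10.x option names (the "mongodb" format with connection.uri, database, collection), a random forest as the model, and a numeric 0/1 label column. The function names and all column names passed in are hypothetical.

```python
def mongo_read_options(uri, database, collection):
    """Build read options for the MongoDB Spark Connector (10.x naming assumed)."""
    return {"connection.uri": uri, "database": database, "collection": collection}


def train_failure_model(spark, options, feature_cols, label_col):
    """Fit a classifier on sensor readings from MongoDB; return (model, AUC)."""
    # Imported lazily so the option helper can be used without pyspark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import RandomForestClassifier
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.ml.feature import VectorAssembler

    # Load the collection as a DataFrame through the connector.
    df = spark.read.format("mongodb").options(**options).load()

    # Hold out 20% of the rows for evaluation.
    train, test = df.randomSplit([0.8, 0.2], seed=42)

    # Pack the numeric sensor columns into one feature vector, then classify.
    assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
    classifier = RandomForestClassifier(labelCol=label_col, featuresCol="features")
    model = Pipeline(stages=[assembler, classifier]).fit(train)

    # Area under the ROC curve on the held-out rows (label must be 0/1).
    evaluator = BinaryClassificationEvaluator(labelCol=label_col)
    return model, evaluator.evaluate(model.transform(test))
```

The SparkSession passed in must have the connector on its classpath, which is what the Dockerfile in this project is described as providing. An older 3.x connector uses the "mongo" format and different option keys, so the helper would need adjusting in that case.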