Here is a PySpark project I did at university. It runs an ETL pipeline over flight data in CSV format to prepare it for prediction. The task is to predict whether a given aircraft will need unscheduled maintenance.
alvaro-budria/Predictive-Analytics-Using-PySpark
Python · Apache-2.0
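To give a rough idea of the kind of ETL step described in the post, here is a minimal sketch in plain Python. It uses only the standard library as a stand-in for the Spark DataFrame operations the project itself relies on, so it is not the repository's code. The column names (`aircraft_id`, `date`, `flight_hours`, `unscheduled_maintenance`), the sample rows, and the `extract`/`transform`/`load` helpers are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV contents; the real project reads its own files.
RAW_CSV = """aircraft_id,date,flight_hours,unscheduled_maintenance
XY-AAA,2020-01-01,5.5,0
XY-AAA,2020-01-01,2.0,0
XY-BBB,2020-01-01,7.0,1
XY-AAA,2020-01-02,4.0,1
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: aggregate rows per (aircraft, day) and derive a binary label."""
    totals = defaultdict(lambda: {"flight_hours": 0.0, "label": 0})
    for row in rows:
        key = (row["aircraft_id"], row["date"])
        totals[key]["flight_hours"] += float(row["flight_hours"])
        # Label is 1 if any row for that aircraft and day is flagged.
        flagged = int(row["unscheduled_maintenance"])
        totals[key]["label"] = max(totals[key]["label"], flagged)
    return [
        {"aircraft_id": aircraft, "date": day, **values}
        for (aircraft, day), values in sorted(totals.items())
    ]


def load(records):
    """Load: serialize the prepared records to CSV text for a model to consume."""
    out = io.StringIO()
    fields = ["aircraft_id", "date", "flight_hours", "label"]
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


prepared = transform(extract(RAW_CSV))
print(load(prepared))
```

In a Spark version, the same three stages would presumably map onto reading the CSV into a DataFrame, grouping and aggregating by aircraft and day, and writing the labeled result out for training.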