This is a classification problem solved with logistic regression, implemented in PySpark.
- The data is available at https://www.kaggle.com/code/pynirav/hr-logistic-regression/data. It is an HR dataset with columns covering things like sales, salary, bonus, and average time spent in the company, including:
'average_montly_hours',
'time_spend_company',
'Work_accident',
'left',
'promotion_last_5years',
'sales'
We use this data to fit a logistic regression model.
This project applies logistic regression to the HR data to predict the 'left' column of the dataset and reports the accuracy of those predictions.
PySpark tools used: VectorAssembler, StringIndexer, BinaryClassification
The final accuracy is 0.8132792648006996 (about 81.3%).
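For reference, accuracy is the fraction of rows where the predicted label matches the true label. The short sketch below shows that calculation on made-up labels. The `accuracy` helper and the sample values are hypothetical and are not taken from the project.

```python
def accuracy(labels, predictions):
    """Return the share of predictions that equal the true labels."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("at least one label is required")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Made-up example: 1 means the employee left, 0 means they stayed.
true_left = [1, 0, 1, 1]
predicted_left = [1, 0, 0, 1]
score = accuracy(true_left, predicted_left)
print(score)
```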