This repository contains a Jupyter Notebook implementing a regression analysis and exploratory data analysis (EDA) for predicting house prices. The analysis includes the following steps:
Data Loading and Preprocessing:
Loading house price data from a CSV file. Cleaning and organizing the data using Pandas. Feature Engineering and Imputation:
Handling missing values and imputing using the mean strategy. Feature engineering, such as polynomial transformation for feature expansion. Data Visualization:
Exploratory data analysis using histograms and bar plots for selected columns. Scatter plots with custom legends to visualize relationships between different distances. Data Scaling and Standardization:
Scaling features using StandardScaler. Correlation Analysis:
Computing and visualizing the correlation matrix. Regression Modeling:
Implementing Ridge Regression with cross-validated alpha selection. Evaluating model performance using mean absolute error (MAE). Model Serialization:
Saving the best-performing Ridge Regression model, including alpha, weights, bias, and MAE, using Pickle.