Study-on-Multiple-Linear-Regression-analysis

In this study, the data for multiple linear regression analysis is taken from the SuperCon database. The assumptions of multiple linear regression were examined first: normality, linearity, and the absence of extreme values and missing values. A regression model was then fit to the data and evaluated by its adjusted R-squared, which was below 0.9. A plot of the correlation matrix showed that many features are correlated with one another, so Principal Component Analysis (PCA) and factor analysis were applied to reduce the dimensionality, but the resulting accuracy was poor (0.671). To address heteroscedasticity (non-constant variance), the Box-Cox method was used, which gave a better adjusted R-squared (approx. 0.93) than the initial data (0.869).
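As a rough illustration of the last step, the sketch below shows how a Box-Cox transform can raise adjusted R-squared. It is not the code used in this study. It uses small synthetic data rather than SuperCon, a single predictor, a fixed lambda of 0 (the log case), and it assumes the transform is applied to the response variable. The function names `box_cox`, `r_squared_simple`, and `adjusted_r_squared` are hypothetical helpers written for this example.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a fixed lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def r_squared_simple(x, y):
    """R-squared of an ordinary least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Synthetic data (an assumption for illustration): the response grows
# exponentially with the predictor, so a straight line fits it poorly.
x = [float(i) for i in range(1, 11)]
y = [math.exp(0.5 * xi) for xi in x]

adj_raw = adjusted_r_squared(r_squared_simple(x, y), len(x), 1)
adj_bc = adjusted_r_squared(r_squared_simple(x, box_cox(y, 0)), len(x), 1)

print(f"adjusted R-squared, raw response:     {adj_raw:.3f}")
print(f"adjusted R-squared, Box-Cox response: {adj_bc:.3f}")
```

In practice lambda is usually estimated from the data rather than fixed, and the model has many predictors, so the size of the improvement will differ from this toy case.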