This data set contains information from the Ames Assessor’s Office used in computing assessed values for individual residential properties sold in Ames, IA from 2006 to 2010.
We'll use supervised learning to develop a regression model that predicts housing sale price. We'll also group the houses by clustering the data.
The dataset has 1451 samples and 80 attributes: 23 nominal, 23 ordinal, 14 discrete, and 20 continuous variables. The SalePrice attribute is the target; it is a continuous value.
One solution is to develop a linear regression model and to cluster the data into different groups.
A good naive benchmark is to predict the mean or median of SalePrice for every house.
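The benchmark above amounts to predicting one constant for every house. A minimal sketch, assuming made-up prices in place of the real SalePrice column; the helper name baseline_predictions is hypothetical:

```python
from statistics import mean, median

# Hypothetical sale prices in USD; the real values would come from SalePrice.
sale_price = [95_000, 130_000, 160_000, 190_000, 425_000]

mean_baseline = mean(sale_price)      # pulled upward by the expensive house
median_baseline = median(sale_price)  # less sensitive to outliers


def baseline_predictions(constant, n):
    """Predict the same constant price for all n houses."""
    return [constant] * n


predictions = baseline_predictions(median_baseline, len(sale_price))
```

Any trained model should beat these constant predictions on held-out data; if it does not, it has learned little from the attributes.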
We can calculate the coefficient of determination (R²) or the mean squared error (MSE) to quantify our model’s performance.
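Both metrics are simple to compute by hand. The sketch below uses plain Python and made-up true and predicted prices; the function names are chosen here to mirror the equivalents in scikit-learn's sklearn.metrics module, which would normally be used instead.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)             # total
    return 1 - ss_res / ss_tot


# Hypothetical true and predicted sale prices in USD.
y_true = [100_000, 150_000, 200_000, 250_000]
y_pred = [110_000, 140_000, 210_000, 240_000]

mse = mean_squared_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
```

MSE is expressed in squared dollars, so lower is better. R² is unitless: 1 means perfect predictions, and 0 means the model does no better than always predicting the mean of the true values.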