California_Housing_Price_Prediction

Business Problem:

To predict the prices of houses in California based on their specifications and locations.

Description:

The dataset is built from the 1990 California census data. It contains one row per census block group. A block group is the smallest geographical unit for which the U.S. Census Bureau publishes sample data (a block group typically has a population of 600 to 3,000 people).

The variables were collected for all the block groups in California from the 1990 Census. In this sample, a block group includes 1,425.5 individuals on average, living in a geographically compact area. Naturally, the geographical area covered varies inversely with population density. Distances were computed among the centroids of each block group, measured in latitude and longitude, and all the districts reporting zero entries for the independent and dependent variables were excluded. The final data contained 20,640 observations on 9 variables. The dependent variable is ln(median house value). The columns in the dataset are as follows:

longitude: A measure of how far west a house is; longitudes in California are negative, so a lower (more negative) value is farther west

latitude: A measure of how far north a house is; a higher value is farther north

housing_Median_Age: Median age of a house within a block; a lower number is a newer building

total_Rooms: Total number of rooms within a block

total_Bedrooms: Total number of bedrooms within a block

population: Total number of people residing within a block

households: Total number of households (a household is a group of people residing within a home unit) for a block

median_Income: Median income for households within a block of houses (measured in tens of thousands of US Dollars)

median_House_Value: Median house value for households within a block (measured in US Dollars)

ocean_Proximity: Location of the house with respect to the ocean/sea
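
As a rough illustration of how these columns might be prepared for modelling, the sketch below takes the natural log of median_House_Value (the dependent variable mentioned in the description) and one-hot encodes ocean_Proximity. It is only a minimal sketch under assumptions: the column names follow the spelling used in this README and may differ from the actual file, the helper name `prepare` is hypothetical, and the two rows (including the category labels) are made-up values for demonstration, not records from the dataset.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration only; these are NOT real records from the dataset.
# Column names follow this README's spelling and are an assumption about the file.
sample = pd.DataFrame({
    "longitude": [-122.2, -118.3],
    "latitude": [37.9, 34.1],
    "median_Income": [8.3, 3.1],                  # tens of thousands of US Dollars
    "median_House_Value": [450000.0, 200000.0],   # US Dollars
    "ocean_Proximity": ["NEAR BAY", "INLAND"],    # illustrative category labels
})


def prepare(df):
    """Hypothetical helper: add a log-transformed target and one-hot encode ocean_Proximity."""
    out = df.copy()
    # Dependent variable as described above: ln(median house value).
    out["log_Median_House_Value"] = np.log(out["median_House_Value"])
    # Replace the text column with one indicator column per category.
    out = pd.get_dummies(out, columns=["ocean_Proximity"])
    return out


prepared = prepare(sample)
print(prepared.columns.tolist())
```

With the real data, `sample` would be replaced by the DataFrame loaded from the dataset file, and the column names adjusted to match it.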