This dataset is a modified version of the California Housing dataset, built using the 1990 California census data. It contains one row per census block group.
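The original dataset contains 9 numeric variables collected from all the block groups in California in the 1990 Census. This modified version adds a categorical ocean_proximity column, for 10 columns in total. The dependent variable is the median house value (median_house_value).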
This dataset is used in the book "Hands-On Machine Learning" by Aurélien Géron to demonstrate a sample end-to-end machine learning project workflow. Source: https://github.com/ageron/handson-ml2/tree/master/datasets/housing
This repository contains machine learning projects using the California Housing dataset. The projects include:
The DataFrame has 10 columns and 20640 rows. The columns are:
- longitude: The longitude of the census block group.
- latitude: The latitude of the census block group.
- housing_median_age: The median age of the housing units in the census block group.
- total_rooms: The total number of rooms across all housing units in the census block group.
- total_bedrooms: The total number of bedrooms across all housing units in the census block group.
- population: The population of the census block group.
- households: The number of households in the census block group.
- median_income: The median income of households in the census block group, in tens of thousands of US dollars.
- median_house_value: The median value of housing units in the census block group, in US dollars. This is the dependent variable.
- ocean_proximity: The proximity of the census block group to the ocean (categorical variable with the values <1H OCEAN, INLAND, NEAR OCEAN, NEAR BAY, and ISLAND).
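
The column list above can be checked against a copy of the data with a short script. The following is a minimal sketch, not the book's own loading code. It uses only the Python standard library and assumes the data is a CSV file whose header matches the column names listed above. The helper name `summarize_housing` and the two sample rows are made up for illustration and are not real records.

```python
import csv
import io

# Column names as listed above; the CSV header is assumed to use these names.
EXPECTED_COLUMNS = [
    "longitude", "latitude", "housing_median_age", "total_rooms",
    "total_bedrooms", "population", "households", "median_income",
    "median_house_value", "ocean_proximity",
]
TARGET = "median_house_value"  # the dependent variable


def summarize_housing(stream):
    """Hypothetical helper: check the header, count rows and empty cells."""
    reader = csv.DictReader(stream)
    if reader.fieldnames is None or set(reader.fieldnames) != set(EXPECTED_COLUMNS):
        raise ValueError(f"unexpected columns: {reader.fieldnames}")

    n_rows = 0
    missing = {name: 0 for name in EXPECTED_COLUMNS}
    targets = []
    for row in reader:
        n_rows += 1
        for name in EXPECTED_COLUMNS:
            value = row[name]
            # DictReader fills absent trailing fields with None.
            if value is None or value.strip() == "":
                missing[name] += 1
        target = row[TARGET]
        if target is not None and target.strip() != "":
            targets.append(float(target))
    return {"rows": n_rows, "missing": missing, "targets": targets}


# Illustrative rows only (invented values, one empty total_bedrooms cell).
SAMPLE = (
    "longitude,latitude,housing_median_age,total_rooms,total_bedrooms,"
    "population,households,median_income,median_house_value,ocean_proximity\n"
    "-122.0,37.5,30.0,1000.0,200.0,500.0,180.0,4.5,250000.0,NEAR BAY\n"
    "-118.0,34.0,20.0,2000.0,,900.0,350.0,3.0,180000.0,INLAND\n"
)

summary = summarize_housing(io.StringIO(SAMPLE))
print(summary["rows"], summary["missing"]["total_bedrooms"])
```

To run the same check on the real file, pass an open file handle instead of the in-memory sample, for example `open("housing.csv", newline="")`, where the file name is an assumption about where the CSV was saved. If the file matches the description above, `summary["rows"]` should equal 20640.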