Analysing House Prices using Descriptive Statistics

Welcome to the Descriptive Statistics Project! In this project, you will demonstrate what you have learned in this course by analysing a dataset of house prices.

In the in-class session, we saw how descriptive statistics help us understand a dataset. We learnt about:

  • Measures of centrality
  • Measures of spread
  • Correlation

You can use any numpy and matplotlib objects for the purpose of this exercise.
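As a reminder of how the three measures above can be computed, here is a minimal sketch using numpy. The values and the variable names (sale_price, living_area) are made up for illustration and are not taken from the House Prices dataset.

```python
import numpy as np

# Hypothetical values for illustration only; not from the House Prices dataset.
sale_price = np.array([120000, 150000, 155000, 180000, 210000, 450000], dtype=float)
living_area = np.array([850, 1100, 1150, 1400, 1600, 3200], dtype=float)

# Measures of centrality
mean_price = np.mean(sale_price)
median_price = np.median(sale_price)

# Measures of spread
std_price = np.std(sale_price)                      # population standard deviation (ddof=0)
price_range = sale_price.max() - sale_price.min()   # range = max - min

# Correlation: Pearson coefficient between the two variables
corr = np.corrcoef(sale_price, living_area)[0, 1]

print("mean:", mean_price)
print("median:", median_price)
print("std:", std_price)
print("range:", price_range)
print("correlation:", corr)
```

In this made-up sample the single expensive house pulls the mean well above the median, which is one reason to look at more than one measure of centrality.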

Dataset

For this exercise, we will use the House Prices dataset, which we have already discussed in the session. It contains the SalePrice of around 1,400 houses and is a subset of a larger dataset. You can read the dataset description here.

Why solve this assignment?

Solving this assignment will help you:

  • Learn to summarise huge datasets using a small number of parameters

  • Learn how correlation can be an important tool for understanding data, but can also lead to spurious insights

For the assignment, we will use the following packages:

  • matplotlib
  • numpy
  • pandas

By completing this project you have an opportunity to win 250 points!

Let's get started!