NYC SALE PRICE PREDICTION

Goal: In this project, I build machine learning models that predict the sale price of properties in NYC. The project is divided into the following sections:

  • EDA
  • Data Preparation
  • Modelling

Task 1: Read the dataset and perform basic data exploration: check for and deal with missing values, duplicate entries, outliers, etc.

Identification of variables and data types

For this part of the EDA, I identify the types of variables in the dataset. It is important in an EDA to know what kind of variables you are dealing with. There are two types of variables:

  • Categorical
  • Numerical
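One way to split the columns into these two groups is to go by pandas dtype. The sketch below is a minimal illustration on a hypothetical toy frame; the column names and values are assumptions for the example, not the project's actual schema.

```python
import pandas as pd

# Hypothetical toy rows; column names and values are illustrative assumptions.
df = pd.DataFrame({
    "BOROUGH": ["Manhattan", "Brooklyn", "Queens"],
    "BUILDING CLASS CATEGORY": ["01 ONE FAMILY", "02 TWO FAMILY", "01 ONE FAMILY"],
    "GROSS SQUARE FEET": [1200.0, 2400.0, 1800.0],
    "YEAR BUILT": [1920, 1955, 2001],
    "SALE PRICE": [950000.0, 1200000.0, 780000.0],
})

# Numeric dtypes (int, float) are treated as numerical variables.
numerical_cols = df.select_dtypes(include="number").columns.tolist()

# Everything else (here: text columns) is treated as categorical.
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

print("Numerical:", numerical_cols)
print("Categorical:", categorical_cols)
```

Note that dtype alone can mislead: a numerically coded column (for example a zip code or a borough ID) is stored as a number but is categorical in meaning, so the resulting lists are worth reviewing by hand.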

Missing values and data cleaning

  • For this step of the project, I check for missing values, the correlation between missing values across columns, and the pattern of missing values in the dataset.
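The checks described above could look roughly like the following sketch. It uses plain pandas on a hypothetical toy frame (the column names and values are assumptions) and reads "missing correlation" as the correlation between the per-column missingness indicators, which is one plausible interpretation.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the last row is an exact duplicate of the first.
df = pd.DataFrame({
    "LAND SQUARE FEET": [2000.0, np.nan, 1500.0, np.nan, 2000.0],
    "GROSS SQUARE FEET": [1800.0, np.nan, 1400.0, 2200.0, 1800.0],
    "SALE PRICE": [500000.0, 750000.0, np.nan, 620000.0, 500000.0],
})

# Count fully duplicated rows, then drop them.
n_duplicates = int(df.duplicated().sum())
df = df.drop_duplicates()

# Number of missing values per column.
missing_counts = df.isna().sum()

# Correlation between missingness indicators (1 = missing, 0 = present).
# A high value means two columns tend to be missing in the same rows.
missing_corr = df.isna().astype(int).corr()

print("Duplicate rows:", n_duplicates)
print(missing_counts)
print(missing_corr)
```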

Task 2: Data exploration using data visualization.

  • Raise two questions that can be answered by performing data visualization.

  • Briefly mention why you think each question would be interesting, and to whom (who is your audience).

  • Think about the EDA principles.

Task 3: Feature engineering: transforming features (e.g., categorical features) and selecting the important features.

  • Suppose we would like to predict the house sale price.

  • Analyze the scale of each attribute and determine which ones you would transform (e.g., categorical features).

  • Discuss how you plan to select important features.
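As a hedged illustration of the two steps above, the sketch below one-hot encodes a categorical column and then ranks features by random forest importance. This is one possible approach, not necessarily the one used in the notebook; the data is synthetic and every column name is an assumption.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: the price depends on square footage only, by construction.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "BOROUGH": rng.choice(["Manhattan", "Brooklyn", "Queens"], size=n),
    "GROSS SQUARE FEET": rng.uniform(500, 5000, size=n),
    "NOISE": rng.normal(size=n),  # deliberately irrelevant feature
})
df["SALE PRICE"] = 300 * df["GROSS SQUARE FEET"] + rng.normal(0, 10000, size=n)

# Transform the categorical feature into one-hot (dummy) columns.
X = pd.get_dummies(df.drop(columns="SALE PRICE"), columns=["BOROUGH"])
y = df["SALE PRICE"]

# Fit a random forest and rank the features by impurity-based importance.
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)

print(importances)
```

Impurity-based importances are a quick first pass; they can be biased toward high-cardinality features, so alternatives such as permutation importance are worth considering on real data.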

Task 4: Build several models to predict selling prices

  • Random Forest Model

  • Linear Regression Model

  • KNN Model

  • SVR Model

  • Decision Tree Regressor Model

  • Gradient Boosting Regressor Model

  • AdaBoost Regressor Model

  • XGB Regressor Model
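A comparison of the models listed above could be sketched as follows. This is a minimal example on synthetic data from make_regression, not the project's dataset or results. The hyperparameters are arbitrary assumptions, and the XGBoost regressor is left out only because it lives in the separate xgboost package; it could be added to the dictionary in the same way.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    AdaBoostRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data standing in for the prepared feature matrix.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# KNN and SVR are distance-based, so they are wrapped with a scaler.
models = {
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "Linear Regression": LinearRegression(),
    "KNN": make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5)),
    "SVR": make_pipeline(StandardScaler(), SVR()),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
    "AdaBoost": AdaBoostRegressor(random_state=0),
}

# Mean 5-fold cross-validated R^2 for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in models.items()
}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean R^2 = {score:.3f}")
```

Evaluating every model with the same cross-validation folds and metric keeps the comparison fair; for sale prices, an error metric such as RMSE may be more interpretable than R^2.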

Results

[Image: results]