Checklist
Cleaning and Processing of Data
- Import libraries
- Load dataset
- Check for missing values
- Drop unnecessary features
- Rename features
- Reorder features
- Change data types
  - Change flight_date to "datetime" in yyyy-mm-dd format
  - Change flight_dep and flight_arr to "datetime" in HH:MM format
  - Change total_stops to "int" from categorical values
  - Change flight_time to "datetime" in HH:MM format
  - Change flight_fare to "int"
- Fix repeated values in some features
  - Fix feature airline_name, e.g., "Air Asia" to "AirAsia"
  - Fix features flight_dep and flight_arr, e.g., "BOM" to "Mumbai"
- Encode features for data analysis
  - Extract months and days from flight_date
  - Extract hours and mins from dep_time and arr_time
  - Extract hours and mins from flight_time
  - Convert airline_name to numerical data
  - Convert flight_dep to numerical data
  - Convert flight_arr to numerical data
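
The type-conversion, value-fixing, and encoding steps above could look roughly like the sketch below. It assumes pandas, treats dep_time and arr_time as the HH:MM time columns and flight_dep and flight_arr as city columns, and uses invented sample rows. The total_stops labels, the fare format with a thousands separator, the derived column names, and the helper `clean_flights` are all hypothetical, not taken from the project.

```python
import pandas as pd

# Hypothetical sample rows: column names follow the checklist, values are invented.
raw = pd.DataFrame({
    "airline_name": ["Air Asia", "AirAsia", "IndiGo"],
    "flight_date": ["2023-01-15", "2023-02-03", "2023-02-03"],
    "flight_dep": ["BOM", "Mumbai", "Delhi"],
    "flight_arr": ["Delhi", "Delhi", "BOM"],
    "dep_time": ["06:30", "18:05", "09:45"],
    "arr_time": ["08:45", "20:20", "12:00"],
    "total_stops": ["non-stop", "1-stop", "non-stop"],
    "flight_time": ["02:15", "02:15", "02:15"],
    "flight_fare": ["5,955", "6,100", "4,800"],
})

# Assumed category labels for total_stops; the real dataset may use others.
STOPS = {"non-stop": 0, "1-stop": 1, "2+-stop": 2}


def clean_flights(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative cleaning pass; returns a new DataFrame."""
    df = df.copy()

    # Fix repeated values: collapse spelling variants into one label.
    df["airline_name"] = df["airline_name"].replace({"Air Asia": "AirAsia"})
    for col in ("flight_dep", "flight_arr"):
        df[col] = df[col].replace({"BOM": "Mumbai"})

    # Change data types.
    df["flight_date"] = pd.to_datetime(df["flight_date"], format="%Y-%m-%d")
    df["total_stops"] = df["total_stops"].map(STOPS).astype(int)
    df["flight_fare"] = (
        df["flight_fare"].str.replace(",", "", regex=False).astype(int)
    )

    # Encode: split the date and the HH:MM columns into numeric parts.
    df["month"] = df["flight_date"].dt.month
    df["day"] = df["flight_date"].dt.day
    for col, prefix in (
        ("dep_time", "dep"),
        ("arr_time", "arr"),
        ("flight_time", "duration"),
    ):
        parsed = pd.to_datetime(df[col], format="%H:%M")
        df[f"{prefix}_hour"] = parsed.dt.hour
        df[f"{prefix}_min"] = parsed.dt.minute

    # Encode: turn the categorical text columns into integer codes.
    for col in ("airline_name", "flight_dep", "flight_arr"):
        df[f"{col}_code"] = df[col].astype("category").cat.codes

    return df


clean = clean_flights(raw)
```

Integer category codes are only one way to make the text columns numerical; one-hot encoding may suit models that would otherwise read an order into the codes.
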
Exploratory Data Analysis
- Consider simplifying feature names, e.g., flight_fare to price:
  - Change flight_fare to price
  - Change flight_time to duration
  - Change airline_name to airline
  - Change flight_dep to dep
  - Change flight_arr to arr
- Create test data
  - Split data
  - Feature selection for prediction
- Hypertune model
  - Using Random Search Cross-Validation
  - Using Grid Search Cross-Validation
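
As a rough illustration of the renaming, splitting, and tuning items above, the sketch below uses pandas and scikit-learn on synthetic data. The checklist does not name a model, so `RandomForestRegressor` is an assumption here, as are the parameter grids, the 80/20 split, and the randomly generated stand-in dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import (
    GridSearchCV,
    RandomizedSearchCV,
    train_test_split,
)

# Synthetic stand-in for the cleaned, already-encoded dataset (values are random).
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "airline_name": rng.integers(0, 5, n),
    "flight_dep": rng.integers(0, 4, n),
    "flight_arr": rng.integers(0, 4, n),
    "total_stops": rng.integers(0, 3, n),
    "flight_time": rng.integers(60, 600, n),  # assumed to be minutes
})
data["flight_fare"] = (
    3000 + 1500 * data["total_stops"] + 5 * data["flight_time"]
    + rng.normal(0, 200, n)
)

# Simplify feature names as listed in the checklist.
data = data.rename(columns={
    "flight_fare": "price",
    "flight_time": "duration",
    "airline_name": "airline",
    "flight_dep": "dep",
    "flight_arr": "arr",
})

# Feature selection and train/test split.
features = ["airline", "dep", "arr", "total_stops", "duration"]
X_train, X_test, y_train, y_test = train_test_split(
    data[features], data["price"], test_size=0.2, random_state=42
)

# Assumed model; the checklist does not say which regressor is tuned.
model = RandomForestRegressor(random_state=42)

# Random search: samples n_iter combinations from the distributions.
random_search = RandomizedSearchCV(
    model,
    param_distributions={
        "n_estimators": [20, 50, 100],
        "max_depth": [None, 5, 10],
        "min_samples_split": [2, 5, 10],
    },
    n_iter=5,
    cv=3,
    random_state=42,
)
random_search.fit(X_train, y_train)

# Grid search: tries every combination in a (smaller) grid.
grid_search = GridSearchCV(
    model,
    param_grid={"n_estimators": [20, 50], "max_depth": [5, 10]},
    cv=3,
)
grid_search.fit(X_train, y_train)

# R^2 of the best grid-search model on the held-out test split.
test_r2 = grid_search.best_estimator_.score(X_test, y_test)
```

A common pattern is to run the random search over a wide range first and then centre a narrower grid search on its best parameters, available as `random_search.best_params_`.
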
Price Prediction using Machine Learning