EDA-and-prediction-models: A Jupyter Notebook repository from pykira-cpu

Objective:

Congratulations,

We have landed on Mars and are trying to colonize it. The main problem we face is deciding which plants to grow in different geographical regions so that the maximum number of plants survive.

Fortunately, we have data from Earth to give us insights.

Please help us predict which kind of plant we can grow at different locations, depending on the environment.

It’s time to save humanity, which is on the brink of extinction.

You are our only hope for survival.

Problem Statement

Create a report on key insights derived from Exploratory Data Analysis.
Create a multi-class prediction model to predict the species of plant that will survive in the neighborhood of a given environment.
Create key segments for all the plants (train + test) based on the average sunlight received throughout the day and their distance from a water body, to identify which segments of plants are getting enough sunlight and water and which are not. This will help in mobilizing resources to track the growth of trees appropriately.
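The segmentation task could be approached in several ways. Below is a minimal sketch that assumes a simple median split on two features. The column names `avg_sunlight` and `dist_to_water` are hypothetical (the real names come from the data dictionary, which is shared separately), the data is synthetic, and the median threshold is only one possible definition of "enough" sunlight or water.

```python
import numpy as np
import pandas as pd


def segment_plants(df, sun_col="avg_sunlight", water_col="dist_to_water"):
    """Label each row with a sunlight/water segment using median splits.

    Column names are hypothetical placeholders, not taken from the dataset.
    """
    out = df.copy()
    # "Enough" sunlight: at or above the median average sunlight.
    sun_ok = out[sun_col] >= out[sun_col].median()
    # "Enough" water: at or below the median distance (closer is better).
    water_ok = out[water_col] <= out[water_col].median()
    out["segment"] = np.select(
        [sun_ok & water_ok, sun_ok & ~water_ok, ~sun_ok & water_ok],
        ["sun+water", "sun only", "water only"],
        default="neither",
    )
    return out


# Synthetic stand-in for the stacked train + test rows.
rng = np.random.default_rng(0)
plants = pd.DataFrame({
    "avg_sunlight": rng.uniform(0, 255, size=1000),
    "dist_to_water": rng.uniform(0, 1500, size=1000),
})

segmented = segment_plants(plants)
segment_counts = segmented["segment"].value_counts()
print(segment_counts)
```

If the dataset stores sunlight as several readings per day, they would first need to be averaged into a single column. A clustering method such as k-means is an alternative when fixed thresholds are too coarse.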

Key Requirements and Directions

Hello Challengers,

Required sections in the Jupyter Notebook -

1. Exploratory Data Analysis
2. Data Preprocessing
   2A. Handling outliers (imputation, removal)
3. Data Engineering
4. Data Preparation for Predictive Modeling
5. Classification Model Predictions (at least 3 different predictive models) with hyperparameter tuning
6. Comparison of models using performance KPIs and training and testing time
7. Final predictive model recommendation
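The outlier handling, hyperparameter tuning, and model comparison steps listed above can be sketched as follows. This is an illustration under stated assumptions, not the notebook's actual pipeline: the data is synthetic (`make_classification`), the three models and their small grids are arbitrary choices, outliers are capped with an IQR rule rather than imputed or removed, and accuracy and macro F1 stand in for whatever KPIs the final report uses.

```python
import time

import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier


def iqr_bounds(X, k=1.5):
    """Per-column capping bounds: [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = X.quantile(0.25), X.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Synthetic 3-class data standing in for the plant dataset (assumption).
X_raw, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, n_classes=3, random_state=0
)
X = pd.DataFrame(X_raw, columns=[f"f{i}" for i in range(X_raw.shape[1])])
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Bounds are learned on the training split only, then applied to both splits.
lo, hi = iqr_bounds(X_train)
X_train = X_train.clip(lower=lo, upper=hi, axis=1)
X_test = X_test.clip(lower=lo, upper=hi, axis=1)

# Three candidate models, each with a deliberately small grid.
candidates = {
    "logreg": (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0]}),
    "tree": (DecisionTreeClassifier(random_state=0), {"max_depth": [4, 8]}),
    "forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50], "max_depth": [4, 8]},
    ),
}

rows = []
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=3, scoring="f1_macro")

    t0 = time.perf_counter()
    search.fit(X_train, y_train)  # training time includes the grid search
    train_s = time.perf_counter() - t0

    t0 = time.perf_counter()
    pred = search.predict(X_test)
    test_s = time.perf_counter() - t0

    rows.append({
        "model": name,
        "best_params": search.best_params_,
        "accuracy": accuracy_score(y_test, pred),
        "macro_f1": f1_score(y_test, pred, average="macro"),
        "train_s": train_s,
        "test_s": test_s,
    })

results = pd.DataFrame(rows).sort_values("macro_f1", ascending=False)
print(results)
```

Sorting by macro F1 gives one candidate ranking for the final recommendation. On the full training set, the timing columns matter more, since grid search cost grows with both the grid size and the number of rows.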

Dataset

Data size: test 116203 rows * 13 columns; train 464809 rows * 13 columns
Target variable: Plant_Type
Data dictionary: shared separately
