Machine Learning - Exoplanet Exploration

Background

Over nine years in deep space, NASA's Kepler space telescope carried out a planet-hunting mission to discover hidden planets outside our solar system.

To help process this data, we need to create machine learning models capable of classifying candidate exoplanets from the raw dataset.

Preprocess the raw data
Tune the models
Compare two or more models

Instructions

Preprocess the Data

Preprocess the dataset prior to fitting the model.
Perform feature selection and remove unnecessary features.
Use MinMaxScaler to scale the numerical data.
Separate the data into training and testing data.
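
The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the project's actual notebook: the data is synthetic, the column names (`feature_0` ... `feature_5`, `label`) are hypothetical stand-ins for the real Kepler columns, and `SelectKBest` with `k=3` is just one possible feature-selection method.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the raw table; column names are hypothetical.
rng = np.random.default_rng(42)
df = pd.DataFrame(rng.normal(size=(200, 6)),
                  columns=[f"feature_{i}" for i in range(6)])
df["label"] = np.where(df["feature_0"] + df["feature_1"] > 0,
                       "CANDIDATE", "FALSE POSITIVE")

X = df.drop(columns="label")
y = df["label"]

# Split first so the scaler and selector only ever see training data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

# Scale every feature to the [0, 1] range seen in the training set.
scaler = MinMaxScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Keep the k features with the highest ANOVA F-score against the label.
selector = SelectKBest(score_func=f_classif, k=3).fit(X_train_scaled, y_train)
X_train_sel = selector.transform(X_train_scaled)
X_test_sel = selector.transform(X_test_scaled)

selected = X.columns[selector.get_support()].tolist()
print("Selected features:", selected)
```

The sketch splits before scaling and selecting, which differs from the order listed above. Fitting the scaler and selector on the training split only is a common way to keep information from the test set out of the model.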

Tune Model Parameters

Use GridSearchCV to tune model parameters.
Tune and compare at least two different classifiers.
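
One way to tune and compare two classifiers is shown below. It is a sketch under assumptions: the data comes from `make_classification` rather than the Kepler dataset, and the choice of `LogisticRegression` and `SVC`, along with the parameter grids, is illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic binary classification data standing in for the real dataset.
X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Each candidate is an estimator plus the grid to search for it.
# The "clf__" prefix routes a parameter to the "clf" step of the pipeline.
candidates = {
    "logistic_regression": (LogisticRegression(max_iter=1000),
                            {"clf__C": [0.1, 1, 10]}),
    "svc": (SVC(),
            {"clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]}),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Scaling lives inside the pipeline so it is refit on every CV fold.
    pipe = Pipeline([("scale", MinMaxScaler()), ("clf", estimator)])
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "cv_score": search.best_score_,
        "test_score": search.score(X_test, y_test),
    }

for name, r in results.items():
    print(f"{name}: cv={r['cv_score']:.3f} test={r['test_score']:.3f} "
          f"params={r['best_params']}")
```

Putting the scaler in a `Pipeline` means each cross-validation fold is scaled using only its own training portion, so the reported scores are not inflated by leakage.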

Reporting

Provide a comparison of each model's performance, along with a summary of your findings and any assumptions made about the models. Is your model good enough to predict new exoplanets? Why or why not? What would make your model better at predicting new exoplanets?
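
As a starting point for the comparison, per-class precision, recall, and F1 can be collected for each model. The sketch below assumes synthetic data and two arbitrarily chosen classifiers; neither is prescribed by this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the preprocessed Kepler features.
X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=1)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=1),
}

reports = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Keep a dict version for side-by-side comparison later.
    reports[name] = classification_report(y_test, y_pred, output_dict=True)
    print(name)
    print(confusion_matrix(y_test, y_pred))
    print(classification_report(y_test, y_pred))
```

Accuracy alone can hide poor recall on a rare class, so the per-class rows of the report are usually the more useful basis for judging whether a model is good enough.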

Resources

Considerations

Clean the data, remove unnecessary columns, and scale the data.
Not all variables are significant, so remove any that are insignificant.
Make sure your sklearn package is up to date.
Try a simple model first, and then tune the model using GridSearchCV.
When tuning hyperparameters, some models have parameters that depend on each other, and certain combinations will not create a valid model. Be sure to read through any warning messages and check the documentation.
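
To illustrate the point about dependent parameters: `GridSearchCV` accepts a list of grids, so each grid can contain only combinations that are valid together. The sketch below uses `LinearSVC` as an example, since it does not support `penalty="l1"` together with `loss="hinge"`. The data is synthetic and the grid values are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200, n_features=8, n_informative=4,
                           random_state=0)
X = MinMaxScaler().fit_transform(X)

# A single dict would cross every value with every other value, including
# unsupported pairs such as penalty="l1" with loss="hinge". A list of dicts
# searches each sub-grid separately, so only valid combinations are tried.
param_grid = [
    {"penalty": ["l2"], "loss": ["hinge"], "dual": [True], "C": [0.1, 1]},
    {"penalty": ["l1"], "loss": ["squared_hinge"], "dual": [False],
     "C": [0.1, 1]},
]

search = GridSearchCV(LinearSVC(max_iter=10000), param_grid, cv=3)
search.fit(X, y)

print("Combinations tried:", len(search.cv_results_["params"]))
print("Best parameters:", search.best_params_)
```

With the default `error_score`, an invalid combination does not stop the search. It produces a warning and a NaN score, which is why the warning messages are worth reading.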

RUBALBHULLAR/Exoplanet-Exploration