/asteroid-impact-predictions

Group Project using Supervised Learning and Neural Network Models


DU - University of Denver
Data Analysis & Visualization Bootcamp


Group Project 4:
Asteroid Impact Prediction using
Supervised Learning and Neural Network Models

By M. Aparisio, H. Heer, M. Smith & L. Vara



Note: To use this code, keep all files in a directory structure that matches this repository. Otherwise, you will have to adjust the file paths in the notebooks accordingly.

This repository contains a team project that explores whether asteroid impacts can be predicted using machine learning models.


INDEX

  1. Content of the repository
  2. Guide to the Project
  3. References

Content of the repository

  • Original_Datasets:
    • impacts.csv
    • orbits.csv
  • h5_Files
    • Asteroid_Impact_Model.h5
    • Asteroid_Impact_Optimization_Model.h5
  • Asteroid definitions.pptx <-- PowerPoint presentation introducing the project and defining the columns in the dataset.
  • Asteroid_Predictions.ipynb <-- Jupyter Notebook used for data cleanup prior to training the Neural Network model.
  • Asteroid_Predictions_Colab.ipynb <-- Google Colab notebook used after cleanup to train the Neural Network model, prior to optimization.
  • Asteroid_Predictions_Optimization_Colab.ipynb <-- Google Colab notebook with the optimized version of the trained Neural Network model.
  • asteroid-impact-prediction-SL-CFM.ipynb <-- Jupyter Notebook for Supervised Learning (SL) with unbalanced data.
  • asteroid_impact-prediction-SL-OverSample.ipynb <-- Jupyter Notebook for SL with oversampling of the data.
  • asteroid-impact-prediction-SL-UnderSample.ipynb <-- Jupyter Notebook for SL with undersampling of the data.
  • cleaned_Asteroid_orbit.csv <-- CSV file created in Jupyter Notebook after the data was cleaned, prior to creating the ML models (NN model version).
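
Because the notebooks depend on this layout, a quick check like the one below can confirm that the data and model files are where the listing says they are. This is only a sketch: the helper name `missing_files` is hypothetical, and it assumes that `Original_Datasets` and `h5_Files` are sub-directories of the repository root, as the listing suggests.

```python
from pathlib import Path

# Relative paths taken from the listing above. Treating the two indented
# groups as sub-directories of the repository root is an assumption.
EXPECTED_FILES = [
    "Original_Datasets/impacts.csv",
    "Original_Datasets/orbits.csv",
    "h5_Files/Asteroid_Impact_Model.h5",
    "h5_Files/Asteroid_Impact_Optimization_Model.h5",
    "cleaned_Asteroid_orbit.csv",
]


def missing_files(root="."):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# Report anything missing relative to the current working directory.
for rel in missing_files("."):
    print(f"Missing: {rel}")
```

If nothing is printed, the files are in place and the paths inside the notebooks should not need adjusting.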

Guide to the Project

Guidelines for the Project

  1. Collaborate with our team to pool knowledge and share ideas
  2. Outline a scope and purpose for our project, utilizing our machine learning skills to analyze, solve, or visualize our findings
  3. Find reliable data to use for our project, being mindful of copyrights, licenses, and terms of use
  4. Track in Jupyter Notebook all processes used for cleanup and all techniques used for data analysis
  5. Present our findings to the class on Presentation Day, with each member of our group taking turns speaking
  6. Submit the URL of our GitHub repository to be graded
  7. Graduate and attain employment using the knowledge acquired in this class

Processes used

  1. Reading the CSV files
  2. Cleaning the data
  3. Normalizing and stabilizing the data
  4. Splitting the data
  5. Training the Machine Learning models
  6. Implementing the Neural Network model
  7. Creating a separate Jupyter notebook with the same cleanup process to test the Supervised Learning model
  8. Implementing the Supervised Learning model
  9. Building the Confusion Matrix and visualizations
  10. Comparing observations and searching for improved accuracy for each model
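
Steps 3 and 4 of the list above can be sketched as follows. This is a minimal illustration, not the code from the notebooks: it uses only the Python standard library, assumes standardization (zero mean, unit variance) as the normalization method, and runs on made-up toy values. The function names and the toy features are hypothetical.

```python
import random
import statistics


def split_rows(rows, labels, test_fraction=0.25, seed=42):
    """Shuffle row indices, then split features and labels into train and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return (
        [rows[i] for i in train_idx],
        [rows[i] for i in test_idx],
        [labels[i] for i in train_idx],
        [labels[i] for i in test_idx],
    )


def fit_scaler(train_rows):
    """Compute per-column mean and standard deviation from the training rows only."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 so constant columns do not cause a division by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def apply_scaler(rows, means, stds):
    """Standardize every value using the statistics learned from the training set."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical toy data: two made-up numeric features per object, label 1 = positive class.
rows = [[0.1, 20.0], [0.4, 22.5], [0.9, 18.0], [0.3, 25.0],
        [0.7, 21.0], [0.2, 19.5], [0.6, 23.0], [0.5, 24.0]]
labels = [0, 0, 1, 0, 1, 0, 0, 1]

X_train, X_test, y_train, y_test = split_rows(rows, labels)
means, stds = fit_scaler(X_train)
X_train_scaled = apply_scaler(X_train, means, stds)
X_test_scaled = apply_scaler(X_test, means, stds)
```

The scaler is fitted on the training rows only and then applied to the test rows, so no information from the test set leaks into training.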

Accuracy for the Neural Network Model (Pre-optimization and Optimized results)

NN_model_AccuracyComparison

Accuracy for the Supervised Learning Model

  1. Low precision and recall due to imbalance of data classes

SL_model_Unbalanced
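
To illustrate why class imbalance can hurt precision and recall even when accuracy looks high, the sketch below computes all three metrics from confusion-matrix counts. The labels and predictions are made-up toy values, not results from this project, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn


def scores(y_true, y_pred):
    """Compute accuracy, precision, and recall; undefined ratios are reported as 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy example: 95 negatives and 5 positives. The hypothetical model
# raises one false alarm and finds only one of the five positives.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 94 + [1] + [1] + [0] * 4

accuracy, precision, recall = scores(y_true, y_pred)
print(accuracy, precision, recall)  # 0.95 0.5 0.2
```

A model that mostly predicts the majority class scores 95% accuracy here while missing four of the five positive cases, which is why the confusion matrix is worth inspecting alongside accuracy.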

  1. Results when oversampling the data

SL_model_OverSampling
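
Random oversampling can be sketched as below: minority-class rows are duplicated at random until every class is as large as the biggest one. This is a standard-library illustration on toy data; the notebook may rely on a dedicated library instead, and the function name `random_oversample` is hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw extra copies with replacement from the existing rows of this class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data: 8 rows of class 0 and 2 rows of class 1.
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))  # both classes now have 8 rows
```

Oversampling is normally applied to the training split only, so that duplicated rows do not also appear in the test set.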

  1. Results when undersampling the data

SL_model_UnderSampling
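
Random undersampling works in the opposite direction: rows of the larger classes are discarded at random until every class is as small as the smallest one. The sketch below uses only the standard library and toy data; the function name `random_undersample` is hypothetical and the notebook may use a different implementation.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class so that all classes match the smallest one."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample without replacement, so no row is kept twice.
        for row in rng.sample(members, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data: 8 rows of class 0 and 2 rows of class 1.
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

reduced_rows, reduced_labels = random_undersample(rows, labels)
print(Counter(reduced_labels))  # both classes now have 2 rows
```

The trade-off is that undersampling throws away majority-class data, which matters most when the minority class is very small.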


References

References for the data source(s):

References for the column definitions:

References for code:

Image Resources: