Group Project 4:
Asteroid Impact Prediction using
Supervised Learning and Neural Network Models
By M. Aparisio, H. Heer, M. Smith & L. Vara
Note: To use this code, keep all files in a directory structure that matches this repository. Otherwise, you will need to adjust the file paths in the notebooks accordingly.
This repository contains a team project that explores the predictability of asteroid impacts using machine learning models.
- Content of the repository
- Instructions for the Project
- References
- Original_Datasets:
  - impacts.csv
  - orbits.csv
- h5_Files:
  - Asteroid_Impact_Model.h5
  - Asteroid_Impact_Optimization_Model.h5
- Asteroid definitions.pptx <-- PowerPoint presentation introducing the project and defining the columns in the dataset.
- Asteroid_Predictions.ipynb <-- Jupyter Notebook used for data cleanup prior to training the Neural Network ML model.
- Asteroid_Predictions_Colab.ipynb <-- Google Colab notebook used after cleanup to train our Neural Network ML model, prior to optimization.
- Asteroid_Predictions_Optimization_Colab.ipynb <-- Google Colab notebook with the optimized version of our trained Neural Network ML model.
- asteroid-impact-prediction-SL-CFM.ipynb <-- Jupyter Notebook for Supervised Learning (SL) with unbalanced data.
- asteroid_impact-prediction-SL-OverSample.ipynb <-- Jupyter Notebook for SL with oversampling of the data.
- asteroid-impact-prediction-SL-UnderSample.ipynb <-- Jupyter Notebook for SL with undersampling of the data.
- cleaned_Asteroid_orbit.csv <-- CSV file created in Jupyter Notebook after the data was cleaned, prior to creating the ML models (NN model version).
- Collaborate with our team to pool knowledge and share ideas
- Outline a scope and purpose for our project, utilizing our machine learning skills to analyze, solve, or visualize our findings
- Find reliable data to use for our project, being mindful of copyrights, licenses, or terms of use
- Track all processes used for cleanup in Jupyter Notebook, along with the techniques used for data analysis
- Present our findings to the class on Presentation Day, with each member of our group taking turns speaking
- Submit the URL of our GitHub repository to be graded
- Graduate and attain employment by applying the knowledge acquired in this class
- Reading the csv files
- Cleaning the data
- Normalizing and stabilizing the data
- Splitting the data
- Training the Machine Learning models
- Neural Network model implementation
- Created a separate Jupyter notebook with the same cleanup process to test the Supervised Learning model
- Supervised Learning model implementation
- Confusion Matrix and Visualization
- Compared observations and searched for improved accuracy for each model.
- Low precision and recall due to imbalance of data classes
- Results when oversampling the data
- Results when undersampling the data
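The effect of class imbalance on precision and recall, and what oversampling and undersampling do to the class counts, can be illustrated with a minimal sketch. It uses only the Python standard library and made-up labels (95 "no impact" rows, 5 "impact" rows). The helper names `precision_recall`, `random_oversample`, and `random_undersample` are hypothetical and are not taken from the project notebooks, which may rely on a dedicated library instead.

```python
import random
from collections import Counter


def precision_recall(y_true, y_pred, positive=1):
    """Compute precision and recall for the positive class from raw labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def _group_by_class(rows, labels):
    """Collect feature rows into one list per class label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_oversample(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all match the largest."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = max(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows of larger classes until all match the smallest."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = min(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        out_rows.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Made-up, heavily imbalanced labels: 95 "no impact" (0) and 5 "impact" (1).
y_true = [0] * 95 + [1] * 5
rows = [[float(i)] for i in range(len(y_true))]  # placeholder feature rows

# A model that always predicts the majority class still scores high accuracy,
# while precision and recall for the rare class are both zero.
y_pred = [0] * len(y_true)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall = precision_recall(y_true, y_pred)
print(accuracy, precision, recall)

# Both resampling strategies end with equal class counts.
over_rows, over_labels = random_oversample(rows, y_true)
under_rows, under_labels = random_undersample(rows, y_true)
print(Counter(over_labels), Counter(under_labels))
```

Resampling is usually applied to the training split only, so that the test set keeps the original class distribution and the reported metrics stay representative.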
References for the data source(s):
- Datasets for this project: https://www.kaggle.com/datasets/nasa/asteroid-impacts
References for the column definitions:
- https://cneos.jpl.nasa.gov/about/neo_groups.html#:~:text=The%20vast%20majority%20of%20NEOs,%2Dmajor%20axes%20(%20a%20).
- https://howthingsfly.si.edu/ask-an-explainer/what-orbit-eccentricity
- https://en.wikipedia.org/wiki/Orbital_inclination
- https://astronomy.swin.edu.au/cosmos/A/Argument+Of+Perihelion
- https://cneos.jpl.nasa.gov/glossary/
- https://www.britannica.com/science/mean-anomaly
- https://en.wikipedia.org/wiki/Minimum_orbit_intersection_distance#:~:text=Minimum%20orbit%20intersection%20distance%20(MOID,collision%20risks%20between%20astronomical%20objects.
References for code:
- Uploading a CSV file to Google Colab:
- Using the strip() method for white spaces:
- Confusion Matrix Visualization:
- Using Keras for Machine Learning:
- Learning Rate Scheduler:
  - https://machinelearningmastery.com/using-learning-rate-schedules-deep-learning-models-python-keras/
  - https://keras.io/api/callbacks/learning_rate_scheduler/
  - https://d2l.ai/chapter_optimization/lr-scheduler.html
  - https://stackoverflow.com/questions/61981929/how-to-change-the-learning-rate-based-on-the-previous-epoch-accuracy-using-keras
  - https://neptune.ai/blog/how-to-choose-a-learning-rate-scheduler
- Validation_Split function:
- Activation Functions:
- Optimizers:
- Callbacks:
- Saving and Loading Models:
  - https://colab.research.google.com/github/agungsantoso/deep-learning-v2-pytorch/blob/master/intro-to-pytorch/Part%206%20-%20Saving%20and%20Loading%20Models.ipynb
  - https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/keras/save_and_load.ipynb
  - https://stackoverflow.com/questions/64808087/how-do-i-save-files-from-google-colab-to-google-drive
  - https://stackoverflow.com/questions/46986398/import-data-into-google-colaboratory
Image Resources:
- ReadMe image was taken from:
- Introduction and Definition of Features in the DataSet slide images:
  - https://pixabay.com/illustrations/asteroid-space-stars-meteor-1477065/
  - https://pixabay.com/illustrations/armageddon-apocalypse-earth-2104385/
  - https://en.wikipedia.org/wiki/Orbital_eccentricity
  - https://www.sciencedirect.com/topics/physics-and-astronomy/true-anomaly
  - https://www.researchgate.net/figure/Minimum-Orbital-Intersection-Distance_fig7_36174303
  - https://pixabay.com/illustrations/asteroid-planet-land-space-span-4376113/