/NASA-Astronauts-Analysis

Explore the captivating world of NASA astronauts through data analysis. Unveil the remarkable stories and achievements of space pioneers. This project uses Python and SQL to clean, analyze, and visualize astronaut data, revealing insights into demographics, careers, and space missions.

NASA Astronauts Analysis Project

Project Overview: This repository contains the code and documentation for an exploratory analysis of NASA astronauts' data. The project delves into the rich tapestry of individual stories and collective triumphs of astronauts who have ventured into space. By leveraging data analysis techniques, we aim to unveil insights into astronauts' demographics, careers, and accomplishments, shedding light on their remarkable journeys.

Tools Used:

  • Python: Python was chosen for its versatility in data analysis, data cleaning, and visualization. Libraries such as Pandas, NumPy, and Matplotlib make it a powerful choice.
  • SQL: SQL (Structured Query Language) complements Python by facilitating data manipulation and querying, especially when dealing with large datasets.

Steps Involved:

  1. Data Collection: The first step was to obtain the dataset containing details about NASA astronauts, including their names, birthdates, education, military history, and space missions.

  2. Data Cleaning: Data cleaning involved:

    • Filtering out rows with missing values in the 'Death_Mission' and 'Death_Date' columns, as these rows lacked mission information.
    • Removing rows with missing values in the 'Alma_Mater' column to ensure a complete educational background.
    • Converting 'Birth_Date' and 'Death_Date' columns to datetime format for consistent date handling.
  3. Exploratory Data Analysis (EDA): We employed Python's data analysis tools to perform EDA. Visualizations, descriptive statistics, and data summarization were used to understand astronaut demographics, mission details, and more.

  4. Data Visualization: Visualization played a crucial role in conveying insights. Matplotlib was employed to create informative charts and graphs, making patterns and trends more accessible.

  5. SQL Database Integration: The cleaned dataset was exported and integrated into a MySQL database for more complex querying capabilities.

  6. Advanced Analysis: SQL allowed for complex analysis of the dataset, such as identifying astronauts with specific criteria, determining the distribution of astronauts by various factors, and calculating averages and other metrics.

  7. Insight Generation: Through the combination of Python and SQL, we systematically uncovered key insights about NASA astronauts. These insights include demographics, career trends, military backgrounds, and more.

By following this structured process and employing Python and SQL, this project not only provides valuable insights into NASA astronauts' stories but also serves as a template for data analysis on diverse datasets.