This project analyzes a Nobel Prize dataset using Python and data analysis libraries. It explores the distribution of winners by category and country, examines the proportion of female winners over time, investigates the age of winners when they received the prize, and identifies the oldest and youngest recipients. The code showcases data manipulation, grouping, filtering, and visualization techniques.
The Nobel Prize is a prestigious international award given annually in several categories, including Physics, Chemistry, Medicine, Literature, Peace, and Economic Sciences. This project aims to analyze a Nobel Prize dataset to gain insights into the demographics and trends of Nobel Prize winners. It involves data exploration, manipulation, and visualization using Python and popular data analysis libraries.
- Clone the repository:
git clone https://github.com/your-username/nobel-prize-data-analysis.git
- Navigate to the project directory:
cd nobel-prize-data-analysis
- Install the required dependencies:
pip install -r requirements.txt
- Start Jupyter Notebook:
jupyter notebook
- Open the Nobel_Prize_Data_Analysis.ipynb notebook.
- Follow the instructions in the notebook to run the code and analyze the Nobel Prize dataset.
This project uses a Nobel Prize dataset covering laureates from 1901 to 2016. Each record includes details such as the prize category, the laureate's name, birth date, birth city, and country, among other fields.
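As a rough illustration of how fields like these support the analyses described above, here is a minimal sketch using pandas. The column names (`year`, `category`, `full_name`, `sex`, `birth_date`) are assumptions and may not match the headers in the actual dataset file. A handful of well-known laureates stands in for the full data, which would normally be loaded with `pd.read_csv`.

```python
import pandas as pd

# Hypothetical column names; the real dataset's headers may differ.
# Four well-known laureates stand in for the full file here.
nobel = pd.DataFrame(
    {
        "year": [1903, 1921, 2007, 2014],
        "category": ["Physics", "Physics", "Economics", "Peace"],
        "full_name": [
            "Marie Curie",
            "Albert Einstein",
            "Leonid Hurwicz",
            "Malala Yousafzai",
        ],
        "sex": ["Female", "Male", "Male", "Female"],
        "birth_date": ["1867-11-07", "1879-03-14", "1917-08-21", "1997-07-12"],
    }
)

# Type conversion: parse birth dates so the birth year can be extracted.
nobel["birth_date"] = pd.to_datetime(nobel["birth_date"])

# Feature engineering: approximate age at the award (difference in years only).
nobel["age"] = nobel["year"] - nobel["birth_date"].dt.year

# Bucket award years into decades for trend analysis.
nobel["decade"] = (nobel["year"] // 10) * 10

# Number of winners per category.
winners_by_category = nobel["category"].value_counts()

# Share of female winners per decade (mean of a boolean column).
female_share_by_decade = (nobel["sex"] == "Female").groupby(nobel["decade"]).mean()

# Oldest and youngest recipients in this sample.
oldest = nobel.loc[nobel["age"].idxmax(), "full_name"]
youngest = nobel.loc[nobel["age"].idxmin(), "full_name"]

print(winners_by_category)
print(female_share_by_decade)
print("Oldest:", oldest, "| Youngest:", youngest)
```

The age here is a year difference rather than an exact age on the award date, which is usually close enough for distribution plots. The notebook itself may compute these quantities differently.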
The project involves the following steps:
- Data loading and preprocessing: The Nobel Prize dataset is loaded into a Pandas DataFrame. Data preprocessing steps may include handling missing values, data type conversions, and feature engineering.
- Data exploration: The dataset is explored to understand its structure and contents. Summary statistics and visualizations are used to gain insights into the distribution of winners by category, country, gender, and age.
- Data analysis: Various data analysis techniques are applied