This repository contains an analysis of the student results dataset. The dataset provides information about students' demographics, parental education, test preparation, and scores in various subjects.
To run the analysis, ensure that the required libraries (numpy, pandas, matplotlib, seaborn) are installed. The student results dataset should be saved as a CSV file named "Student_results.csv" in the same directory.
Execute the code in a Python environment such as Jupyter Notebook or any Python IDE. The results will be displayed through various charts and summaries.
Feel free to modify the code and adapt it to your own dataset or add additional analyses as needed.
The dataset used for this analysis is stored in the Student_results.csv file. It contains the following columns:
- Gender: Gender of the student
- EthnicGroup: Ethnic group of the student
- ParentEduc: Education level of the student's parent
- LunchType: Type of lunch the student has
- TestPrep: Test preparation status
- ParentMaritalStatus: Marital status of the student's parent
- PracticeSport: Sports practice status
- IsFirstChild: Indicates if the student is the first child in the family
- NrSiblings: Number of siblings the student has
- TransportMeans: Mode of transportation used by the student
- WklyStudyHours: Number of weekly study hours
- MathScore: Score in Math subject
- ReadingScore: Score in Reading subject
- WritingScore: Score in Writing subject
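Because the rest of the analysis relies on these column names, it can help to confirm that a file actually contains them before going further. The following is a minimal sketch, not part of the original notebook: `EXPECTED_COLUMNS` and `missing_columns` are hypothetical names introduced here, and the list simply mirrors the columns described above.

```python
# Column names as described in this README (assumed to match the CSV header).
EXPECTED_COLUMNS = [
    "Gender", "EthnicGroup", "ParentEduc", "LunchType", "TestPrep",
    "ParentMaritalStatus", "PracticeSport", "IsFirstChild", "NrSiblings",
    "TransportMeans", "WklyStudyHours", "MathScore", "ReadingScore",
    "WritingScore",
]


def missing_columns(columns):
    """Return the expected column names that are absent from `columns`."""
    present = set(columns)
    return [name for name in EXPECTED_COLUMNS if name not in present]


# Example with a made-up, incomplete header (for illustration only).
print(missing_columns(["Gender", "MathScore", "ReadingScore"]))
```

With a loaded DataFrame, the same check would be `missing_columns(df.columns)`; an empty list means every expected column is present.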
Before diving into the analysis, let's get some insights into the dataset by loading and examining the data:
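import pandas as pd

# Load the dataset from the current directory
df = pd.read_csv("Student_results.csv")

print(df.head())          # first five rows
print(df.describe())      # summary statistics for the numeric columns
df.info()                 # column types and non-null counts (prints directly, returns None)
print(df.isnull().sum())  # number of missing values per column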
- The analysis reveals the distribution of students based on gender. It shows that there are more females than males in the dataset.
- The analysis examines the relationship between parent education and student scores. It suggests that higher levels of parent education are associated with higher student scores.
- The analysis explores the impact of parent marital status on student scores. It suggests that there is no significant effect based on parent marital status.
- The analysis focuses on the distribution of scores in Math, Reading, and Writing subjects. The box plots and violin plots provide insights into the spread and central tendency of the scores. Math appears to be relatively more challenging compared to the other subjects.
- The analysis visualizes the distribution of students across different ethnic groups. It gives an overview of the representation of each group in the dataset.
- The analysis compares the average scores of students based on their test preparation status. It shows that students who completed test preparation tend to have higher average scores compared to those who did not.
- The analysis investigates the relationship between lunch type and student scores. It demonstrates that students who have a standard lunch tend to have higher average scores compared to those with a free/reduced lunch.
- The analysis examines the impact of having siblings on student scores. It suggests that there is no significant effect based on whether the student is the first child or has siblings.
- The analysis explores the relationship between sports practice and student scores. It indicates that students who regularly practice sports tend to have better average scores compared to those who do not participate in sports.
- The analysis investigates the impact of transportation means on student scores. It shows that the mode of transportation does not have a significant effect on average scores.
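Most of the comparisons above come down to averaging the three score columns within each category of another column. The sketch below shows one way to do that with pandas; it is an illustration under stated assumptions, not the notebook's actual code. `mean_scores_by` and `AverageScore` are hypothetical names, and the small `toy` table (including its `TestPrep` values) is invented so the example runs without the CSV file.

```python
import pandas as pd

SCORE_COLUMNS = ["MathScore", "ReadingScore", "WritingScore"]


def mean_scores_by(df, column):
    """Per-category mean of each score column, plus the average of those means."""
    means = df.groupby(column)[SCORE_COLUMNS].mean()
    # Average of the three subject means for each category.
    means["AverageScore"] = means.mean(axis=1)
    return means.sort_values("AverageScore", ascending=False)


# Toy rows invented for illustration only (not taken from Student_results.csv).
toy = pd.DataFrame({
    "TestPrep": ["completed", "none", "completed", "none"],
    "MathScore": [80, 60, 70, 50],
    "ReadingScore": [85, 65, 75, 55],
    "WritingScore": [90, 70, 80, 60],
})
print(mean_scores_by(toy, "TestPrep"))
```

On the real data, the same call could be repeated for columns such as `LunchType`, `PracticeSport`, or `TransportMeans`. Note that `groupby` skips rows where the grouping column is missing, so categories with many missing values are worth checking separately.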
The analysis of the student results dataset leads to the following conclusions:
- There are more females than males in the dataset.
- Higher levels of parent education are associated with higher average scores.
- Parent marital status shows no significant effect on student scores.
- Math appears to be relatively more challenging compared to other subjects.
- Ethnic Group A has the lowest average scores, while Ethnic Group E has the highest average scores.
- Completing test preparation is associated with higher average scores.
- Having a standard lunch is associated with higher average scores.
- Regular sports practice is associated with better average scores.
- The mode of transportation does not show a significant impact on student average scores.
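Several of these conclusions speak of a "significant" effect, but this README does not say how significance was judged. One possible way to check whether a difference in mean scores between two groups could be due to chance is a permutation test. The sketch below uses only numpy; `permutation_p_value` is a hypothetical helper introduced here, and nothing suggests the original analysis used it.

```python
import numpy as np


def permutation_p_value(a, b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means between a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference.
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[:a.size].mean() - shuffled[a.size:].mean())
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)
```

With the real data, `a` and `b` would be the scores of two groups, for example `df.loc[df["TestPrep"] == value, "MathScore"].dropna()` for each category value found in the file. A small p-value indicates that a difference as large as the observed one rarely arises from random relabeling.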
For detailed code and analysis, please refer to the Jupyter Notebook or Python script in this repository.