This is an assignment for the collecting data course. I explore how to analyze data and visualize the results with pandas in a Jupyter Notebook.
The dataset comes from a Google Spreadsheet that is a large and comprehensive inventory of items in Animal Crossing: New Horizons.
It contains the villagers in the game and their features, such as name, personality, birthday, species, gender, hobby, and style.
There are 391 rows and 17 columns in this dataset.
All the code is in a Jupyter Notebook.
The language is Python 3.
The tool is pandas.
I explored three categories of questions in this analysis.
Question 1: What is the shape of the data?
Examples:
Get the number of rows and columns;
Display the DataFrame;
Retrieve random data samples;
Display the data types of the columns.
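The steps for Question 1 can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the small `villagers` table is a toy stand-in with placeholder rows, and the column names (`Name`, `Species`, `Gender`, `Hobby`) are assumptions about how the real dataset is labeled.

```python
import pandas as pd

# Toy stand-in for the villagers table. The rows are placeholders for
# illustration only and are NOT taken from the real dataset; the column
# names are assumed.
villagers = pd.DataFrame({
    "Name": ["Villager A", "Villager B", "Villager C", "Villager D"],
    "Species": ["Cat", "Dog", "Cat", "Frog"],
    "Gender": ["Male", "Female", "Female", "Male"],
    "Hobby": ["Fitness", "Music", "Fitness", "Nature"],
})

# Number of rows and columns: .shape is a (rows, columns) tuple.
n_rows, n_cols = villagers.shape
print(f"{n_rows} rows, {n_cols} columns")

# Display the first rows of the DataFrame.
print(villagers.head())

# Retrieve a random sample of rows; random_state makes it repeatable.
sample = villagers.sample(n=2, random_state=0)
print(sample)

# Display the data type of each column.
print(villagers.dtypes)
```

With the real dataset loaded into `villagers`, `villagers.shape` would be expected to report the 391 rows and 17 columns mentioned above.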
Question 2: How is the hobby of villagers related to their gender?
Examples:
a. Explore how many male and female villagers like fitness:
Which villagers love fitness in their daily life?
Among these villagers, which are male and which are female?
How many male villagers like fitness? How many female villagers like fitness?
b. Explore the gender distribution across all hobbies:
Count the hobby types, the number of villagers of each gender in each hobby, and the percentage of each gender within each hobby;
Compare the counts of male and female villagers in each hobby category;
Visualize the data with a bar chart and a heat map.
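One possible way to work through parts a and b is sketched below. It is an illustration under assumptions rather than the notebook's own code: the `villagers` table holds placeholder rows, the column names `Gender` and `Hobby` and the value `"Fitness"` are assumed, and the hypothetical helper `plot_hobby_gender` draws the heat map with plain matplotlib, which may differ from the plotting library used in the notebook.

```python
import pandas as pd

# Toy stand-in for the villagers table; placeholder rows, NOT real data.
villagers = pd.DataFrame({
    "Name": ["Villager A", "Villager B", "Villager C",
             "Villager D", "Villager E", "Villager F"],
    "Gender": ["Male", "Female", "Male", "Female", "Male", "Female"],
    "Hobby": ["Fitness", "Fitness", "Music", "Music", "Fitness", "Nature"],
})

# a. Villagers whose hobby is fitness, then the count per gender.
fitness = villagers[villagers["Hobby"] == "Fitness"]
fitness_by_gender = fitness["Gender"].value_counts()
print(fitness[["Name", "Gender"]])
print(fitness_by_gender)

# b. Gender counts in every hobby (rows: hobby, columns: gender).
counts = pd.crosstab(villagers["Hobby"], villagers["Gender"])

# Percentage of each gender within a hobby; every row sums to 100.
percent = pd.crosstab(
    villagers["Hobby"], villagers["Gender"], normalize="index"
) * 100
print(counts)
print(percent.round(1))


def plot_hobby_gender(counts, percent):
    """Hypothetical helper: grouped bar chart of counts, heat map of shares."""
    import matplotlib.pyplot as plt  # imported here; needed only for plots

    fig, (ax_bar, ax_heat) = plt.subplots(1, 2, figsize=(10, 4))

    # Bar chart: one group of bars per hobby, one bar per gender.
    counts.plot(kind="bar", ax=ax_bar)
    ax_bar.set_ylabel("Number of villagers")

    # Heat map: color encodes the percentage of each gender in a hobby.
    image = ax_heat.imshow(percent.to_numpy(), cmap="viridis", aspect="auto")
    ax_heat.set_xticks(range(len(percent.columns)))
    ax_heat.set_xticklabels(percent.columns)
    ax_heat.set_yticks(range(len(percent.index)))
    ax_heat.set_yticklabels(percent.index)
    fig.colorbar(image, ax=ax_heat, label="Share within hobby (%)")

    fig.tight_layout()
    return fig
```

In a notebook cell, calling `plot_hobby_gender(counts, percent)` would draw both charts. `pd.crosstab` is used here because it produces the count table and, with `normalize="index"`, the per-hobby shares in one step; a `groupby` on both columns followed by `unstack` would be an equivalent route.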
Question 3: How are the villagers distributed across species?
Examples:
Select the species data of the villagers, then visualize it with a bar chart.
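The species question can be approached along these lines. Again this is only a sketch: the rows are placeholders, the `Species` column name is an assumption, and `plot_species` is a hypothetical helper name introduced for illustration.

```python
import pandas as pd

# Toy stand-in for the villagers table; placeholder rows, NOT real data.
villagers = pd.DataFrame({
    "Name": ["Villager A", "Villager B", "Villager C",
             "Villager D", "Villager E", "Villager F"],
    "Species": ["Cat", "Dog", "Cat", "Frog", "Cat", "Dog"],
})

# Number of villagers per species, sorted from most to least common.
species_counts = villagers["Species"].value_counts()
print(species_counts)


def plot_species(species_counts):
    """Hypothetical helper: bar chart with one bar per species."""
    import matplotlib.pyplot as plt  # imported here; needed only for plots

    ax = species_counts.plot(kind="bar", figsize=(8, 4))
    ax.set_xlabel("Species")
    ax.set_ylabel("Number of villagers")
    plt.tight_layout()
    return ax
```

Calling `plot_species(species_counts)` in a notebook cell would show the distribution, with the tallest bar first because `value_counts` sorts in descending order by default.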