Primary language: Jupyter Notebook

Spotify Data Analysis


This project is an analysis of Spotify music data, with the aim of exploring the characteristics of different music genres and artists. The project involves several tasks, including downloading a dataset from Kaggle, performing exploratory data analysis (EDA) using pandas, and creating visualizations using matplotlib and seaborn. The visualizations allow users to gain insights into the characteristics of different genres and artists, such as tempo, loudness, and popularity.

User's Manual

Files Description
Spotify data analysis.ipynb: the notebook that contains the entire analysis.

Tools & Technology Used:

Methodology:

1. Load and explore data: Use Pandas to read the CSV file into a DataFrame, then use methods such as head(), info(), and describe() to examine its structure.

2. Clean and preprocess data: Use Pandas methods such as dropna(), fillna(), astype(), rename(), and drop() to handle missing values, convert data types, rename columns, and drop unnecessary columns.

3. Analyze data: Use Matplotlib, Seaborn, and Plotly for visualization, and NumPy and SciPy for statistical analysis, to explore the data and identify patterns or relationships.

4. Extract insights: Draw conclusions from the results of the analysis and communicate the findings with clear visualizations and plain language.
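Steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the in-memory CSV, its rows, and the choice of which columns to fill, drop, or rename are all assumptions made for the example. In particular, the rename from duration_ms to duration (in seconds) is only a guess at how a duration column like the one used later might be obtained.

```python
import io
import pandas as pd

# Made-up rows standing in for the Kaggle CSV (illustrative only).
RAW_CSV = io.StringIO(
    "name,popularity,duration_ms,release_date,loudness,energy\n"
    "Song A,10,180000,1922-02-22,-13.3,0.45\n"
    "Song B,,240000,1985,-8.1,0.80\n"
    "Song C,55,200000,2001-07-15,,0.62\n"
)

# Step 1: load the data and inspect its structure.
tracks = pd.read_csv(RAW_CSV)
print(tracks.head())
tracks.info()
print(tracks.describe())

# Step 2: clean and preprocess (the specific choices here are assumptions).
tracks["popularity"] = tracks["popularity"].fillna(0).astype(int)  # fill gaps, fix dtype
tracks = tracks.rename(columns={"duration_ms": "duration"})        # shorter column name
tracks["duration"] = tracks["duration"] // 1000                    # milliseconds -> seconds
tracks = tracks.dropna(subset=["loudness"]).reset_index(drop=True) # drop rows missing loudness
print(tracks)
```

With a real file, pd.read_csv would receive the path to the downloaded CSV instead of the in-memory buffer.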

Results/Insights:

Visualisation with a correlation heatmap

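# numeric_only=True skips any non-numeric columns; pandas 2.0+ raises an error on them otherwise
corr_df=tracks.drop(["key","mode","explicit"],axis=1).corr(method="pearson",numeric_only=True)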

plt.figure(figsize=(20,8))

heatmap=sns.heatmap(corr_df,annot=True)

heatmap.set_title("Correlation heatmap between variables")
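A large heatmap can be hard to scan, so it may help to list the strongest pairs from the correlation matrix as well. The sketch below uses synthetic data (the demo frame and its values are invented for illustration) and keeps only the upper triangle so that each pair appears once.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the track features (values are invented).
rng = np.random.default_rng(0)
energy = rng.uniform(0, 1, 200)
demo = pd.DataFrame({
    "energy": energy,
    "loudness": -20 + 15 * energy + rng.normal(0, 1, 200),
    "acousticness": 1 - energy + rng.normal(0, 0.1, 200),
    "tempo": rng.uniform(60, 180, 200),
})
corr_df = demo.corr(method="pearson")

# Keep the strict upper triangle: drops the diagonal and mirrored duplicates.
mask = np.triu(np.ones(corr_df.shape, dtype=bool), k=1)
pairs = corr_df.where(mask).stack().dropna()

# Sort by absolute strength so strong negative correlations rank high too.
pairs = pairs.sort_values(key=abs, ascending=False)
print(pairs.head(3))
```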

Regression plot between loudness and energy

plt.figure(figsize=(12,6))

sns.regplot(data=sample,y="loudness",x="energy",color="c").set(title="Loudness vs energy correlation")

Regression plot between popularity and acousticness

plt.figure(figsize=(12,6))

sns.regplot(data=sample,y="popularity",x="acousticness",color="b").set(title="Popularity vs acousticness correlation")
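The regression plots above use a DataFrame named sample that is not defined in this README. One plausible way to obtain it, assumed here rather than taken from the notebook, is a random subset of tracks, which keeps the scatter readable. The sketch also computes the Pearson coefficient and a least-squares line, the numbers that the regression line summarises. The data are synthetic.

```python
import numpy as np
import pandas as pd

# Synthetic tracks table (invented values) with a built-in linear trend.
rng = np.random.default_rng(42)
n = 5000
energy = rng.uniform(0, 1, n)
tracks = pd.DataFrame({
    "energy": energy,
    "loudness": -25 + 20 * energy + rng.normal(0, 3, n),
})

# Assumption: `sample` is a small random subset; the 5% fraction is arbitrary.
sample = tracks.sample(frac=0.05, random_state=1)

# Pearson correlation and a degree-1 least-squares fit on the subset.
r = sample["loudness"].corr(sample["energy"])
slope, intercept = np.polyfit(sample["energy"], sample["loudness"], deg=1)
print(f"rows={len(sample)}, r={r:.3f}, slope={slope:.2f}, intercept={intercept:.2f}")
```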

Distribution plot of the number of songs available on Spotify for each year since 1922

sns.displot(years,discrete=True,aspect=2,height=5,kind="hist").set(title="Number of songs per year")
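The years variable is not defined in this README. A minimal sketch of one way to derive it, assuming the dataset has a release_date column of strings that begin with a four-digit year (the rows below are invented), is:

```python
import pandas as pd

# Invented rows; release dates are assumed to come in mixed precision.
tracks = pd.DataFrame(
    {"release_date": ["1922-02-22", "1985", "2001-07-15", "2001-11"]}
)

# The first four characters hold the year regardless of precision.
years = tracks["release_date"].str[:4].astype(int)

# Songs per year, the same counts the histogram displays.
songs_per_year = years.value_counts().sort_index()
print(songs_per_year)
```

Slicing the string avoids date-parsing problems when some entries hold only a year or a year and month.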

Duration of songs over years

total_dr=tracks.duration

fig_dims=(30,15)

fig,ax=plt.subplots(figsize=fig_dims)

# errorbar=None hides the error bars (seaborn 0.12+; older versions use ci=None)
sns.barplot(x=years,y=total_dr,ax=ax,errorbar=None).set(title="Year vs Duration")

plt.xticks(rotation=90)

plt.show()

Line plot to show the average duration of the songs over the years

total_dr=tracks.duration

sns.set_style(style="whitegrid")

fig_dims=(10,5)

fig,ax=plt.subplots(figsize=fig_dims)

fig=sns.lineplot(x=years,y=total_dr,ax=ax).set(title="Year vs Duration")

plt.xticks(rotation=60)

plt.show()
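By default, sns.lineplot aggregates repeated x values with their mean, so the line above traces the average duration per year. The same numbers can be computed directly with a groupby, as in this small sketch (the demo rows are invented for illustration):

```python
import pandas as pd

# Invented rows: several songs can share one release year.
demo = pd.DataFrame({
    "year": [1922, 1922, 1985, 2001, 2001],
    "duration": [150, 170, 240, 200, 220],  # seconds
})

# Mean duration per year, matching the line plot's default estimator.
avg_duration = demo.groupby("year")["duration"].mean()
print(avg_duration)
```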