This project is an analysis of Spotify music data, with the aim of exploring the characteristics of different music genres and artists. The project involves several tasks, including downloading a dataset from Kaggle, performing exploratory data analysis (EDA) using pandas, and creating visualizations using matplotlib and seaborn. The visualizations allow users to gain insights into the characteristics of different genres and artists, such as tempo, loudness, and popularity.

| Files | Description |
|---|---|
| Spotify data analysis.ipynb | Notebook containing the full analysis |

1. Load and explore data: Use Pandas to read the CSV file into a DataFrame and use methods such as head(), info(), and describe() to examine its structure.
2. Clean and preprocess data: Use Pandas methods such as dropna(), fillna(), astype(), rename(), and drop() to handle missing values, convert data types, rename columns, and drop unnecessary columns.
3. Analyze data: Use Matplotlib, Seaborn, and Plotly for visualization, and NumPy and SciPy for statistical analysis, to explore the data and identify patterns or relationships.
4. Extract insights: Use the results of the analysis to draw conclusions from the data and communicate the findings with visualizations and clear language.

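
Steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the inline CSV stands in for the Kaggle file, and the column names (`name`, `artists`, `popularity`, `duration_ms`, `release_date`) and the helper `load_and_clean` are assumptions made for the example.

```python
import io
import pandas as pd

# Tiny stand-in for the Kaggle CSV; the column names are illustrative assumptions.
CSV_TEXT = """name,artists,popularity,duration_ms,release_date
Song A,Artist 1,10,180000,1922-02-22
Song B,Artist 2,,240000,1985
,Artist 3,55,200000,2001-06
"""

def load_and_clean(source):
    """Read a CSV and apply a few typical cleaning steps (hypothetical helper)."""
    df = pd.read_csv(source)
    # Drop rows that have no track name.
    df = df.dropna(subset=["name"]).copy()
    # Fill missing popularity with 0, then store it as an integer.
    df["popularity"] = df["popularity"].fillna(0).astype(int)
    # Rename the duration column and convert milliseconds to seconds.
    df = df.rename(columns={"duration_ms": "duration"})
    df["duration"] = df["duration"] / 1000
    return df.reset_index(drop=True)

tracks_demo = load_and_clean(io.StringIO(CSV_TEXT))

# Step 1: inspect the structure of the cleaned frame.
print(tracks_demo.head())
tracks_demo.info()
print(tracks_demo.describe())
```

With the real dataset, the path to the downloaded CSV would be passed instead of the `io.StringIO` object.
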
import matplotlib.pyplot as plt
import seaborn as sns
corr_df=tracks.drop(["key","mode","explicit"],axis=1).corr(method="pearson",numeric_only=True)
plt.figure(figsize=(20,8))
heatmap=sns.heatmap(corr_df,annot=True)
heatmap.set_title("Correlation heatmap between variables")
plt.figure(figsize=(12,6))
sns.regplot(data=sample,y="loudness",x="energy",color="c").set(title="Loudness vs energy correlation")
plt.figure(figsize=(12,6))
sns.regplot(data=sample,y="popularity",x="acousticness",color="b").set(title="Popularity vs acousticness correlation")
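
The two regression plots read from `sample`, which this README does not define. A minimal sketch, assuming it is a random subset of `tracks` drawn to keep the scatter readable. The 10% fraction, the seed, and the synthetic data are illustrative assumptions, not values from the notebook.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for `tracks`; the real values come from the Kaggle CSV.
rng = np.random.default_rng(0)
energy = rng.uniform(0, 1, size=1000)
tracks_synth = pd.DataFrame({
    "energy": energy,
    "loudness": -30 + 25 * energy + rng.normal(0, 2, size=1000),
})

# Assumption: `sample` is a random subset, here 10% of the rows with a fixed seed.
sample_demo = tracks_synth.sample(frac=0.1, random_state=42)

# Pearson r for the pair of columns that the first regplot visualises.
r = sample_demo["energy"].corr(sample_demo["loudness"])
print(f"{len(sample_demo)} rows, Pearson r = {r:.2f}")
```

A frame built this way can be passed as `data=` to `sns.regplot` exactly as in the snippets above.
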

Distribution plot of the number of songs available on Spotify for each release year since 1922:

sns.displot(years,discrete=True,aspect=2,height=5,kind="hist").set(title="Number of songs per year")
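
`years` is used in the plot without a definition. One possible way to derive it, assuming a `release_date` column of strings with mixed precision (year only, year and month, or a full date). The values below are hypothetical.

```python
import pandas as pd

# Hypothetical release dates in the mixed precisions such a column may contain.
release_dates = pd.Series(["1922-02-22", "1985", "2001-06", "2001-11-30"])

# Take the leading four digits as the year, whatever the precision.
years_demo = release_dates.str.extract(r"^(\d{4})", expand=False).astype(int)

# Songs per year: the counts that a discrete histogram displays.
songs_per_year = years_demo.value_counts().sort_index()
print(songs_per_year)
```

Extracting the year as text sidesteps date parsing, which can trip over entries that lack a month or day.
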
total_dr=tracks.duration
fig_dims=(30,15)
fig,ax=plt.subplots(figsize=fig_dims)
sns.barplot(x=years,y=total_dr,ax=ax,errorbar=None).set(title="Year vs duration")
plt.xticks(rotation=90)
plt.show()
total_dr=tracks.duration
sns.set_style(style="whitegrid")
fig_dims=(10,5)
fig,ax=plt.subplots(figsize=fig_dims)
sns.lineplot(x=years,y=total_dr,ax=ax).set(title="Year vs duration")
plt.xticks(rotation=60)
plt.show()
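
Both the bar plot and the line plot aggregate duration per year, and seaborn uses the mean as its default estimator. The plotted values can therefore be checked with a plain groupby. This is a sketch on hypothetical data, with durations assumed to be in seconds.

```python
import pandas as pd

# Hypothetical per-track data: release year and duration in seconds.
demo = pd.DataFrame({
    "year": [1922, 1922, 1985, 2001, 2001],
    "duration": [150.0, 170.0, 240.0, 200.0, 220.0],
})

# Mean duration per year, matching seaborn's default estimator.
mean_duration = demo.groupby("year")["duration"].mean()
print(mean_duration)
```
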