Title is Genre Insights and Content Composition Analysis on an Entertainment Platform #Project overview In this project, I aim to provide valuable insights to Microsoft as they embark on establishing a new movie studio. Leveraging on exploratory data analysis techniques, we will delve into various movie datasets to understand current trends in the box office. By examining factors such as movie genres, ratings, and box office performance, we will generate actionable recommendations to guide Microsoft in making informed decisions regarding their movie production strategy #Business understanding Microsoft recognizes the growing importance of original video content in the entertainment industry and seeks to capitalize on this trend by venturing into movie production. However, lacking prior experience in the film industry, Microsoft aims to gain insights into the types of films that resonate well with audiences and perform strongly at the box office. By understanding market trends and audience preferences, Microsoft can strategically position themselves to maximize the success of their movie ventures #problem statement Microsoft is entering the movie production industry without prior expertise and seeks guidance on the types of films that are currently successful at the box office. The challenge lies in analyzing extensive movie data to identify key trends and translate these findings into actionable insights that will inform Microsoft's movie production strategy #four objectives
- Evaluate the Relationship Between IMDb Ratings and Audience Engagement.
- Identify High-Performing Titles Across Different Categories
- Analyze Category Contributions to Overall Ratings and Engagement
- Detect Patterns or Trends in Audience Preferences
#code snipet
plt.figure(figsize=(12, 8)) sns.histplot(project_1['averagerating'], bins=30, kde=True, color='skyblue') plt.title('Distribution of IMDb Ratings') plt.xlabel('IMDb Rating') plt.ylabel('Frequency') plt.show()
##IMDb ratings vs number of votes plt.figure(figsize=(12, 8)) sns.scatterplot(x='numvotes', y='averagerating', data=project_1, hue='category', alpha=0.7) plt.title('Bivariate Analysis: IMDb Ratings vs Number of Votes') plt.xlabel('Number of Votes') plt.ylabel('IMDb Rating') plt.legend(title='Category', bbox_to_anchor=(1.05, 1), loc='upper left') plt.show() #percentage occurence #count occurence of each genre genre_counts = pd.Series({'Action': 30, 'Comedy': 20, 'Drama': 25, 'Thriller': 15, 'Horror': 10})
explode = [0.1] * len(genre_counts)
colors = plt.cm.Set3.colors # Use Set3 color map for better visibility
plt.pie(genre_counts, labels=genre_counts.index, autopct='%1.1f%%', startangle=90, colors=colors, explode=explode)
centre_circle = plt.Circle((0,0),0.70,fc='white') fig = plt.gcf() fig.gca().add_artist(centre_circle)
plt.axis('equal') # Equal aspect ratio ensures that pie is drawn as a circle. plt.show()
plt.figure(figsize=(10, 6)) sns.scatterplot(x='numvotes', y='averagerating', data=project_1, alpha=0.5) plt.title('Scatter Plot: Average Rating vs. Number of Votes') plt.xlabel('Number of Votes') plt.ylabel('Average Rating') plt.show() #RECOMMENDATIONS Offer Variety: Show different types of movies to attract more people. Show Popular Movies: Focus on the types of movies that people like the most. Keep Up with Trends: Stay updated on what kinds of movies are becoming popular. Pick Movies People Like: Choose movies that match what people in the area like to watch. Include Everyone: Show movies that everyone can enjoy, with diverse actors and stories