
Used for INT214 Statistics for Information Technology

Primary language: R. License: MIT.

004-Movie Industry

Original Dataset from

About the movie industry dataset

This dataset about the movie industry was scraped from IMDb and published on Kaggle and GitHub
by danielgrijalvas. The dataset contains 7,668 movies from 1980 to 2020 (7,668 observations and 15 variables).

Click here for more details.

Overview

Since the beginning of the film industry, studios have released many movies every year, and producing a single movie requires a large budget. As with any business investment, the expected return is profit. What sets the film industry apart is the range of factors that affect a movie's revenue, such as audience feedback, ratings, reviews, directors, screenwriters, and many more. We selected this dataset to analyze the factors affecting the profits of individual films, as well as changes in movie budgets, runtimes, and revenue over the past four decades. The dataset was obtained from Kaggle, explored in Microsoft Excel, and cleaned and analyzed with R in RStudio. We also examined the correlations between variables.
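As a rough illustration of the profit and correlation analysis described above, the following R sketch uses a small made-up data frame. The column names (name, budget, gross, runtime) are assumed to match the dataset, the values are invented for illustration only, and defining profit as gross minus budget is our assumption rather than the project's confirmed method.

```r
# Toy data: values are invented for illustration, not taken from the dataset.
# Column names are assumed to mirror the movie industry dataset.
movies <- data.frame(
  name    = c("A", "B", "C", "D", "E"),
  budget  = c(10, 20, 30, 40, 50) * 1e6,
  gross   = c(15, 18, 60, 70, 120) * 1e6,
  runtime = c(95, 110, 120, 105, 140)
)

# Profit defined as gross revenue minus budget (an assumed definition).
movies$profit <- movies$gross - movies$budget

# Pearson correlation between budget and gross revenue.
budget_gross_cor <- cor(movies$budget, movies$gross)

print(movies[, c("name", "budget", "gross", "profit")])
print(budget_gross_cor)
```

On the real data, rows with a missing budget or gross would need to be handled first, for example with cor(..., use = "complete.obs").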

Steps

  1. Define questions
  2. Explore data from the original dataset
  3. Data Cleaning and Data Transformation
  4. Exploratory Data Analysis
  5. Analytical Inferential Statistics
  6. Data Visualization
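Steps 3 to 5 above might look something like the following in base R. This is a minimal sketch on hypothetical records, not the project's actual code: the data values are invented, the column names are assumed, and the Welch t-test comparing two genres is just one example of an inferential test.

```r
# Hypothetical raw records with one duplicate row and one missing budget.
# Monetary values are in millions and are invented for illustration.
raw <- data.frame(
  name   = c("A", "B", "B", "C", "D", "E", "F"),
  genre  = c("Action", "Comedy", "Comedy", "Action", "Comedy", "Action", "Comedy"),
  budget = c(50, 20, 20, NA, 15, 80, 25),
  gross  = c(120, 30, 30, 90, 10, 200, 40),
  stringsAsFactors = FALSE
)

# Step 3: cleaning and transformation.
clean <- unique(raw)                      # drop exact duplicate rows
clean <- clean[complete.cases(clean), ]   # drop rows with missing values
clean$profit <- clean$gross - clean$budget

# Step 4: exploratory data analysis.
print(summary(clean$profit))
mean_profit_by_genre <- aggregate(profit ~ genre, data = clean, FUN = mean)
print(mean_profit_by_genre)

# Step 5: inferential statistics.
# Welch two-sample t-test: does mean profit differ between the two genres?
test_result <- t.test(profit ~ genre, data = clean)
print(test_result$p.value)
```

With only a handful of rows the test has almost no power; the point is the shape of the workflow, not the result.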

Tools

Languages

Table of Contents

  1. Define questions
  2. Exploratory Data
  3. Cleaning Data and Transformation
  4. Data Analysis
  5. Analytical Inferential Statistics
  6. Data Visualization (BI Dashboard)

Resources

Important Files in Repository

Important Folders in Repository

Folder                Link
Exploratory Dataset   Click here
Cleaning Dataset      Click here
Data Analysis         Click here
Hypothesis testing    Click here
Data Visualization    Click here

About Us

This work is part of INT214 (Statistics for Information Technology),
semester 1/2021, School of Information Technology, KMUTT.

Team: Group 39, 42, 46, 48, 53

No.  Name                 Student ID
1    Denphum Nakglam      63130500039
2    Songglod Petchamras  63130500042
3    Thanakrit Paithun    63130500046
4    Thanaphon Sukkasem   63130500048
5    Thanatorn Roswan     63130500053

Instructors

  • ATCHARA TRAN-U-RAIKUL
  • JATAWAT XIE (Git: safesit23)