/mpaa_ml

Build an ML model to predict the MPAA content rating of a movie given the full script


Predicting MPAA Content Ratings from Script Content

This serves as the final project for the UC Berkeley Extension course COMPSCI X433.6: Introduction to Machine Learning Using Python.

Objective

The final project was open-ended. I chose to predict MPAA content ratings (G, PG, PG-13, R) from a complete movie script. The two best classifiers I trained reach an accuracy of around 70%. More judicious feature selection could probably improve this.
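To make the task concrete, here is a minimal sketch of script-to-rating classification. It is not the approach used in this repo (the PDF describes the actual features and classifiers). It assumes a bag-of-words representation and a multinomial Naive Bayes model written with only the Python standard library. The class name `NaiveBayesRatingClassifier`, the `tokenize` helper, and the four toy "scripts" are all hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesRatingClassifier:
    """Multinomial Naive Bayes over word counts with add-one smoothing.

    Hypothetical illustration only; not the classifier from this project.
    """

    def fit(self, scripts, labels):
        self.word_counts = defaultdict(Counter)  # rating -> word frequencies
        self.doc_counts = Counter(labels)        # rating -> number of scripts
        self.vocab = set()
        for text, label in zip(scripts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Words never seen in training carry no information, so skip them.
        tokens = [tok for tok in tokenize(text) if tok in self.vocab]
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy stand-ins for full scripts (made up for this example).
train_scripts = [
    "the puppy and the kitten sing a happy song",
    "a friendly dragon helps the children find treasure",
    "the agent dodges explosions and a tense chase follows",
    "blood gore murder and brutal violence fill the night",
]
train_labels = ["G", "PG", "PG-13", "R"]

clf = NaiveBayesRatingClassifier().fit(train_scripts, train_labels)
print(clf.predict("brutal murder and blood"))
print(clf.predict("happy puppy song"))
```

With real scripts, the same structure would need a held-out test set to measure accuracy, and the choice of which words to keep as features is where the feature selection mentioned above would come in.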

See the PDF file in this repo for the complete explanation. All data and work are included in this repo. In case I go back and improve this classifier, the final project submission will always carry the 'final_submission' git tag.