NLP-Movie_Scripts

Trying to predict a movie's success based on the script (before filming)

Primary language: Jupyter Notebook

Budget data is not going to be very reliable...

Still, these numbers are good enough to tell which movies succeeded and which did not, with a very low probability of being wrong. However, the ROI computed from the above source (or any source) is most likely unrealistic.
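A minimal sketch of that idea, assuming we have a budget and a gross figure per movie. The function names, the `margin` parameter and its default of 0.5 are hypothetical choices for illustration, not something taken from the notebooks. The point is that a coarse label survives noisy budget data as long as the movie is far from break-even, while the exact ROI value does not.

```python
def roi(budget, gross):
    """Return on investment as a fraction of the budget (hypothetical helper)."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    return (gross - budget) / budget


def label_success(budget, gross, margin=0.5):
    """Coarse label that tolerates noisy budget figures.

    `margin` is an assumed tolerance band around break-even: movies whose
    gross/budget ratio falls inside the band are labelled 'unclear'
    instead of being forced into a class.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = gross / budget
    if ratio >= 1 + margin:
        return "success"
    if ratio <= 1 - margin:
        return "failure"
    return "unclear"
```

For example, `label_success(100, 300)` gives `"success"` even if the real budget was 150 rather than 100, whereas `roi` would move from 2.0 to 1.0 under the same correction.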

Analyzing the actual screen directions too! Anything that is not character dialogue and is not junk (this would need some filtering) falls under this category. Example from Braveheart:

Hanging from the rafters of the barn are thirty Scottish noblemen and thirty pages, their faces purple and contorted by the strangulation hanging, their tongues protruding. Malcolm stabs the pitchfork into the ground in useless anger; John still grips the axe as he follows his father through the hanging bodies of the noblemen to the back row, to see the one man in commoner's dress, like theirs...

You can see how this kind of text could be quite relevant for modeling the movie.
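One way to separate screen directions from dialogue could look like the sketch below. It assumes a conventionally formatted plain-text screenplay: character cues are all-caps lines, dialogue follows a cue until the next blank line, and scene headings start with INT or EXT. The regular expressions and the `split_script` name are assumptions for illustration; real scripts vary a lot in formatting and would need more filtering than this.

```python
import re

# Assumed conventions: cue = all-caps name, optionally followed by "(O.S.)" etc.
CUE = re.compile(r"^[A-Z][A-Z0-9 .'\-]*(\s*\([^)]*\))?$")
# Scene headings ("INT. KITCHEN - NIGHT") are treated as noise and dropped.
SLUG = re.compile(r"^(INT|EXT|INT/EXT|I/E)[. ]")


def split_script(text):
    """Split screenplay text into (dialogue_lines, direction_lines)."""
    dialogue, direction = [], []
    in_dialogue = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            in_dialogue = False  # a blank line ends a dialogue block
            continue
        if SLUG.match(line):
            in_dialogue = False
            continue
        if not in_dialogue and CUE.match(line):
            in_dialogue = True  # character cue: the next lines are dialogue
            continue
        (dialogue if in_dialogue else direction).append(line)
    return dialogue, direction


# Made-up sample, not taken from any real script.
sample = """INT. KITCHEN - NIGHT

Alice puts the kettle on.

ALICE
Tea?

BOB (O.S.)
Yes, please.

Bob walks in.
"""
dialogue, direction = split_script(sample)
print(direction)
```

The `direction` list can then be fed to the same text features as the dialogue, or modeled separately.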