The plan is to use BabelNet (https://babelnet.org) to identify "entities."
The first pass will be a naive for loop that searches for bigrams and longer n-grams.
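
A minimal sketch of what such a naive first pass might look like, assuming plain lowercase whitespace tokenization with no punctuation handling. The function names `ngrams` and `count_ngrams` and the `max_n` parameter are hypothetical, chosen only for illustration:

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    out = []
    # Naive for loop: slide a window of width n across the token list.
    for i in range(len(tokens) - n + 1):
        out.append(tuple(tokens[i:i + n]))
    return out


def count_ngrams(text, max_n=3):
    """Count all n-grams for n = 2..max_n (assumed range, adjust as needed)."""
    # Assumption: splitting on whitespace is good enough for a first pass.
    tokens = text.lower().split()
    counts = Counter()
    for n in range(2, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts
```

The most frequent tuples from `count_ngrams(speech_text).most_common(20)` could then serve as candidate phrases to look up as entities.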
The end goal is to classify each "state-of-the-state" speech by some large, overarching "topic(s)" or "theme(s)."
Between the first pass and the end goal is a lot of code magic.