msr-lda-wordclouds

Topic modeling with LDA (Latent Dirichlet Allocation) and word cloud visualization on the Munich Security Report 2017. The text extracted from the PDF is in "./munichsr". The script fits a scikit-learn LDA model with 14 topics to the extracted text and generates a word cloud of the top 50 terms for each topic. I chose 14 topics because the report contains 14 articles, including the Foreword. The generated word clouds for the topics are as follows:

[Word cloud images: Topic 0 through Topic 13]