QTA_R_course

Material for the 2019 CDSS course "Quantitative Text Analysis with R"

Instructor

Julian Bernauer is a Postdoctoral Fellow at the Data and Methods Unit of the MZES. He is currently working on a research project measuring populism from political text and is generally interested in data science.

Course Abstract

The course "Quantitative Text Analysis" provides an introduction to the retrieval, preparation, visualization and analysis of text as data using R. We draw on examples from the social sciences and beyond, including European election manifestos, books by Mark Twain and large collections of tweets. The course covers some web scraping to obtain text, preparation steps including the construction of word frequency matrices or dictionaries, and visualization tools beyond word clouds. For the analysis of texts, we discuss topic models such as LDA (latent Dirichlet allocation), scaling models including Wordscores and Wordfish, and alternatives based on natural language processing tools (e.g. word embeddings). A further theme is the cross-lingual and cross-contextual analysis of text. Participants also have the opportunity to help shape a textbook on the topic, which is contracted with SAGE and scheduled to appear in 2020.