Marshall A. Taylor & Dustin S. Stoltz
This repository contains all R code necessary to reproduce the analysis in "A Workflow for Analyzing Cultural Schemas in Texts," forthcoming in The Journal of Mathematical Sociology.
Concept class analysis (CoCA) is a method for recovering cultural schemas in texts using a combination of word embedding and community detection models. As with survey-based forms of schematic class analysis (SCA), however, interpreting the results can be difficult. Some of these interpretive difficulties apply across types of SCA, while others are unique to CoCA. In this paper, we propose a complete workflow for interpreting and analyzing CoCA output. We use the case of social identity schemas in a collection of over 13,000 U.S. political blog posts to outline a number of interpretive and analytical strategies and a robustness check to make sense of the cultural schemas recovered from texts.