Two interlocking experiments on literary change, which set out to illuminate the relative importance of cohort effects, event-driven period effects, and longitudinal change across an individual writer's career.
Ted Underwood, Kevin Kiley, Wenyi Shang, Stephen Vaisey.
The two experiments are:
- A regression experiment investigates the relationship between cohort effects and period effects in fiction 1890-1989.
- Structural equation models compare the relative importance of active updating and settled dispositions across individual careers.
The project has a preregistration on the Open Science Framework at 10.17605/OSF.IO/4E2K7.
Broadly speaking, metadata construction is documented in /dataconstruction, and data construction in /get_texts. The final state of the metadata is in /metadata/finalcorpus.tsv.
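As a minimal sketch of how a tab-separated metadata file such as /metadata/finalcorpus.tsv might be inspected, the helpers below use only the Python standard library. The function names are hypothetical and not part of this repository, and no column names are assumed: the header row of the file itself determines the keys.

```python
import csv
from collections import Counter


def load_tsv(path):
    """Read a tab-separated file into a list of dicts keyed by its header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def count_by_column(rows, column):
    """Count how many rows share each distinct value of the given column."""
    return Counter(row[column] for row in rows)
```

Run from the repository root, `load_tsv("metadata/finalcorpus.tsv")` should return one dict per row of the metadata. Since the column names are not listed here, check the keys of the first row before passing a column name to `count_by_column`.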
The development of a topic model is documented in /modelselection, and the process of coding topics in /interrater. The topics are most fully documented in topic_summary.tsv (which contains both the coding performed before the experiment and the results of experiments 1 and 2 above).
For (scrambled) texts of the documents modeled, and a full doc-topics file covering all the texts at "chunk level," see the Zenodo dataset "Topic model of English-language fiction, 1880-1999, with 200 topics." Those files are much too large for a git repository.
The /regression folder documents our regression experiment, and the /sem-topics folder documents structural equation modeling on sequences across a single writer's career. /tripletdistance is an alternate way of thinking about change across individual careers.