Towards quantifying parliamentary discourse - code

Primary language: R

TQPD-Czech-Parliament

Updated repository for "Towards quantifying parliamentary discourse" (2019). Older versions can be found here.

What's in here

  • code to scrape 9 years of Czech parliamentary discussion, consisting of 384,430 speeches. Scraped from: http://www.psp.cz/eknih/
  • tidying, structuring, stop word removal and adding metadata about MPs
  • tokenization and lemmatization using udpipe
  • LDA topic modelling using Vowpal Wabbit
  • measuring topic usage over time using Kullback–Leibler divergence.
    • novelty: how different (surprising) a speech was from the previous w speeches
    • resonance: how lasting (impactful) a speech was over the next w speeches
  • modelling the relationship between a speech's novelty and resonance using linear mixed-effects models.
  • a few plots
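
The novelty and resonance measures listed above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the repository's R code. It assumes the definitions from Barron et al. (2018): novelty is the mean KL divergence from a speech's topic mixture to each of the previous w mixtures, transience is the same quantity computed against the next w mixtures, and resonance is novelty minus transience. The function names and the toy topic mixtures are hypothetical, and every topic probability is assumed to be strictly positive, as is typical for smoothed LDA output.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits for two discrete distributions of equal length.

    Assumes q[i] > 0 wherever p[i] > 0 (true for smoothed LDA mixtures).
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def novelty_transience_resonance(topics, w):
    """Return one (novelty, transience, resonance) tuple per speech.

    topics: list of topic mixtures, one per speech, in chronological order.
    w: window size (number of speeches to look back and ahead).
    Speeches without a full window on both sides get None.
    """
    n = len(topics)
    results = []
    for j in range(n):
        if j < w or j + w >= n:
            results.append(None)
            continue
        # Mean divergence from speech j to each of the w previous speeches.
        novelty = sum(
            kl_divergence(topics[j], topics[j - d]) for d in range(1, w + 1)
        ) / w
        # Mean divergence from speech j to each of the w following speeches.
        transience = sum(
            kl_divergence(topics[j], topics[j + d]) for d in range(1, w + 1)
        ) / w
        results.append((novelty, transience, novelty - transience))
    return results


# Toy example (hypothetical data): the third speech shifts the topic mix
# and the following speeches keep the new mix.
toy_topics = [
    [0.5, 0.5],
    [0.5, 0.5],
    [0.9, 0.1],
    [0.9, 0.1],
    [0.9, 0.1],
]
toy_results = novelty_transience_resonance(toy_topics, w=1)
```

In this toy example the third speech differs from its predecessor (positive novelty) and matches its successor (zero transience), so its resonance is positive. The second speech gets the opposite sign: it repeats what came before and is then left behind.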

Paper introducing the novelty, transience and resonance measures: Barron, Huang, Spang & DeDeo (2018).
Many thanks to Malte Lau Petersen!