headline-project

Creating a newspaper headline corpus

Primary language: Python

Newspaper corpora as of 7 February 2021 (no preprocessing)

  • Handelsblatt (handelsblatt): 2346 headlines
    • 2013 - 2021
    • over- and sublines incomplete
  • Frankfurter Allgemeine (faz): 4825 headlines
    • 2013 - 2021
    • over- and sublines incomplete
  • DIE WELT (welt): 3697 headlines
    • 2013 - 2021
    • overlines complete, no sublines
  • neues deutschland (nd): 1603 headlines
    • 2013 - 2021
    • overlines incomplete, no sublines
  • die tageszeitung (taz): 585 headlines
    • 2019 - 2021
    • overlines incomplete, no sublines
  • JUNGE FREIHEIT (jf): 1502 headlines
    • 2013 - 2021
    • overlines incomplete, no sublines
  • Bild (bild): 3222 headlines
    • 2013 - 2021
    • no overlines, no sublines
  • Süddeutsche Zeitung (sueddeutsche): 1143 headlines
    • 2020 - 2021
    • overlines complete, no sublines
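
The counts above can be summarized with a few lines of Python. This is only a minimal sketch: the dictionary `HEADLINE_COUNTS` and the helpers `total_headlines` and `share_per_paper` are hypothetical names chosen for illustration and are not taken from the repository's code. The keys are the short names given in parentheses in the list, and the values are the listed headline counts.

```python
# Headline counts per newspaper, copied from the list above.
# NOTE: this dictionary layout is an illustrative assumption,
# not the repository's actual data format.
HEADLINE_COUNTS = {
    "handelsblatt": 2346,
    "faz": 4825,
    "welt": 3697,
    "nd": 1603,
    "taz": 585,
    "jf": 1502,
    "bild": 3222,
    "sueddeutsche": 1143,
}


def total_headlines(counts):
    """Return the number of headlines across all newspapers."""
    return sum(counts.values())


def share_per_paper(counts):
    """Return each newspaper's fraction of the whole corpus."""
    total = total_headlines(counts)
    return {paper: n / total for paper, n in counts.items()}


if __name__ == "__main__":
    print(f"total: {total_headlines(HEADLINE_COUNTS)} headlines")
    shares = share_per_paper(HEADLINE_COUNTS)
    # List the newspapers from largest to smallest share.
    for paper in sorted(shares, key=shares.get, reverse=True):
        print(f"{paper}: {shares[paper]:.1%}")
```

With the counts listed above, the corpus totals 18923 headlines, and faz contributes the largest share. Because taz and sueddeutsche cover shorter periods than the other newspapers, their smaller shares are not directly comparable.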