This repository contains the data generated for the paper "Strategies for the Analysis of Large Social Media Corpora: Sampling and Keyword Extraction Methods". We used a large Twitter dataset to extract keywords with two different methods (SketchEngine and TextRank), each applied to samples of different sizes.
If you use the data in this repository, please cite the paper:
@article{moreno-ortizStrategiesAnalysisLarge2023,
title = {Strategies for the {{Analysis}} of {{Large Social Media Corpora}}: {{Sampling}} and {{Keyword Extraction Methods}}},
author = {{Moreno-Ortiz}, Antonio and {Garc{\'i}a-G{\'a}mez}, Mar{\'i}a},
year = {2023},
month = sep,
journal = {Corpus Pragmatics},
volume = {7},
number = {3},
pages = {241--265},
issn = {2509-9515},
doi = {10.1007/s41701-023-00143-0}
}
- Antonio Moreno-Ortiz @Diverking