- Capture the data from Google Ngram Viewer, then use R to export clear graphs. With the computer connected to the internet, run the following in the terminal, passing the spelling variants of a compound as a comma-separated query (e.g. hair-cut,hair cut,haircut): `python getngrams.py hair-cut,hair cut,haircut -startYear=1800 -endYear=2008 -smoothing=3 > haircut.csv`;
- Import the resulting *.csv file into R;
- Run the R script to produce the graph.
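
When many compounds need to be queried, the command from the first step can be assembled programmatically. The following is only a minimal sketch, not part of the original scripts: the helper names `spelling_variants`, `build_command`, and `fetch` are hypothetical, and it assumes that `getngrams.py` sits in the working directory, accepts the query as a single argument, and writes its results to standard output as in the command above.

```python
import subprocess
import sys
from pathlib import Path


def spelling_variants(first, second):
    """Return the hyphenated, spaced, and closed spellings of a two-part compound."""
    return [f"{first}-{second}", f"{first} {second}", f"{first}{second}"]


def build_command(first, second, start_year=1800, end_year=2008, smoothing=3):
    """Build the argument list for one getngrams.py call (hypothetical helper)."""
    query = ",".join(spelling_variants(first, second))
    return [
        sys.executable, "getngrams.py", query,
        f"-startYear={start_year}",
        f"-endYear={end_year}",
        f"-smoothing={smoothing}",
    ]


def fetch(first, second, out_dir="."):
    """Run getngrams.py and redirect its output to <closed spelling>.csv.

    Requires an internet connection; assumes the script prints CSV to stdout.
    """
    out_path = Path(out_dir) / f"{first}{second}.csv"
    with open(out_path, "w", encoding="utf-8") as handle:
        subprocess.run(build_command(first, second), stdout=handle, check=True)
    return out_path


if __name__ == "__main__":
    # Only print the command here; call fetch("hair", "cut") to actually run it.
    print(build_command("hair", "cut"))
```

Calling `fetch("hair", "cut")` would then be expected to produce `haircut.csv`, ready to be imported into R in the next step.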
The corpus includes more than 1000 hyphenated compounds in English with their historical frequencies.
@article{sun2021hyphenation,
title={Hyphenation as a compounding technique in English},
author={Sun, Kun and Baayen, R Harald},
journal={Language Sciences},
volume={83},
pages={101326},
year={2021},
publisher={Elsevier}
}