compound-in-history

A corpus of English hyphenated compounds and their historical frequencies


Hyphenation as a compounding technique

Run the scripts to capture historical frequencies and plot them

  1. With the computer connected to the internet, run getngrams.py in the terminal to capture the data from Google Ngram Viewer. Pass the spelling variants of a compound as a comma-separated list (e.g. hair-cut,hair cut,haircut) and redirect the output to a CSV file: "python getngrams.py hair-cut,hair cut,haircut -startYear=1800 -endYear=2008 -smoothing=3 > haircut.csv";
  2. Import the resulting *.csv file into R;
  3. Run the R script to produce the graph.
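
Before moving to R, it can help to check a captured file quickly. The following is a minimal Python sketch, not part of the repository's scripts. It assumes the CSV has a header row with a year column followed by one column per spelling variant; the actual layout written by getngrams.py may differ. The function names and the sample numbers are hypothetical and only illustrate the idea of finding which variant is most frequent in each year.

```python
import csv
import io


def load_frequencies(csv_text):
    """Parse CSV text into (years, {variant: [frequency, ...]}).

    Assumed layout (not confirmed by the repository): the first column
    is the year, and every other column is one spelling variant.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    variants = [name.strip() for name in header[1:]]
    years = []
    series = {name: [] for name in variants}
    for row in reader:
        if not row:  # skip blank lines
            continue
        years.append(int(row[0]))
        for name, value in zip(variants, row[1:]):
            series[name].append(float(value))
    return years, series


def dominant_variant_by_year(years, series):
    """Return {year: variant with the highest frequency in that year}."""
    result = {}
    for i, year in enumerate(years):
        result[year] = max(series, key=lambda name: series[name][i])
    return result


# Made-up numbers for illustration only; these are NOT real Ngram data.
SAMPLE = """year,hair-cut,hair cut,haircut
1800,0.5,0.3,0.1
1900,0.4,0.2,0.6
2000,0.1,0.1,0.9
"""

if __name__ == "__main__":
    years, series = load_frequencies(SAMPLE)
    for year, name in dominant_variant_by_year(years, series).items():
        print(year, name)
```

To use it on a real capture, read the file contents (for example haircut.csv) into a string and pass that string to load_frequencies in place of SAMPLE, adjusting the parsing if the column layout differs from the assumption above.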

Historical hyphenated compounds corpus

The corpus includes more than 1000 hyphenated compounds in English with their historical frequencies.

Cite:

@article{sun2021hyphenation,
  title={Hyphenation as a compounding technique in English},
  author={Sun, Kun and Baayen, R Harald},
  journal={Language Sciences},
  volume={83},
  pages={101326},
  year={2021},
  publisher={Elsevier}
}