A project to explore 花言葉 (hanakotoba, lit. flower language) in Japanese and other literary corpora.
The data used for the current project were pulled from the following sources:
- Aozora Bunko Corpus for Japanese full-text works
- Hanakotoba for flower names, translations, and associated characteristics
- Wikipedia for conversions of Japanese decimal classification codes (分類番号)
- Wikipedia for a list of major Japanese eras (時代)
- This page for a list of sub-eras (元年)

Some of these didn't end up being necessary for the main project, but they are included with the accompanying code for genre and date conversions.
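
As a rough illustration of the kind of date conversion these era tables support, here is a minimal sketch that turns an era name and era year into a Gregorian year. The first year of an era (元年) counts as year 1, so the Gregorian year is the era's first year plus the era year minus one. The `ERA_FIRST_YEAR` mapping and the `era_to_gregorian` function are hypothetical names used only for this example, and only the five modern eras are hard-coded; the project's own conversion code and the contents of its era tables may differ.

```python
# Gregorian year of the first year (元年) of each modern era.
# Assumption: hard-coded here for illustration only; the project's
# era tables may cover more eras and use a different layout.
ERA_FIRST_YEAR = {
    "明治": 1868,  # Meiji
    "大正": 1912,  # Taisho
    "昭和": 1926,  # Showa
    "平成": 1989,  # Heisei
    "令和": 2019,  # Reiwa
}


def era_to_gregorian(era: str, era_year: int) -> int:
    """Convert an era name and a 1-based era year to a Gregorian year."""
    if era not in ERA_FIRST_YEAR:
        raise KeyError(f"unknown era: {era}")
    if era_year < 1:
        raise ValueError("era years start at 1 (元年)")
    # Year 1 of the era is the era's first Gregorian year.
    return ERA_FIRST_YEAR[era] + era_year - 1


if __name__ == "__main__":
    print(era_to_gregorian("昭和", 20))
```

Because era changes happen mid-year, a single Gregorian year can belong to two eras (for example, 大正 15 and 昭和 1 both map to 1926), so this year-level conversion is lossy in that direction.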
The project's outputs are:
- The main report, compiled with datapane and also available in HTML format
- Historical era dataframe : Jidai.csv
- Sub-era dataframe : Gannen.csv
- Japanese genre code dataframe : Genres.csv
- Dataframe of all flowers/plants and associated characteristics : Hk_df.csv
- Dataframe with all text metadata, calculated date columns, and tagged flower occurrences with their locations in the text : All_df.csv
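
To inspect these files, a minimal sketch along the following lines may help. It uses only the Python standard library and makes no assumptions about column names. The `load_tables` function, the directory argument, and the UTF-8 encoding are assumptions for illustration, not part of the project's code.

```python
import csv
from pathlib import Path

# File names as listed above. Where they live in the repository is an
# assumption, so the directory is passed in by the caller.
CSV_FILES = ["Jidai.csv", "Gannen.csv", "Genres.csv", "Hk_df.csv", "All_df.csv"]


def load_tables(data_dir):
    """Return {file stem: list of row dicts} for each listed CSV found in data_dir."""
    tables = {}
    for name in CSV_FILES:
        path = Path(data_dir) / name
        if not path.exists():
            continue  # skip files that are not present
        # Assumption: the files are UTF-8 encoded with a header row.
        with path.open(newline="", encoding="utf-8") as handle:
            tables[path.stem] = list(csv.DictReader(handle))
    return tables


if __name__ == "__main__":
    for stem, rows in load_tables(".").items():
        print(stem, len(rows), "rows")
```

With pandas installed, `pandas.read_csv(path)` would be the more natural way to get each file back as a dataframe.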