ryancahildebrandt/hanakotoba

Exploring 花言葉 in Japanese and other literary corpora

Literature in Bloom

Purpose

A project to explore 花言葉 (hanakotoba, lit. flower language) in Japanese and other literary corpora.

Dataset

The data for this project was pulled from the following sources:

Aozora Bunko Corpus for Japanese full text works
Hanakotoba for flower names, translations, and associated characteristics
Wikipedia for conversions of Japanese decimal classification codes (分類番号)
Wikipedia for a list of major Japanese eras (時代)
This page for a list of sub-eras (元年)

Some of these sources didn't end up being necessary for the main project, but they are included with the accompanying code for genre and date conversions.
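
The date conversions mentioned above come down to mapping a Japanese era year onto the Gregorian calendar. The following is a minimal sketch of that arithmetic, not the project's actual code: the `ERA_START` table and the `era_to_gregorian` function are hypothetical names introduced for illustration, and only the five modern eras are covered. Because the first year of an era (元年) counts as year 1, the Gregorian year is the era's starting year plus the era year minus one.

```python
# Hypothetical lookup table (not taken from Jidai.csv or Gannen.csv):
# era name -> Gregorian year in which that era's first year (元年) falls.
ERA_START = {
    "明治": 1868,  # Meiji
    "大正": 1912,  # Taisho
    "昭和": 1926,  # Showa
    "平成": 1989,  # Heisei
    "令和": 2019,  # Reiwa
}


def era_to_gregorian(era: str, year: int) -> int:
    """Convert an era name and era year to a Gregorian year.

    The first year of an era (元年) is year 1, so one is subtracted
    after adding the era year to the era's starting year.
    """
    if year < 1:
        raise ValueError("era years start at 1 (元年)")
    if era not in ERA_START:
        raise KeyError(f"unknown era: {era}")
    return ERA_START[era] + year - 1


if __name__ == "__main__":
    # 昭和20年 (Showa 20)
    print(era_to_gregorian("昭和", 20))
```

A real conversion for older works would need the full era and sub-era tables rather than this short dictionary.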

Outputs

The main report, compiled with Datapane and also available in HTML format
Historical era dataframe : Jidai.csv
Sub-era dataframe : Gannen.csv
Japanese genre code dataframe : Genres.csv
Dataframe of all flowers/plants and associated characteristics : Hk_df.csv
Dataframe with all text metainfo, calculated date columns, and tagged flower occurrences with locations in the text : All_df.csv
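
To give a sense of what "tagged flower occurrences with locations in the text" can look like, here is a small sketch that finds flower names in a string and records their character offsets. It is an illustration under assumptions, not necessarily how All_df.csv was produced: the `tag_flowers` function, the sample sentence, and the short name list are all hypothetical, and a real run would draw the names from the flower dataframe.

```python
import re


def tag_flowers(text, flower_names):
    """Return (flower_name, start_offset) pairs for each match in text.

    Offsets are zero-based character positions. Longer names are tried
    first so that, for example, "桜草" is not tagged as just "桜".
    """
    # Deduplicate, then sort longest first (ties broken alphabetically).
    names = sorted(set(flower_names), key=lambda n: (-len(n), n))
    if not names:
        return []
    # Escape each name so it is matched literally inside the alternation.
    pattern = re.compile("|".join(re.escape(n) for n in names))
    return [(m.group(), m.start()) for m in pattern.finditer(text)]


# Hypothetical sample text and flower list, for illustration only.
sample = "庭の桜が散り、菊が咲いた。桜草も見える。"
hits = tag_flowers(sample, ["桜", "菊", "桜草"])

if __name__ == "__main__":
    for name, offset in hits:
        print(name, offset)
```

Plain substring matching like this will also tag flower characters that appear inside unrelated words or personal names, so a fuller approach might tokenize the text first.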