RussianNovels

A dataset of around 100 Russian novels for stylometric experiments.

Please note that this is NOT a benchmark corpus. The collection includes texts written mostly in the 19th and 20th centuries by both male and female authors. The files have been tested and are formatted to work well with stylo (UTF-8 without BOM), but the collection IS NOT balanced for genre, number of texts per author, or text length.
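If you add or re-save files, it may be worth confirming that they are still UTF-8 without a BOM. The following is a minimal sketch of such a check, not part of the dataset itself; the function names and the assumption that the texts are `.txt` files in a single folder are illustrative only.

```python
import codecs
from pathlib import Path


def is_clean_utf8(raw: bytes) -> bool:
    """Return True if the bytes are valid UTF-8 and do not start with a BOM."""
    if raw.startswith(codecs.BOM_UTF8):
        return False
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_problem_files(corpus_dir: str) -> list[str]:
    """List the names of .txt files (assumed layout) that fail the check."""
    return sorted(
        path.name
        for path in Path(corpus_dir).glob("*.txt")
        if not is_clean_utf8(path.read_bytes())
    )
```

Calling `find_problem_files` on the folder that holds the texts returns an empty list when every file passes.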

The intended users of this dataset are newcomers to stylometry who want to test their skills on Russian texts. Such users are encouraged to select a subset of the texts that interest them (e.g. 20 texts by 5 authors in total) and experiment with various methods.
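One way to draw such a subset is sketched below. It assumes that file names put the author before the first underscore (e.g. a hypothetical `Author_Title.txt`), which is the naming pattern stylo uses to infer classes; check the actual file names in this collection before relying on it. The function names and default values are illustrative.

```python
import random
from collections import defaultdict


def group_by_author(filenames):
    """Group file names by the part before the first underscore (assumed author)."""
    groups = defaultdict(list)
    for name in filenames:
        author = name.split("_", 1)[0]
        groups[author].append(name)
    return dict(groups)


def pick_subset(filenames, n_authors=5, texts_per_author=4, seed=0):
    """Pick n_authors authors at random, then texts_per_author texts from each."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    groups = group_by_author(filenames)
    # Only authors with enough texts can be sampled evenly.
    eligible = sorted(a for a, texts in groups.items() if len(texts) >= texts_per_author)
    if len(eligible) < n_authors:
        raise ValueError("not enough authors with the requested number of texts")
    subset = []
    for author in sorted(rng.sample(eligible, n_authors)):
        subset.extend(sorted(rng.sample(sorted(groups[author]), texts_per_author)))
    return subset
```

With the defaults this yields 20 file names by 5 authors. Because the collection is not balanced, some authors may have too few texts to qualify, so lowering `texts_per_author` may be needed.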