Russian_frequency_lists

Collection of word lists with frequencies

License: Creative Commons Zero v1.0 Universal (CC0-1.0)

Russian_frequency_lists_for_children

This repository contains word lists that have been created from several corpora.

Wordlist_Detcorpus_50000 is a list of 50,000 lemmas with their frequencies from DetCorpus, a corpus of Russian literature for children. The corpus includes more than 2,097 prose works written in Russian between the 1920s and 2010s and aimed at children and adolescents.

Wordlist_Detcorpus_nonfiction is a list of the 20,000 most frequent lemmas from the non-fiction subcorpus of DetCorpus.

Columns in the word lists

lemma is the normalized (dictionary) form of a word. Lemmatization was performed with the Mystem analyzer.

abs_frequency is the raw, absolute frequency: the number of times the lemma occurs in the corpus.

ipm (items per million) is the normalized frequency: the number of occurrences of the lemma per one million words of the corpus.
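The relation between abs_frequency and ipm can be sketched in Python as follows. This is a minimal illustration, not the script used to build the lists. It assumes the word lists are tab-separated text with a header row naming the three columns, which may not match the actual files. The function names, the sample rows and the corpus size of 2,000,000 tokens are all made up for the example.

```python
import csv
import io


def ipm(abs_frequency: int, corpus_size: int) -> float:
    """Return occurrences per one million corpus tokens."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return abs_frequency * 1_000_000 / corpus_size


def read_wordlist(handle, delimiter="\t"):
    """Parse a word list into dicts with typed values.

    Assumption: the file has a header row with the column names
    lemma, abs_frequency and ipm, separated by `delimiter`.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    return [
        {
            "lemma": row["lemma"],
            "abs_frequency": int(row["abs_frequency"]),
            "ipm": float(row["ipm"]),
        }
        for row in reader
    ]


# Made-up rows for illustration only; these are not values from DetCorpus.
SAMPLE = "lemma\tabs_frequency\tipm\nслово\t500\t250.0\nдом\t100\t50.0\n"
HYPOTHETICAL_CORPUS_SIZE = 2_000_000  # assumed token count, not the real one

rows = read_wordlist(io.StringIO(SAMPLE))
for row in rows:
    recomputed = ipm(row["abs_frequency"], HYPOTHETICAL_CORPUS_SIZE)
    print(row["lemma"], row["abs_frequency"], recomputed)
```

To read a real file, pass an open file object instead of the in-memory sample, for example `open(path, encoding="utf-8", newline="")`, and adjust `delimiter` if the file is comma-separated.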