
Word frequencies for all the texts (~10,600) in `as is` and `normalized` versions


  • 1gram : folder with word frequencies for each text (as is, with no normalization of orthography);

  • 1gram_NRM : folder with word frequencies for each text (normalized: alifs simplified into one form; carriers of medial and final hamzaŧs removed);

  • 1gram_NRM_Lengths : file with the length of each text in words.
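
The difference between the as-is and normalized counts can be sketched in Python. This is a minimal illustration, not the script that produced these files: the function names, the whitespace tokenization, and the exact character mappings (alif variants to bare alif; hamza-carrying waw and ya to a standalone hamza) are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumed mapping: alif with madda, hamza above, hamza below, and wasla -> bare alif.
ALIF_VARIANTS = re.compile("[\u0622\u0623\u0625\u0671]")
# Assumed mapping: waw-with-hamza and ya-with-hamza -> standalone hamza (carrier dropped).
HAMZA_CARRIERS = re.compile("[\u0624\u0626]")


def normalize(text):
    """Return text with alifs unified and hamza carriers removed (hypothetical rules)."""
    text = ALIF_VARIANTS.sub("\u0627", text)
    return HAMZA_CARRIERS.sub("\u0621", text)


def word_frequencies(text, normalized=False):
    """Count whitespace-separated tokens, optionally after normalization."""
    if normalized:
        text = normalize(text)
    return Counter(text.split())


def text_length(text):
    """Length of a text in words, i.e. the sum of all token counts."""
    return sum(word_frequencies(text).values())
```

Under these assumptions, two spellings that differ only in the alif form are counted separately with `normalized=False` and as a single word with `normalized=True`, while the text length in words is the same in both cases.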