/kanteletar

Preprocessing scripts for Kanteletar from Project Gutenberg.

Primary language: Python

Kanteletar preprocessing scripts

This repository contains the freely available version of Kanteletar from Project Gutenberg, along with scripts that fix the errors in the text file and preprocess it into either CSV or an SKVR-like XML format.
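As a rough illustration of what a text-to-CSV preprocessing step can look like, here is a minimal sketch. It is not the repository's actual code: the function names `verses_to_rows` and `rows_to_csv`, the column names, and the assumption that poems are separated by blank lines with one verse per line are all hypothetical.

```python
import csv
import io


def verses_to_rows(text):
    """Split plain text into (poem_id, verse_no, verse) tuples.

    Assumption (hypothetical layout): poems are separated by blank
    lines and each non-empty line is one verse.
    """
    rows = []
    poem_id = 0
    verse_no = 0
    in_poem = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current poem.
            in_poem = False
            continue
        if not in_poem:
            # First verse of a new poem: bump the poem counter.
            poem_id += 1
            verse_no = 0
            in_poem = True
        verse_no += 1
        rows.append((poem_id, verse_no, line))
    return rows


def rows_to_csv(rows):
    """Serialize the rows to a CSV string with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["poem_id", "verse_no", "text"])  # hypothetical column names
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder input, not actual Kanteletar text.
    sample = "first verse\nsecond verse\n\nthird verse\n"
    print(rows_to_csv(verses_to_rows(sample)), end="")
```

Keeping the parsing and the serialization in separate functions means the same rows could also feed an XML writer, if the input really follows the assumed layout.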

Author: Maciej Janicki, University of Helsinki

License: public domain