
Markov Text Generator: a finite-context model based text generator.


Markov Text Generator

Description

The programs, implemented in C++, collect statistical information about texts using Markov (finite-context) models and automatically generate text that follows the models built.

Besides describing the solution, we discuss how varying the programs' parameters affects the results and compare different types of text by the average amount of information they carry.
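To make "average amount of information" concrete, here is a minimal sketch of how the entropy of an order-k finite-context model can be estimated from a text. It is not taken from the repository's source: the names `Counts`, `count_contexts` and `model_entropy`, the additive smoothing parameter `alpha`, and the explicit `alphabet_size` argument are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical type: for each k-character context, how often each symbol follows it.
using Counts = std::map<std::string, std::map<char, std::size_t>>;

// Slide a window of k characters over the text and count the symbol that follows.
Counts count_contexts(const std::string& text, std::size_t k) {
    Counts counts;
    for (std::size_t i = 0; i + k < text.size(); ++i) {
        counts[text.substr(i, k)][text[i + k]]++;
    }
    return counts;
}

// Estimated entropy in bits per symbol. Probabilities use additive smoothing
// (an assumption): P(s | c) = (n(c, s) + alpha) / (n(c) + alpha * alphabet_size).
double model_entropy(const Counts& counts, std::size_t alphabet_size, double alpha) {
    assert(alpha > 0.0 && alphabet_size > 0);

    // Total number of (context, symbol) observations, used to weight each context.
    double total = 0.0;
    for (const auto& ctx : counts)
        for (const auto& sym : ctx.second) total += static_cast<double>(sym.second);

    double entropy = 0.0;
    for (const auto& ctx : counts) {
        const auto& row = ctx.second;
        assert(row.size() <= alphabet_size);

        double ctx_total = 0.0;
        for (const auto& sym : row) ctx_total += static_cast<double>(sym.second);
        const double denom = ctx_total + alpha * static_cast<double>(alphabet_size);

        // Entropy of the symbol distribution that follows this context.
        double h_ctx = 0.0;
        for (const auto& sym : row) {
            const double p = (static_cast<double>(sym.second) + alpha) / denom;
            h_ctx -= p * std::log2(p);
        }
        // Symbols never seen after this context each get probability alpha / denom.
        const double unseen = static_cast<double>(alphabet_size - row.size());
        if (unseen > 0.0) {
            const double p = alpha / denom;
            h_ctx -= unseen * p * std::log2(p);
        }
        // Weight by how often the context occurs.
        entropy += (ctx_total / total) * h_ctx;
    }
    return entropy;
}
```

For example, `model_entropy(count_contexts(text, 3), alphabet_size, 0.5)` would estimate the entropy of `text` under an order-3 model. A lower value means the next character is more predictable from its context.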

Main Programs

  • fcm: Parses a list of files to train the model and reports the entropy of the texts in those files.

  • generator: Parses a list of files to train the model and writes generated text to a file, based on the occurrence statistics of contexts and of the letters that follow each context.
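The generation step described above can be sketched as follows. This is a hedged illustration, not the repository's code: `Counts`, `count_contexts` and `generate` are hypothetical names, sampling is proportional to raw counts (no smoothing), and stopping at an unseen context is one possible policy among several.

```cpp
#include <cstddef>
#include <map>
#include <random>
#include <string>
#include <vector>

// Hypothetical type: for each k-character context, how often each symbol follows it.
using Counts = std::map<std::string, std::map<char, std::size_t>>;

// Slide a window of k characters over the text and count the symbol that follows.
Counts count_contexts(const std::string& text, std::size_t k) {
    Counts counts;
    for (std::size_t i = 0; i + k < text.size(); ++i) {
        counts[text.substr(i, k)][text[i + k]]++;
    }
    return counts;
}

// Returns the seed followed by up to `length` generated characters.
// The seed doubles as the initial context, so its size should equal the model order k.
std::string generate(const Counts& counts, const std::string& seed,
                     std::size_t length, std::mt19937& rng) {
    std::string output = seed;
    std::string context = seed;

    for (std::size_t n = 0; n < length; ++n) {
        const auto it = counts.find(context);
        if (it == counts.end()) break;  // context never seen in training: stop early

        // Build parallel lists of candidate symbols and their observed counts.
        std::vector<char> symbols;
        std::vector<double> weights;
        for (const auto& sym : it->second) {
            symbols.push_back(sym.first);
            weights.push_back(static_cast<double>(sym.second));
        }

        // Draw the next symbol with probability proportional to its count.
        std::discrete_distribution<std::size_t> pick(weights.begin(), weights.end());
        const char next = symbols[pick(rng)];
        output.push_back(next);

        // Shift the context window by one character (no-op for an order-0 model).
        if (!context.empty()) {
            context.erase(0, 1);
            context.push_back(next);
        }
    }
    return output;
}
```

A caller would train with `count_contexts(training_text, k)`, seed a `std::mt19937`, and pass a k-character starting context. Adding a smoothing term to the weights would let the generator also emit symbols never observed after a given context.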

Instructions to Build and Run

  • fcm

    Compile: make fcm

    Run: ./fcm 3 0.5 input/small_alice_oz.txt

  • generator

    Compile: make generator

    Run: ./generator 4 0.5 "init" 100 outputFile.txt input/small_alice_oz.txt

Both fcm and generator take mandatory arguments. For a description of them, run either program with the -h option.

To generate the HTML documentation, run make docs.

Repository Structure

/input - textual dataset used for model training

/output - knowledge gathered from the output of the main programs (graphs, entropy values for different parameters, generated texts)

/report - written report explaining our design choices and presenting the results of parameter variation and the conclusions drawn from them

/src - source code, written in C/C++

Authors

The authors of this repository are André Pedrosa, Filipe Pires and João Alegria. The project was developed for the Algorithmic Theory of Information course of the Master's degree in Informatics Engineering at the University of Aveiro.

For further information, please read our report.