

License: Creative Commons Zero v1.0 Universal (CC0-1.0)


German Generated Poetic Texts - GGPT

Goal and content

This repository publishes poetic texts in German generated by character-based recurrent neural networks.

At first sight, publishing computer-generated poetic texts seems pointless, since a computer can generate such texts in unlimited numbers. In practice, however, publication proves useful. Researchers may need generated texts for analysis, and such texts are difficult to obtain quickly: the models that generate them require software customization and specific versions of deep learning frameworks, and trained models are not always available. These obstacles can cost considerable effort. In addition, the generation process may require special technical skills, which limits access to such texts for scholars in the humanities.

This repository contains ready-to-use texts.

The models were trained on German hexameter verse and on poetry by Friedrich Hölderlin, Theodor Fontane, and Paul Celan.

One model was trained on the Celan texts, two each on the Fontane and Hölderlin texts, and three on the Hexameter texts. Each model was trained for a different number of epochs and has its own loss value.

For each model, ten samples were generated, each at least 28,000 characters long. All of them are presented in this repository.

Neural network architecture

Models were trained with the code developed by Andrej Karpathy for character-based multi-layer Recurrent Neural Networks (LSTM) in Torch.

Train sets

Training corpus   Characters   Lines
Hölderlin         415,516      10,677
Fontane           365,360      10,327
Celan             267,521      9,757
Hexameter         605,627      12,516

Hölderlin's poems were crawled from this web site.

Hexameter lines were extracted from a large collection of German verse maintained by Thomas Haider.

Data

For each model, ten samples were generated, one at each temperature from 0.1 to 1.0 in steps of 0.1. For an explanation of the temperature concept, see the original Karpathy repository.

Train Epoch Loss Temperature
Hölderlin 43.75 1.3026 0.1
Hölderlin 43.75 1.3026 0.2
Hölderlin 43.75 1.3026 0.3
Hölderlin 43.75 1.3026 0.4
Hölderlin 43.75 1.3026 0.5
Hölderlin 43.75 1.3026 0.6
Hölderlin 43.75 1.3026 0.7
Hölderlin 43.75 1.3026 0.8
Hölderlin 43.75 1.3026 0.9
Hölderlin 43.75 1.3026 1.0
Hölderlin 50.00 1.3049 0.1
Hölderlin 50.00 1.3049 0.2
Hölderlin 50.00 1.3049 0.3
Hölderlin 50.00 1.3049 0.4
Hölderlin 50.00 1.3049 0.5
Hölderlin 50.00 1.3049 0.6
Hölderlin 50.00 1.3049 0.7
Hölderlin 50.00 1.3049 0.8
Hölderlin 50.00 1.3049 0.9
Hölderlin 50.00 1.3049 1.0
Fontane 42.25 1.4736 0.1
Fontane 42.25 1.4736 0.2
Fontane 42.25 1.4736 0.3
Fontane 42.25 1.4736 0.4
Fontane 42.25 1.4736 0.5
Fontane 42.25 1.4736 0.6
Fontane 42.25 1.4736 0.7
Fontane 42.25 1.4736 0.8
Fontane 42.25 1.4736 0.9
Fontane 42.25 1.4736 1.0
Fontane 80.00 1.5189 0.1
Fontane 80.00 1.5189 0.2
Fontane 80.00 1.5189 0.3
Fontane 80.00 1.5189 0.4
Fontane 80.00 1.5189 0.5
Fontane 80.00 1.5189 0.6
Fontane 80.00 1.5189 0.7
Fontane 80.00 1.5189 0.8
Fontane 80.00 1.5189 0.9
Fontane 80.00 1.5189 1.0
Celan 46.30 1.5115 0.1
Celan 46.30 1.5115 0.2
Celan 46.30 1.5115 0.3
Celan 46.30 1.5115 0.4
Celan 46.30 1.5115 0.5
Celan 46.30 1.5115 0.6
Celan 46.30 1.5115 0.7
Celan 46.30 1.5115 0.8
Celan 46.30 1.5115 0.9
Celan 46.30 1.5115 1.0
hexameter 14.34 1.3988 0.1
hexameter 14.34 1.3988 0.2
hexameter 14.34 1.3988 0.3
hexameter 14.34 1.3988 0.4
hexameter 14.34 1.3988 0.5
hexameter 14.34 1.3988 0.6
hexameter 14.34 1.3988 0.7
hexameter 14.34 1.3988 0.8
hexameter 14.34 1.3988 0.9
hexameter 14.34 1.3988 1.0
hexameter 43.01 1.3479 0.1
hexameter 43.01 1.3479 0.2
hexameter 43.01 1.3479 0.3
hexameter 43.01 1.3479 0.4
hexameter 43.01 1.3479 0.5
hexameter 43.01 1.3479 0.6
hexameter 43.01 1.3479 0.7
hexameter 43.01 1.3479 0.8
hexameter 43.01 1.3479 0.9
hexameter 43.01 1.3479 1.0
hexameter 80.00 1.3702 0.1
hexameter 80.00 1.3702 0.2
hexameter 80.00 1.3702 0.3
hexameter 80.00 1.3702 0.4
hexameter 80.00 1.3702 0.5
hexameter 80.00 1.3702 0.6
hexameter 80.00 1.3702 0.7
hexameter 80.00 1.3702 0.8
hexameter 80.00 1.3702 0.9
hexameter 80.00 1.3702 1.0
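
As a rough illustration of what the Temperature column means, here is a minimal Python sketch of temperature-scaled sampling in its usual formulation: the model's raw scores are divided by the temperature before being normalized into probabilities. This is not the Torch code used to produce the samples; the function names, the toy vocabulary and the scores are all assumptions made for the example.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, rescaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_char(vocab, logits, temperature, rng=random):
    """Draw one character according to the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]


# Toy example (hypothetical scores, not taken from any trained model).
vocab = ["e", "n", "ö"]
logits = [2.0, 1.0, 0.1]

for t in (0.1, 0.5, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At low temperatures almost all probability mass falls on the highest-scoring character, so the output tends to be repetitive and conservative. At a temperature of 1.0 the distribution is left unscaled and the output is more varied.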

See also metadata in CSV format.
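
The table is a regular grid: eight models, each sampled at ten temperatures, which gives 80 samples. The Python sketch below rebuilds that grid from the epoch and loss values listed in the table and serializes it as CSV text. The column names `train`, `epoch`, `loss` and `temperature` are assumptions for illustration; the metadata file in the repository may use a different layout.

```python
import csv
import io

# (training corpus, epoch, loss) for each model, copied from the table.
MODELS = [
    ("Hölderlin", 43.75, 1.3026),
    ("Hölderlin", 50.00, 1.3049),
    ("Fontane", 42.25, 1.4736),
    ("Fontane", 80.00, 1.5189),
    ("Celan", 46.30, 1.5115),
    ("hexameter", 14.34, 1.3988),
    ("hexameter", 43.01, 1.3479),
    ("hexameter", 80.00, 1.3702),
]

# Temperatures 0.1, 0.2, ..., 1.0 (rounded to avoid float noise).
TEMPERATURES = [round(0.1 * i, 1) for i in range(1, 11)]

# Hypothetical column names, chosen for this sketch only.
FIELDS = ["train", "epoch", "loss", "temperature"]


def build_rows():
    """Return one record per (model, temperature) combination."""
    return [
        {"train": train, "epoch": epoch, "loss": loss, "temperature": t}
        for train, epoch, loss in MODELS
        for t in TEMPERATURES
    ]


def to_csv(rows):
    """Serialize the records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = build_rows()
print(len(rows), "samples")
print(to_csv(rows[:3]))
```

Filtering such records by `train` or `temperature` is one way to select a subset of samples for analysis.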

Papers

The Hölderlin texts were generated for the poet's anniversary in 2020. See the paper.

Models

Models are published on Hugging Face:

Citation

If you find this repository useful, please cite it with its URL.

@misc{orekhovboris2020ggpt,
    author = {Boris Orekhov},
    title = {German Generated Poetic Texts},
    howpublished = {\url{https://github.com/nevmenandr/german-generated-poetic-texts}},
    year = {2022}
}