This repository publishes poetic texts in German generated by character-based recurrent neural networks.
At first sight, publishing computer-generated poetic texts seems pointless, since a computer can generate such texts in unlimited numbers. In practice, however, publication is useful. Researchers may need generated texts for analysis, and such texts are difficult to obtain quickly. The models that generate them require software customization and specific versions of deep learning frameworks, and the trained models themselves are not always available. In addition, the generation process may require special technical skills, which limits the ability of humanities scholars to work with such texts.
This repository contains ready-to-use texts.
The models were trained on German hexameter verse and on poetry by Friedrich Hölderlin, Theodor Fontane, and Paul Celan.
One model was trained on the Celan texts, two each on the Fontane and Hölderlin texts, and three on the Hexameter texts. Each model was trained for its own number of epochs and has its own loss value.
Ten samples, each at least 28,000 characters long, were generated for each model and are presented in this repository.
Models were trained with the code developed by Andrej Karpathy for character-based multi-layer Recurrent Neural Networks (LSTM) in Torch.
Training corpus | Characters | Lines |
---|---|---|
Hölderlin | 415,516 | 10,677 |
Fontane | 365,360 | 10,327 |
Celan | 267,521 | 9,757 |
Hexameter | 605,627 | 12,516 |
Hölderlin's poems were crawled from this web site.
Hexameter lines were extracted from a large collection of German verse run by Thomas Haider.
Ten samples, each with a different temperature, were generated for each model. For an explanation of the temperature concept, see the original Karpathy repository.
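As a rough illustration of what the temperature setting does, here is a minimal Python sketch of temperature-scaled sampling. It is not the Torch code used to produce the texts in this repository; the function names `temperature_softmax` and `sample_index` and the example scores are hypothetical and serve only to show the general technique.

```python
import math
import random


def temperature_softmax(logits, temperature):
    """Turn unnormalized scores into probabilities at a given temperature.

    Hypothetical helper for illustration only. Lower temperatures sharpen
    the distribution; a temperature of 1.0 leaves it unchanged.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_index(probs, rng=random):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Made-up scores for three candidate characters.
scores = [2.0, 1.0, 0.0]
for t in (0.1, 0.5, 1.0):
    probs = temperature_softmax(scores, t)
    print(t, [round(p, 3) for p in probs], sample_index(probs))
```

At a temperature of 0.1 almost all probability mass falls on the highest-scoring character, so the output tends to be repetitive and conservative. At 1.0 the model's own distribution is sampled as is, which gives more varied and more error-prone text.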
Training corpus | Epoch | Loss | Temperature |
---|---|---|---|
Hölderlin | 43.75 | 1.3026 | 0.1 |
Hölderlin | 43.75 | 1.3026 | 0.2 |
Hölderlin | 43.75 | 1.3026 | 0.3 |
Hölderlin | 43.75 | 1.3026 | 0.4 |
Hölderlin | 43.75 | 1.3026 | 0.5 |
Hölderlin | 43.75 | 1.3026 | 0.6 |
Hölderlin | 43.75 | 1.3026 | 0.7 |
Hölderlin | 43.75 | 1.3026 | 0.8 |
Hölderlin | 43.75 | 1.3026 | 0.9 |
Hölderlin | 43.75 | 1.3026 | 1.0 |
Hölderlin | 50.00 | 1.3049 | 0.1 |
Hölderlin | 50.00 | 1.3049 | 0.2 |
Hölderlin | 50.00 | 1.3049 | 0.3 |
Hölderlin | 50.00 | 1.3049 | 0.4 |
Hölderlin | 50.00 | 1.3049 | 0.5 |
Hölderlin | 50.00 | 1.3049 | 0.6 |
Hölderlin | 50.00 | 1.3049 | 0.7 |
Hölderlin | 50.00 | 1.3049 | 0.8 |
Hölderlin | 50.00 | 1.3049 | 0.9 |
Hölderlin | 50.00 | 1.3049 | 1.0 |
Fontane | 42.25 | 1.4736 | 0.1 |
Fontane | 42.25 | 1.4736 | 0.2 |
Fontane | 42.25 | 1.4736 | 0.3 |
Fontane | 42.25 | 1.4736 | 0.4 |
Fontane | 42.25 | 1.4736 | 0.5 |
Fontane | 42.25 | 1.4736 | 0.6 |
Fontane | 42.25 | 1.4736 | 0.7 |
Fontane | 42.25 | 1.4736 | 0.8 |
Fontane | 42.25 | 1.4736 | 0.9 |
Fontane | 42.25 | 1.4736 | 1.0 |
Fontane | 80.00 | 1.5189 | 0.1 |
Fontane | 80.00 | 1.5189 | 0.2 |
Fontane | 80.00 | 1.5189 | 0.3 |
Fontane | 80.00 | 1.5189 | 0.4 |
Fontane | 80.00 | 1.5189 | 0.5 |
Fontane | 80.00 | 1.5189 | 0.6 |
Fontane | 80.00 | 1.5189 | 0.7 |
Fontane | 80.00 | 1.5189 | 0.8 |
Fontane | 80.00 | 1.5189 | 0.9 |
Fontane | 80.00 | 1.5189 | 1.0 |
Celan | 46.30 | 1.5115 | 0.1 |
Celan | 46.30 | 1.5115 | 0.2 |
Celan | 46.30 | 1.5115 | 0.3 |
Celan | 46.30 | 1.5115 | 0.4 |
Celan | 46.30 | 1.5115 | 0.5 |
Celan | 46.30 | 1.5115 | 0.6 |
Celan | 46.30 | 1.5115 | 0.7 |
Celan | 46.30 | 1.5115 | 0.8 |
Celan | 46.30 | 1.5115 | 0.9 |
Celan | 46.30 | 1.5115 | 1.0 |
hexameter | 14.34 | 1.3988 | 0.1 |
hexameter | 14.34 | 1.3988 | 0.2 |
hexameter | 14.34 | 1.3988 | 0.3 |
hexameter | 14.34 | 1.3988 | 0.4 |
hexameter | 14.34 | 1.3988 | 0.5 |
hexameter | 14.34 | 1.3988 | 0.6 |
hexameter | 14.34 | 1.3988 | 0.7 |
hexameter | 14.34 | 1.3988 | 0.8 |
hexameter | 14.34 | 1.3988 | 0.9 |
hexameter | 14.34 | 1.3988 | 1.0 |
hexameter | 43.01 | 1.3479 | 0.1 |
hexameter | 43.01 | 1.3479 | 0.2 |
hexameter | 43.01 | 1.3479 | 0.3 |
hexameter | 43.01 | 1.3479 | 0.4 |
hexameter | 43.01 | 1.3479 | 0.5 |
hexameter | 43.01 | 1.3479 | 0.6 |
hexameter | 43.01 | 1.3479 | 0.7 |
hexameter | 43.01 | 1.3479 | 0.8 |
hexameter | 43.01 | 1.3479 | 0.9 |
hexameter | 43.01 | 1.3479 | 1.0 |
hexameter | 80.00 | 1.3702 | 0.1 |
hexameter | 80.00 | 1.3702 | 0.2 |
hexameter | 80.00 | 1.3702 | 0.3 |
hexameter | 80.00 | 1.3702 | 0.4 |
hexameter | 80.00 | 1.3702 | 0.5 |
hexameter | 80.00 | 1.3702 | 0.6 |
hexameter | 80.00 | 1.3702 | 0.7 |
hexameter | 80.00 | 1.3702 | 0.8 |
hexameter | 80.00 | 1.3702 | 0.9 |
hexameter | 80.00 | 1.3702 | 1.0 |
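The table above is the product of eight model checkpoints and ten temperature values. For readers who want to work with it programmatically, the following Python sketch rebuilds the same 80 rows from the values listed above. The dictionary keys (`corpus`, `epoch`, `loss`, `temperature`) and the function name `sample_grid` are assumptions made for this example and may not match the column names in the CSV metadata.

```python
from itertools import product

# (corpus, epoch, loss) for each checkpoint, copied from the table above.
CHECKPOINTS = [
    ("Hölderlin", 43.75, 1.3026),
    ("Hölderlin", 50.00, 1.3049),
    ("Fontane", 42.25, 1.4736),
    ("Fontane", 80.00, 1.5189),
    ("Celan", 46.30, 1.5115),
    ("hexameter", 14.34, 1.3988),
    ("hexameter", 43.01, 1.3479),
    ("hexameter", 80.00, 1.3702),
]

# Temperatures 0.1, 0.2, ..., 1.0.
TEMPERATURES = [i / 10 for i in range(1, 11)]


def sample_grid():
    """Return one record per generated sample (checkpoint x temperature)."""
    return [
        {"corpus": corpus, "epoch": epoch, "loss": loss, "temperature": temp}
        for (corpus, epoch, loss), temp in product(CHECKPOINTS, TEMPERATURES)
    ]


grid = sample_grid()
print(len(grid))  # 8 checkpoints x 10 temperatures
print(grid[0])
```

Filtering this list, for example by `corpus` or by `temperature`, gives a quick way to select a subset of samples for comparison.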
See also metadata in CSV format.
The Hölderlin texts were generated for the poet's 250th anniversary in 2020. See the following publications:
- Der digitale Superdichter. Vor 250 Jahren wurde Friedrich Hölderlin geboren. Heute kann Computertechnik neue Gedichte im Hölderlin-Sound generieren. Ein Werkstattbericht // Die Literarische Welt, 14 March 2020, p. 29.
- Neural reading. Insights from the analysis of poetry generated by artificial neural networks // Orbis Litterarum. 2020. Vol. 75. No. 5. P. 230–246. DOI: 10.1111/oli.12274
Models are published on Hugging Face:
- Celan doi: 10.57967/hf/2278
- Fontane doi: 10.57967/hf/2279
- Hexameter doi: 10.57967/hf/2281
- Hölderlin doi: 10.57967/hf/2280
If you find this repository useful, please cite it using its URL:
@misc{orekhovboris2020ggpt,
author = {Boris Orekhov},
title = {German Generated Poetic Texts},
howpublished = {\url{https://github.com/nevmenandr/german-generated-poetic-texts}},
year = {2022}
}