This repo contains code for my 2022 entry in National Novel Generation Month, a challenge to write a computer program that generates a novel of at least 50,000 words.
My idea is reverse indexing: training a language model to generate a book from its index. As training data, I used the index from Adam Smith's The Wealth of Nations, which I had already digitized for an earlier project. I used this to finetune a version of GPT-2-xl so that it can generate a page given that page's index headings.
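To illustrate the data preparation that reverse indexing implies, here is a minimal sketch. It is not the code in this repo. It assumes the index is available as a mapping from heading to page numbers, and the names `invert_index` and `make_training_example`, the toy entries, and the prompt layout are all hypothetical.

```python
from collections import defaultdict


def invert_index(index):
    """Turn {heading: [page numbers]} into {page number: [headings]}."""
    pages = defaultdict(list)
    for heading, page_numbers in index.items():
        for page in page_numbers:
            pages[page].append(heading)
    # Sort pages numerically and headings alphabetically for stable output.
    return {page: sorted(headings) for page, headings in sorted(pages.items())}


def make_training_example(headings, page_text):
    """Join a page's headings and its text into one training string.

    The "Index headings:" / "Page text:" layout is an assumption made
    for illustration, not the format used for the actual finetuning.
    """
    prompt = "; ".join(headings)
    return f"Index headings: {prompt}\nPage text: {page_text}"


# Toy index entries (illustrative only, not from the digitized index).
toy_index = {
    "Labour, division of": [1, 2],
    "Pin-making": [2],
    "Money, origin of": [3],
}

pages_to_headings = invert_index(toy_index)
example = make_training_example(pages_to_headings[2], "(text of page 2)")
```

At generation time, the same prompt layout would be filled in with the headings for a page and the model asked to continue after "Page text:".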
As a first experiment after I finetuned the model, I tried getting it to reconstruct Smith's book from the index; you can see the results in the file "the-wealth-of-nations-generated.txt".
For my actual entry, I generated a novel based on this index, which I wrote myself. I used PromptArray, a prompting language I developed, to induce the generator to write less like Adam Smith and more like a novelist.
The index I wrote is based loosely on Joseph Campbell's account, in The Hero with a Thousand Faces, of the monomyth: a structure that (he claimed) is common to heroic narratives in many cultures. It's a somewhat worn-out idea, but I picked it for precisely that reason. Text generators are in some way chewing up and regurgitating the texts in their training data, so I wanted to generate a story that foregrounds its debt to old conventions for what a story should contain.
I also took inspiration from a passage in Michel Foucault's The Order of Things, in which he points out that some early-modern bestiaries mixed together facts about what we would now consider to be natural qualities of animals (size, color, shape) with facts about the values humans assign to them (role in mythology, heraldic meaning), drawing no fundamental distinction between the two.
I wanted to generate a text that, similarly, draws no clear line between narration of a story and critical commentary about that story. The entries I wrote for major characters therefore promise not just descriptions and actions, but also comments about the characters' resemblance to various cultural sources, criticisms of how the characters are depicted, and accusations of plagiarism. I am aiming not so much at self-reflexive metafiction as a style that declines to draw any line between fiction and metafiction.
I generated two different versions: version 1; version 2. The first is somewhat more coherent, but the second hews closer to the index and has a cutup quality that is enjoyable in its own way. I consider the second to be my NaNoGenMo entry.
If you want to run this code, you will need to download a copy of PromptArray into this directory.