The server exposes a RESTful API that accepts GET and PUT requests.
/sentences
shows all sentences used in all stories (GET)
{"data": [{"sentence_id": 1, "sentence": "December in Londen, een koude mist."}, {"sentence_id": 2, "sentence": "Een wit waas om White-Rose, in de achterkamer een groot vuur."}]}
/sentence/<sentence_id>
shows the sentence with that id (GET)
{"data": [{"sentence_id": 2, "sentence": "Een wit waas om White-Rose, in de achterkamer een groot vuur."}]}
/addsentence/<story_id>
adds a sentence to a story (PUT), e.g.
curl -X PUT http://localhost:5003/addsentence/3 -d data="Het sneeuwde met groote vlokken."
{"status": "sentence added"}
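The same PUT request can be built from Python. This is only a sketch using the standard library: the host and port are copied from the curl example above, and the helper names `build_add_sentence_request` and `add_sentence` are made up for illustration and are not part of the server code.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: host and port as in the curl example; adjust to your setup.
BASE_URL = "http://localhost:5003"


def build_add_sentence_request(story_id, sentence, base_url=BASE_URL):
    """Build (but do not send) the PUT request for /addsentence/<story_id>."""
    # Same form field as `-d data="..."` in the curl call, but URL-encoded.
    body = urlencode({"data": sentence}).encode("utf-8")
    return Request(f"{base_url}/addsentence/{story_id}", data=body, method="PUT")


def add_sentence(story_id, sentence, base_url=BASE_URL):
    """Send the request and return the raw response body as text."""
    request = build_add_sentence_request(story_id, sentence, base_url)
    with urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `add_sentence(3, "Het sneeuwde met groote vlokken.")` requires a running server; `build_add_sentence_request` can be inspected without one.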
/stories
shows all stories with their sentences and their positions (GET)
{"data": [{"story_id": 1, "sentence_id": 6, "position": 1}, {"story_id": 1, "sentence_id": 7, "position": 2}, {"story_id": 2, "sentence_id": 3, "position": 1}, {"story_id": 1, "sentence_id": 8, "position": 3}, {"story_id": 2, "sentence_id": 5, "position": 3}, {"story_id": 2, "sentence_id": 4, "position": 2}, {"story_id": 3, "sentence_id": 11, "position": 1}, {"story_id": 17, "sentence_id": 14, "position": 1}]}
/story/<story_id>
shows the story with its sentences and positions (GET)
{"data": [{"story_id": 2, "sentence_id": 3, "position": 1}, {"story_id": 2, "sentence_id": 5, "position": 3}, {"story_id": 2, "sentence_id": 4, "position": 2}]}
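As the example response shows, the rows are not necessarily returned in reading order, so a client has to sort on "position" itself. A minimal sketch, using the response above as sample data (the function name `sentence_ids_in_order` is hypothetical):

```python
import json

# Sample data: the /story/2 response shown above.
story_response = json.loads(
    '{"data": [{"story_id": 2, "sentence_id": 3, "position": 1},'
    ' {"story_id": 2, "sentence_id": 5, "position": 3},'
    ' {"story_id": 2, "sentence_id": 4, "position": 2}]}'
)


def sentence_ids_in_order(response):
    """Return the sentence ids of a story, sorted by their position."""
    rows = sorted(response["data"], key=lambda row: row["position"])
    return [row["sentence_id"] for row in rows]


print(sentence_ids_in_order(story_response))  # [3, 4, 5]
```

The resulting ids can then be looked up one by one via /sentence/<sentence_id> to get the story text.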
/createstory/<author_id>
creates a new (empty) story for an author_id (GET)
/authors
shows a list of all available authors (GET)
{"data": [{"author_id": 1, "name": "Louis Couperus"}, {"author_id": 2, "name": "Annie M.G. Schmidt"}]}
/votes
shows all votes that have been cast (GET)
{"data": [{"sentence_id": 12, "vote_id": 4}, {"sentence_id": 13, "vote_id": 5}, {"sentence_id": 14, "vote_id": 6}]}
/vote/<suggestion_id>
casts a vote for a suggestion, which turns the suggestion into a sentence and adds it to the story (PUT)
curl -X PUT http://localhost:3041/vote/65 -d data=""
{
"vote_id": 65
}
/vote/<suggestion_id>
shows the vote info
/suggestion/<story_id>
gets a list of suggestions for how to continue the story (GET)
[
{
"sentence": "Stt ...",
"suggestion_id": 20
},
{
"sentence": "Oneindig verstand , zegt hij in een heftige sc\u00e8ne van opstand uitbarstte , en vloekte ...",
"suggestion_id": 21
},
{
"sentence": "Diego een bizondere pozitie hebben bekleed : edelman en bemiddeld kreeg hij er zelfs een formeele strijd tusschen hem en haar zelf , en 's nachts , 's morgens .",
"suggestion_id": 22
},
{
"sentence": "Vier-en-twintig lictoren , de omkranste ro\u00eabundels-en-bijlen torsend , omstuwen de statie-kar , aan beide zijden verdeelde en tegen de regenten , en bijgevolg in voortdurende wording en herwording is .",
"suggestion_id": 23
},
{
"sentence": "W\u00ecl u nu wat eten ....",
"suggestion_id": 24
},
{
"sentence": "Blauw Paradijs , wat waren het mooie , het edele , opofferende van hare lichtzinnige daad .",
"suggestion_id": 25
}
]
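A client would typically show these suggestions to the user and then vote for the chosen one. The sketch below builds the matching PUT request for /vote/<suggestion_id>. The port is taken from the curl example for /vote, the two suggestions are copied from the list above, and `build_vote_request` is a hypothetical helper, not part of the server.

```python
from urllib.request import Request

# Assumption: host and port as in the curl example for /vote.
BASE_URL = "http://localhost:3041"

# Two entries from the example response above.
suggestions = [
    {"sentence": "Stt ...", "suggestion_id": 20},
    {"sentence": "W\u00ecl u nu wat eten ....", "suggestion_id": 24},
]


def build_vote_request(suggestion_id, base_url=BASE_URL):
    """Build (but do not send) the PUT request that votes for a suggestion."""
    # curl sends an empty form field with `-d data=""`; do the same here.
    return Request(f"{base_url}/vote/{suggestion_id}", data=b"data=", method="PUT")


chosen = suggestions[1]  # e.g. the user picked the second suggestion
request = build_vote_request(chosen["suggestion_id"])
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` sends it to a running server.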
Script for generating text in the style of an oeuvre, which is passed as an argument (in plain text).
Steps that the script takes:
A. Train on text:
- split in sentences (use list of abbreviations used in corpus)
- save sentence-initial and sentence-final words separately
- split the text into words, save all sequences of 2 words + 1 following word (called the bigram model here, because the context is two words), and of 1 word + 1 following word (called the unigram model here)
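The training steps under A can be sketched roughly as follows. This is an illustration under assumptions, not the script itself: the abbreviation list, the whitespace tokenisation and all function names are made up for the example.

```python
from collections import defaultdict

# Hypothetical abbreviation list; the real one depends on the corpus.
ABBREVIATIONS = {"dr.", "mr.", "bijv."}


def split_sentences(text, abbreviations=ABBREVIATIONS):
    """Split text into sentences (lists of words), skipping abbreviations."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        if token[-1] in ".!?" and token.lower() not in abbreviations:
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    return sentences


def train(text):
    """Collect sentence starts/ends and the two context dictionaries."""
    sentences = split_sentences(text)
    starts = [sentence[0] for sentence in sentences]
    ends = [sentence[-1] for sentence in sentences]
    two_word = defaultdict(list)  # (word1, word2) -> following words
    one_word = defaultdict(list)  # word1 -> following words
    words = text.split()
    for i in range(len(words) - 1):
        one_word[words[i]].append(words[i + 1])
        if i + 2 < len(words):
            two_word[(words[i], words[i + 1])].append(words[i + 2])
    return starts, ends, two_word, one_word
```

Followers are stored as lists with repeats, so that drawing a random element later favours frequent continuations.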
B. Generate new text:
- each new paragraph starts with a word that occurs as begin-of-sentence in the training text
- start generating sentences until the minimum paragraph length has been reached
- the start of a sentence is a random word based on the last two words of the previous sentence.
- words are generated randomly using the previous two words in the sentence (bigram model). If those do not exist in the style dictionary, use only the previous word (unigram model).
- words are generated until a sentence ending is encountered.
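The generation steps under B, including the fallback from a two-word to a one-word context, could look like the sketch below. It assumes that paragraph length is counted in words and uses a tiny hand-written model. The function names and the `max_words` safety limit are illustrative and not taken from the script.

```python
import random


def next_word(two_word, one_word, previous, rng):
    """Pick a follower of the last two words; fall back to the last word."""
    candidates = two_word.get(tuple(previous[-2:])) or one_word.get(previous[-1])
    return rng.choice(candidates) if candidates else None


def generate_paragraph(starts, ends, two_word, one_word, min_length, rng,
                       max_words=1000):
    """Generate words until a sentence ends at or past min_length words."""
    words = [rng.choice(starts)]  # paragraphs begin with a sentence-initial word
    while len(words) < max_words:
        word = next_word(two_word, one_word, words, rng)
        if word is None:  # no known continuation: stop early
            break
        words.append(word)
        if word in ends and len(words) >= min_length:
            break
    return " ".join(words)


# Tiny hand-written model, for illustration only.
starts, ends = ["Het"], ["vlokken."]
one_word = {"Het": ["sneeuwde"], "sneeuwde": ["met"], "met": ["groote"],
            "groote": ["vlokken."]}
two_word = {("Het", "sneeuwde"): ["met"], ("sneeuwde", "met"): ["groote"],
            ("met", "groote"): ["vlokken."]}

print(generate_paragraph(starts, ends, two_word, one_word, 3, random.Random(0)))
```

With this toy model every context has exactly one continuation, so the output is always "Het sneeuwde met groote vlokken."; with a model trained on a real corpus each run yields different text.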
The minimum paragraph length and the number of paragraphs are set inside the script (TODO: turn them into command-line arguments).
The output is written to inputfile.random[0-9]+
Pre-training models:
python generate_features.py /vol/bigdata2/datasets2/WritersintheCloud/kellendonktxt/kellendonk.feats.json /vol/customopt/lamachine/share/frog/nld/frog.cfg /vol/bigdata2/datasets2/WritersintheCloud/kellendonktxt/Verzameld\ werk\ -\ Frans\ Kellendonk.txt
python train_lm.py /vol/bigdata2/datasets2/WritersintheCloud/kellendonktxt/kellendonk.feats.json /vol/bigdata2/datasets2/WritersintheCloud/kellendonktxt/kellendonk.lm.json
Online text generation: