The works of Shakespeare in a structured form that's easy to develop with. I'm surprised I haven't been able to find a source like this elsewhere; I hope I'm wrong and there's something really good out there.
Raw data is in raw, slightly cleaner data is in csv, and processed output is in json and html. If you need help using this, let me know; the code is a hacky mess but the output is alright :)
- Credit to Open Source Shakespeare, where the source data is available as a SQL dump
- The files in csv have been derived directly from this, with very minimal cleanup (quotes and apostrophes are fixed). A few of the original tables which were site-specific have been excluded
- The files in json are separated into a file per 'work' and split into chapters and paragraphs
- The files in html are generated using the JSON files
Each JSON file has a structure like:
{
  "//1": "Open Source Shakespeare work identifier, and the name of the JSON file",
  "id": "12night",
  "title": "Twelfth Night",
  "longTitle": "Twelfth Night, Or What You Will",
  "date": "1599",
  "genre": "Comedy",
  "wordCount": "19837",
  "paragraphCount": "1031",
  "chapters": [
    {
      "sectionNumber": 1,
      "chapterNumber": 1,
      "//2": "description can be null",
      "description": "DUKE ORSINO's palace.",
      "paragraphs": [
        {
          "number": 1,
          "lines": [
            "[Enter DUKE ORSINO, CURIO, and other Lords; Musicians attending]"
          ],
          "//3": "type can be 'STAGE_DIRECTION' or 'DIALOGUE'. speaker will be null for the former",
          "type": "STAGE_DIRECTION",
          "speaker": null
        },
        {
          "number": 2,
          "lines": [
            "If music be the food of love, play on;",
            "Give me excess of it, that, surfeiting,",
            "The appetite may sicken, and so die.",
            "That strain again! it had a dying fall:",
            "O, it came o'er my ear like the sweet sound,",
            "That breathes upon a bank of violets,",
            "Stealing and giving odour! Enough; no more:",
            "'Tis not so sweet now as it was before.",
            "O spirit of love! how quick and fresh art thou,",
            "That, notwithstanding thy capacity",
            "Receiveth as the sea, nought enters there,",
            "Of what validity and pitch soe'er,",
            "But falls into abatement and low price,",
            "Even in a minute: so full of shapes is fancy",
            "That it alone is high fantastical."
          ],
          "type": "DIALOGUE",
          "speaker": {
            "name": "Orsino",
            "abbreviatedName": "DUKE ORSINO",
            "appearsInWorks": [
              "12night"
            ],
            "description": "Duke of Illyria"
          }
        }
      ]
    }
  ]
}
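As a rough illustration of consuming this structure, here is a small sketch that counts dialogue lines per speaker. It is in Python purely as an example (the repo's own tooling is Node.js). The function name count_lines_by_speaker and the inline sample_work are hypothetical: the sample is a trimmed, hand-written copy of the structure above, not a real file. For a real work you would load one of the files in json with json.load instead.

```python
from collections import Counter

# Trimmed, hand-written sample following the structure shown above.
# Assumption: a real work would be read from the json directory with json.load().
sample_work = {
    "id": "12night",
    "title": "Twelfth Night",
    "chapters": [
        {
            "sectionNumber": 1,
            "chapterNumber": 1,
            "description": "DUKE ORSINO's palace.",
            "paragraphs": [
                {
                    "number": 1,
                    "lines": [
                        "[Enter DUKE ORSINO, CURIO, and other Lords; Musicians attending]"
                    ],
                    "type": "STAGE_DIRECTION",
                    "speaker": None,
                },
                {
                    "number": 2,
                    "lines": [
                        "If music be the food of love, play on;",
                        "Give me excess of it, that, surfeiting,",
                        "The appetite may sicken, and so die.",
                    ],
                    "type": "DIALOGUE",
                    "speaker": {"name": "Orsino", "abbreviatedName": "DUKE ORSINO"},
                },
            ],
        }
    ],
}


def count_lines_by_speaker(work):
    """Return a Counter mapping each speaker's name to their number of dialogue lines."""
    counts = Counter()
    for chapter in work["chapters"]:
        for paragraph in chapter["paragraphs"]:
            speaker = paragraph.get("speaker")
            # Stage directions carry a null speaker, so they are skipped here.
            if paragraph["type"] != "DIALOGUE" or speaker is None:
                continue
            counts[speaker["name"]] += len(paragraph["lines"])
    return counts


if __name__ == "__main__":
    for name, total in count_lines_by_speaker(sample_work).most_common():
        print(f"{name}: {total}")
```

Checking both the type and the speaker is deliberately defensive, since the two fields are documented as moving together but a consumer need not rely on that.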
Hopefully you won't need to use these instructions 🤞
Install Docker and run this to generate the CSVs:
rm csv/*.csv
docker run --name shakespeare_db -e MYSQL_USER=root -e MYSQL_ROOT_PASSWORD=root -e MYSQL_DATABASE=shakespeare -v "$(pwd)":/project -d mysql --secure-file-priv /project/csv
docker exec -w /project shakespeare_db sh -c './scripts/generate_csvs.sh'
This will output them all in ./csv
First generate the CSVs as above, then install Node.js and run this to generate the JSON:
npm run generate-json
Then run this to generate the HTML:
npm run generate-html
- Thanks to Open Source Shakespeare for providing the data. See: https://opensourceshakespeare.org/downloads/