video-game-text-dataset
In-game text such as notes, letters, codex entries, and audio recordings, collected into JSON format.
Datasets
The data used to create these datasets was collected from a variety of sources (wikis, manual transcription, and in-game files containing the text) for LibraryofCodexes. There are bound to be some mistakes, but I've tried to sanitize the text as best I can.
All datasets are in JSON format.
Please refer to the individual series folder for more information regarding each series.
- Assassin's Creed
- Baldur's Gate
- Battlefield
- Crysis
- Dead Space
- Destiny
- Deus Ex
- Diablo
- Dishonored
- Doom
- Dragon Age
- Dying Light
- Fable
- Fallout
- Gears of War
- Horizon Zero Dawn
- Kingdoms of Amalur
- Mass Effect
- Metroid Prime
- Middle-Earth
- Nier
- Red Dead Redemption
- Resident Evil
- Star Wars: The Old Republic
- System Shock
- The Division
- The Elder Scrolls
- The Last of Us
- The Witcher
- Tomb Raider
- Watch Dogs
- World of Warcraft
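Since every dataset is plain JSON, one series can be loaded with the Python standard library alone. The sketch below is a minimal example and not part of the repository: `load_series` is a hypothetical helper, and it assumes each series folder contains one or more `.json` files, possibly in subfolders. It makes no assumption about the structure inside each file, so check the individual series folder for the actual layout.

```python
import json
from pathlib import Path


def load_series(repo_root, series_folder):
    """Parse every .json file found under one series folder.

    Returns a dict mapping each file's path (relative to the series
    folder) to its parsed JSON content. No schema is assumed here.
    """
    series_dir = Path(repo_root) / series_folder
    datasets = {}
    # rglob also descends into subfolders (assumed layout, e.g. one per game).
    for json_path in sorted(series_dir.rglob("*.json")):
        with json_path.open(encoding="utf-8") as handle:
            key = json_path.relative_to(series_dir).as_posix()
            datasets[key] = json.load(handle)
    return datasets
```

For example, `load_series(".", "Mass Effect")` would return the parsed files for that series, assuming the repository is cloned into the current directory and the folder is named exactly as in the list above.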
Scientific paper
This repository does not currently have a paper written for it. If you use the data, please use the 'Cite this repository' option in the About section.
Games
The datasets were extracted from the following commercial video games. The games and their assets are copyrighted by their respective publishers and developers. If you use the datasets, don't forget to cite the games.
@misc{game:starwarsknightsoftheoldrepublic,
title = {\emph{Star Wars: Knights of the Old Republic}},
year = {2003},
organization = {LucasArts},
publisher = {LucasArts},
author = {{BioWare}},
howpublished = {Game [PC]},
note = {LucasArts, San Francisco, US},
}
@misc{gamesseries:tes,
title = {\emph{The Elder Scrolls I-V} and \emph{The Elder Scrolls Online}},
date = {1994/2014},
year = {1994--2014},
organization = {Bethesda Softworks},
publisher = {Bethesda Softworks},
author = {{Bethesda Softworks}},
howpublished = {Game series [PC]},
note = {Bethesda Softworks, Rockville, Maryland, US},
}