lwm_GIR19_resolving_places

Repository for code underlying the paper 'Resolving Places, Past and Present: Toponym Resolution in Historical British Newspapers Using Multiple Resources'.

Primary language: Jupyter Notebook. License: Creative Commons Attribution 4.0 International (CC-BY-4.0).


GIR19 paper

This repository provides underlying code for the paper 'Resolving Places, Past and Present: Toponym Resolution in Historical British Newspapers Using Multiple Resources'.

Citation

If you use or adapt this code in your paper, please cite:

Mariona Coll Ardanuy, Katherine McDonough, Amrey Krause, Daniel CS Wilson, Kasra Hosseini, Daniel van Strien, Resolving Places, Past and Present: Toponym Resolution in Historical British Newspapers Using Multiple Resources. In Proceedings of the 13th Workshop on Geographic Information Retrieval (GIR'19), Forthcoming.

What is this?

Resolving Places is one of the first outputs of Living with Machines, a collaborative digital history project at The Alan Turing Institute and the British Library. This research is part of our work to build a nineteenth-century gazetteer that combines place names derived from historical sources (GB1900) with online resources (Wikipedia and GeoNames). GB1900 is the result of a crowdsourced project that transcribed all text labels on the 2nd edition 6-inch to 1 mile Ordnance Survey maps of Great Britain (ca. 1900) held by the National Library of Scotland (NLS Maps online).

The Living with Machines gazetteer follows best practices in combining multiple existing resources, and is novel in accounting for places at different scales (e.g. streets, buildings, cities, counties). In the future, we will add new records and enrich existing ones with label data from 1st edition OS maps and other sources.

High-resolution figures

Download high-resolution versions of the figures in our GIR19 paper:

Figure 1 (3 MB)

Figure 2 (12.1 MB)

Figure 3 (7 MB)

Figure 4 (2.6 MB)

Creating your own wikiGazetteer and reproducing the analysis

To set up your own version of wikiGazetteer and reproduce the analysis in the paper, please:

  • install the required packages via Anaconda (see below for instructions)

Install the required packages

  1. Install Anaconda following these instructions.

  2. Create the gir19 environment:

conda env create -f environment.yml

  3. Activate the gir19 environment:

source activate gir19

     On conda 4.4 or later, the equivalent command is: conda activate gir19
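Once the environment is activated, a quick check from Python can confirm that you are in the right environment and that the MySQL driver is visible. The following is a minimal sketch, not part of the repository: the function names `is_importable` and `check_environment` are hypothetical, and it assumes conda has set the `CONDA_DEFAULT_ENV` variable for the active environment.

```python
import importlib.util
import os


def is_importable(module_name: str) -> bool:
    """Return True if `module_name` can be located in the current environment."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised for dotted names when the parent package is missing.
        return False


def check_environment(expected_env: str = "gir19") -> dict:
    """Report the active conda environment and whether the MySQL driver is found."""
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    active_env = os.environ.get("CONDA_DEFAULT_ENV")
    return {
        "active_env": active_env,
        "env_matches": active_env == expected_env,
        # mysql-connector-python is imported as `mysql.connector`.
        "mysql_connector": is_importable("mysql.connector"),
    }


print(check_environment())
```

If `env_matches` or `mysql_connector` is reported as `False`, the environment was probably not activated in the current shell, or its creation did not complete.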

Install MySQL

Creating your own wikiGazetteer

The full instructions outline the steps to create a wikiGazetteer yourself. You will need MySQL installed:

  • The steps for installing MySQL will vary by platform. A good starting place is the MySQL documentation.
  • The GIR code makes use of mysql-connector-python to connect to the MySQL server. This should have been installed in the environment you created above.
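To confirm that the MySQL server is reachable before building the gazetteer, a connection test along the following lines may help. This is an illustrative sketch rather than code from the repository: the helper names and the placeholder host, user and password are assumptions, and the actual connection call is left commented out because it needs a running server.

```python
from typing import Optional


def build_connection_args(host: str, user: str, password: str,
                          database: Optional[str] = None) -> dict:
    """Collect keyword arguments for mysql.connector.connect()."""
    args = {"host": host, "user": user, "password": password}
    if database is not None:
        args["database"] = database
    return args


def mysql_server_version(connection_args: dict) -> str:
    """Open a connection, ask the server for its version, then close it."""
    # Imported here so that build_connection_args() works without the driver.
    import mysql.connector  # provided by mysql-connector-python

    conn = mysql.connector.connect(**connection_args)
    try:
        cursor = conn.cursor()
        cursor.execute("SELECT VERSION()")
        (version,) = cursor.fetchone()
        cursor.close()
        return version
    finally:
        conn.close()


# Placeholder values: replace them with the details of your own server.
example_args = build_connection_args("localhost", "your_user", "your_password")
# print(mysql_server_version(example_args))  # requires a running MySQL server
```

If the commented call raises an error, the server is either not running or the connection details do not match your MySQL setup.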

MySQL Authentication

  • The code in this repo uses a default username and password (see here and here) for connecting to MySQL. You will need to change these if you set up your MySQL server with a different username or password.
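One way to avoid editing credentials in several places is to read them from environment variables and fall back to defaults. The sketch below shows the idea under stated assumptions: the variable names `GIR19_MYSQL_USER` and `GIR19_MYSQL_PASSWORD` and the function `resolve_mysql_credentials` are hypothetical and do not exist in the repository, and the default values shown are placeholders, not the repository's actual defaults.

```python
import os
from typing import Mapping, Optional


def resolve_mysql_credentials(default_user: str, default_password: str,
                              environ: Optional[Mapping[str, str]] = None) -> dict:
    """Prefer credentials from environment variables, else use the defaults."""
    env = os.environ if environ is None else environ
    return {
        # Hypothetical variable names, chosen for this example only.
        "user": env.get("GIR19_MYSQL_USER", default_user),
        "password": env.get("GIR19_MYSQL_PASSWORD", default_password),
    }


# Placeholder defaults; an override is passed in explicitly for illustration.
creds = resolve_mysql_credentials(
    "default_user", "default_password",
    environ={"GIR19_MYSQL_PASSWORD": "my_local_password"},
)
print(creds["user"])
```

The resulting dictionary could then be passed on to `mysql.connector.connect(**creds)` together with the host of your server.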

Future work and contributing

The authors of the paper plan to continue developing the code and extending the gazetteer. We welcome pull requests for improvements and issues for any errors you encounter.

Get in touch

You can reach us by email:

  • Mariona Coll Ardanuy, mcollardanuy[at]turing.ac.uk
  • Katherine McDonough, kmcdonough[at]turing.ac.uk
  • Amrey Krause, akrause[at]turing.ac.uk
  • Daniel CS Wilson, dwilson[at]turing.ac.uk
  • Kasra Hosseini, khossienizad[at]turing.ac.uk
  • Daniel van Strien, Daniel.Van-Strien[at]bl.uk

Acknowledgements

This work is part of the Living with Machines project. Living with Machines is a multidisciplinary programme funded by the Strategic Priority Fund which is led by UK Research and Innovation (UKRI) and delivered by the Arts and Humanities Research Council (AHRC). Newspaper data was kindly shared by Findmypast. We thank Chris Fleet and the National Library of Scotland for sharing digital map images and metadata for OS collections as well as context about GB1900. Thank you also to Humphrey Southall, Paula Aucott, and Richard Light for discussing the future of GB1900.

License

Shield: CC BY 4.0

This work is licensed under a Creative Commons Attribution 4.0 International License.
