Living with Machines
A radical collaboration between computational linguists, curators, data scientists, software engineers, geographers and historians
United Kingdom
Pinned Repositories
alto2txt
Convert ALTO XML to plain text + minimal metadata
D3_JS_viz_in_a_Python_Jupyter_notebook
Tutorial code showing how to put a D3 JavaScript visualisation in a Python Jupyter notebook.
deduplify
A Python tool to search for and remove duplicated files in messy datasets
DeezyMatch
A Flexible Deep Learning Approach to Fuzzy String Matching
dhoxss-text2tech
Materials for the Text to Tech workshop at the Digital Humanities Oxford Summer School
DiachronicEmb-BigHistData
Tools to train and explore diachronic word embeddings from Big Historical Data
genre-classification
Jupyter book showing how to build an ML-powered book genre classifier
histLM
Neural Language Models for Historical Research
lwm_ARTIDIGH_2020_OCR_impact_downstream_NLP_tasks
Repository for code underlying the paper 'Assessing the Impact of OCR Quality on Downstream NLP Tasks'
nnanno
nnanno is a collection of tools that sample, annotate and apply computer vision to the Newspaper Navigator dataset
Living with Machines's Repositories
Living-with-machines/DeezyMatch
A Flexible Deep Learning Approach to Fuzzy String Matching
Living-with-machines/histLM
Neural Language Models for Historical Research
Living-with-machines/DiachronicEmb-BigHistData
Tools to train and explore diachronic word embeddings from Big Historical Data
Living-with-machines/nnanno
nnanno is a collection of tools that sample, annotate and apply computer vision to the Newspaper Navigator dataset
Living-with-machines/deduplify
A Python tool to search for and remove duplicated files in messy datasets
Living-with-machines/alto2txt
Convert ALTO XML to plain text + minimal metadata
Living-with-machines/dhoxss-text2tech
Materials for the Text to Tech workshop at the Digital Humanities Oxford Summer School
Living-with-machines/genre-classification
Jupyter book showing how to build an ML-powered book genre classifier
Living-with-machines/lwm_ARTIDIGH_2020_OCR_impact_downstream_NLP_tasks
Repository for code underlying the paper 'Assessing the Impact of OCR Quality on Downstream NLP Tasks'
Living-with-machines/Computer-Vision-for-the-Humanities-workshop
Computer Vision for the Humanities workshop
Living-with-machines/PressPicker
An interactive visualisation tool for picking newspaper titles
Living-with-machines/T-Res
A Toponym Resolution Pipeline for Digitised Historical Newspapers
Living-with-machines/DeezyMatch_tutorials
Collection of tutorials for DeezyMatch (https://github.com/Living-with-machines/DeezyMatch)
Living-with-machines/station-to-station
This repository provides underlying code and materials for the paper 'Station to Station: Linking and Enriching Historical British Railway Data'.
Living-with-machines/label-studio-converter
Create ready-to-use Label Studio pre-populated JSON files from popular OCR formats.
Living-with-machines/AtypicalAnimacy
Repository for code underlying the paper 'Living Machines: A Study of Atypical Animacy' (COLING 2020)
Living-with-machines/LwM_SIGSPATIAL2020_ToponymMatching
Repository for code underlying the paper 'A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching'.
Living-with-machines/lwmdb
A Django-based library for managing the Living with Machines newspapers metadata database schema
Living-with-machines/CensusGeocoder
Geocode Historic Great British Census Data 1851-1911
Living-with-machines/ERWT
Living-with-machines/TargetedSenseDisambiguation
Repository for the work on Targeted Sense Disambiguation
Living-with-machines/accidents-interactive
This is the “accidents interactive” for the Living with Machines exhibit at Leeds City Museum 2022–23.
Living-with-machines/alto2txt2fixture
Converts metadata from alto2txt into JSON data with corresponding relational IDs for ingestion into a relational database
Living-with-machines/dated-translator
A Python package that helps translate one term into another, depending on a given date, using a CSV file that contains verified information.
Living-with-machines/gh_orgstats
GitHub stats for GitHub Organizations
Living-with-machines/jisc-wrangler
Tool for restructuring data in the JISC 19th Century British Library Newspaper collection
Living-with-machines/machines-interactive
This is the “machines interactive” for the Living with Machines exhibit at Leeds City Museum 2022–23.
Living-with-machines/newspapers
Public repository for open access material relating to historical newspapers
Living-with-machines/PressDirectories
Code and Data for Mitchell's Newspaper Press Directories
Living-with-machines/zoonyper
Code to make it easy to import and process Zooniverse annotations and their metadata in Python/Jupyter Notebooks