historical-newspapers
There are 14 repositories under the historical-newspapers topic.
lorellav/GeoNewsMiner
The GeoNewsMiner (GNM): An interactive spatial humanities tool to visualize geographical references in historical newspapers
Living-with-machines/alto2txt
Convert ALTO XML to plain text + minimal metadata
ieg-dhr/NLP-Course4Humanities_2024
This repository is part of an NLP course for humanities and cultural studies. The course uses historical newspapers as a source and applies NLP methods to them. NLP tasks: tokenization, lemmatization, TF-IDF, part-of-speech tagging, semantic search with transformers, article extraction and OCR post-correction with LLMs, NER, and text classification
Living-with-machines/T-Res
A Toponym Resolution Pipeline for Digitised Historical Newspapers
impresso/impresso-text-acquisition
🛠️ Python library to import OCR data in various formats into the canonical JSON format defined by the Impresso project.
Duke-Chronicle-Project/awesome-historical-newspaper-analysis
Awesome historical newspaper analysis tools and literature
OlivierBinette/TessTools
Tools for the use of Tesseract OCR in R
impresso/impresso-schemas
Repository of JSON schemas used in the Impresso project.
felixgiov/public-meeting
Dataset from the paper "Information Extraction from Public Meeting Articles"
impresso/CLEF-HIPE-2020-eval
Everything to reproduce the CLEF HIPE 2020 campaign results.
michaelkinfu/hknews-headline-analysis
The Hongkong News headline analysis project was conducted by the Chinese University of Hong Kong Library.
impresso/impresso-passim
This repository contains code and sample data related to running the impresso corpus through the text reuse detection software passim.
Living-with-machines/VisualisingPressDirectories
Source code for the cleaning pipeline and web app pairing the Press Directories dataset with general election results.