History Lab
We turn documents into data and develop tools to explore history.
Columbia University in the City of New York
Pinned Repositories
archiving-digital-records
Course materials for the Summer '23 Archiving Digital Records workshop
cabinet
foiarchive-api
REST API for Freedom of Information Archive (FOIArchive)
foiarchive-postgres
Scripts, configuration and examples for the PostgREST proof of concept
mosaic-llm-db
Describes access to History Lab data for the Mosaic LLM project
pdf2mbox
A command-line utility and Python package for converting PDF emails to MBOX format
piir-eval
Framework for PII redaction evaluation
state_names
Scraper for State Department consular names and positions
topic-model-updates
Scripts for updating corpus-specific topic models in the FOIArchive database.
xmpdf
A Python module for extracting emails from a PDF.
History Lab's Repositories
history-lab/pdf2mbox
A command-line utility and Python package for converting PDF emails to MBOX format
history-lab/archiving-digital-records
Course materials for the Summer '23 Archiving Digital Records workshop
history-lab/foiarchive-api
REST API for Freedom of Information Archive (FOIArchive)
history-lab/xmpdf
A Python module for extracting emails from a PDF.
history-lab/cabinet
history-lab/foiarchive-postgres
Scripts, configuration and examples for the PostgREST proof of concept
history-lab/historylab-parser
history-lab/piir-eval
Framework for PII redaction evaluation
history-lab/history-lab-bookworm
history-lab/mosaic-llm-db
Describes access to History Lab data for the Mosaic LLM project
history-lab/state_names
Scraper for State Department consular names and positions
history-lab/topic-model-updates
Scripts for updating corpus-specific topic models in the FOIArchive database.
history-lab/topic_models_using_rosetta
history-lab/un-archives-db-schema
Database schema objects for UN Archives metadata and text
history-lab/covid19-muckrock
Scripts for preprocessing and loading metadata and text for the History Lab-Muckrock COVID-19 Collection
history-lab/foiarchive-api-python-example
Example of querying the FOIArchive REST API via a Python program
history-lab/foiarchive2csv
SQL scripts for dumping FOIArchive data to CSV
history-lab/HL-Fall-23
Research project investigating OCR evaluation mechanisms at Columbia's History Lab.
history-lab/HL-Spring-24