Pinned Repositories
twarc
A command line tool (and Python library) for archiving Twitter JSON
anon
Tweet about anonymous Wikipedia edits from particular IP address ranges
etudier
Extract a citation network from Google Scholar
feediverse
Send RSS/Atom feeds to Mastodon
id
LCSH SKOS webapp
microdata
Python library for extracting HTML microdata
pymarc
Process MARC records from Python
wikichanges
A Node.js library for monitoring changes on Wikipedia sites
wikistream
Displays edit activity on Wikipedia
edsu's Repositories
edsu/feediverse
Send RSS/Atom feeds to Mastodon
edsu/memento-cli
A command line utility for listing and searching snapshots in web archives
edsu/browsertricky
A helper to run browsertrix-crawler locally
edsu/notebooks
Some random Jupyter notebooks.
edsu/diary
Silly GPT-n experiment
edsu/aotycount
Count albums in AOTY list of lists
edsu/idloc
Get JSON-LD for a Library of Congress name or subject authority.
edsu/bin
Some small command line things I use
edsu/foiaonline
edsu/inkdroid.org
My website
edsu/pywb
Core Python Web Archiving Toolkit for replay and recording of web archives
edsu/WarcDB
WarcDB: Web crawl data as SQLite databases.
edsu/airwaves
Unlocking the Airwaves
edsu/bangs
Repository of bangs used by Kagi Search
edsu/browsertrix-behaviors
Automated behaviors that run in the browser to interact with complex sites. Used by ArchiveWeb.page and Browsertrix Crawler.
edsu/browsertrix-crawler
Run a high-fidelity browser-based crawler in a single Docker container
edsu/dotfiles
My dotfiles
edsu/example-btrix-behavior
An example of trying to run Browsertrix Crawler with a custom behavior.
edsu/genuary
Some #genuary experiments
edsu/geoserver-publish
Simple client for publishing Shapefiles and GeoTIFFs to GeoServer.
edsu/lc-sdf-data-exploration
edsu/leaflet-geoserver-example
'nuff said
edsu/nvim
My nvim configuration
edsu/ocfl-extensions
OCFL Community Extensions
edsu/quarto-map
An example of a dynamic map in Quarto
edsu/scoop-witness-api
A simple REST API for witnessing the web using the Scoop web archiving capture engine.
edsu/sqlite-migrate
A simple database migration system for SQLite, based on sqlite-utils
edsu/warc-gpt
WARC + AI - Experimental Retrieval Augmented Generation Pipeline for Web Archive Collections.
edsu/wayback
A Python API to the Internet Archive Wayback Machine
edsu/whisper
Robust Speech Recognition via Large-Scale Weak Supervision