web-archives
There are 27 repositories under the web-archives topic.
webrecorder/pywb
Core Python Web Archiving Toolkit for replay and recording of web archives
webrecorder/warcio
Streaming WARC/ARC library for fast web archive IO
oldweb-today/oldweb-today
Browse emulated browsers connected to old web sites in your browser!
cocrawler/cdx_toolkit
A toolkit for CDX indices such as Common Crawl and the Internet Archive's Wayback Machine
N0taN3rd/node-warc
Parse and create Web ARChive (WARC) files with Node.js
zytedata/web-snap
Create "perfect" snapshots of web pages
archivesunleashed/notebooks
Various examples of notebooks for working with web archives with the Archives Unleashed Toolkit, and derivatives generated by the Archives Unleashed Toolkit.
lanl/Zotero-Robust-Links-Extension
Create Robust Links from within Zotero
oduwsdl/MementoEmbed
A service that provides archive-aware oEmbed-compatible embeddable surrogates (social cards, thumbnails, etc.) for archived web pages (mementos).
oduwsdl/raintale
A Python utility for publishing a social media story built from archived web pages to multiple services.
sebastian-nagel/warc-crawler
Process web archives (WARC format) with StormCrawler and index content into Elasticsearch or Solr
ukwa/ukwa-gsheets-utils
Add-On for Google Sheets to help those working with web archives.
hrbrmstr/cdx
🕸 Query Web Archive Crawl Indexes ('CDX')
caltechlibrary/eprints2archives
Send records from an EPrints server to the Internet Archive and other web archives
wsdookadr/warctools
WARC tools for joining files, finding and fetching missing resources, accessing metadata, converting to ZIM, and viewing web archives offline
ukwa/waybacks
This module builds our Waybacks in the various configurations we require.
bhouston1982/staticPages-webArchives
Python scripts to generate static navigation pages from collection list and insert Web Archives records using the Archive-It CDX
helgeho/Tempas2ArchiveSpark
ArchiveSpark DataSpec to analyze the Internet Archive's Web archive through temporal search results returned by Tempas (v2)
oduwsdl/offtopic-goldstandard-data
Data for testing the Offtopic detection software
shadowctrl/Palaceradio
PalaceRadio | A Next.js app built from a web archive | Freelance project @upwork
ukwa/ukwa-ui
A new user interface for the UK Web Archive
web-archive-group/wadl2017
WADL2017 Web Archive Group team papers
k12stemaker/k12stemaker.github.io
Ningbo Kaisiao Education Technology Co., Ltd. (宁波凯思奥教育科技有限公司)
nchylak/capstone-project
A collection of the scripts and notebooks I wrote as part of my Data Science Bootcamp capstone project
N0taN3rd/node-cdxj
Parse CDXJ (https://github.com/oduwsdl/ORS/wiki/CDXJ) files with Node.js
shadowctrl/Farsky
Farsky | A Next.js app built from a web archive | Freelance project @upwork
tigercosmos/web-archives
Web Archives Collection System