web-archive
There are 42 repositories under the web-archive topic.
dosyago/dn
💾 dn - offline full-text search and archiving for your Chromium-based browser.
Ray-D-Song/web-archive
Free web archiving and sharing service that runs on Cloudflare.
webrecorder/replayweb.page
Serverless replay of web archives directly in the browser
webrecorder/browsertrix
Browsertrix is the hosted, high-fidelity, browser-based crawling service from Webrecorder designed to make web archiving easier and more accessible for all!
devanshbatham/ArchiveFuzz
Hunt down the secrets from the WebArchives for Fun and Profit
Own-Data-Privateer/hoardy-web
Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing, replay, mirroring, data scraping, and/or indexing. Your own personal private Wayback Machine that can also archive HTTP POST requests and responses, as well as most other HTTP-level data.
internetarchive/cdx-summary
Summarize web archive capture index (CDX) files.
TarekJor/bookmark-archiver
🗄 Save an archived copy of websites from Pocket/Pinboard/Bookmarks/RSS. Outputs HTML, PDFs, and more...
webis-de/archive-query-log
📜 The Archive Query Log.
ShaunLWM/ark
🚢 A self-hosted, personal archival application
antiufo/Shaman.Dokan.Warc
Mounts WARC files on Windows
YGGverse/YGGo
YGGo! Distributed Web Search Engine
anjackson/sliver
A tool for collecting archival slivers of the web and web archives
oduwsdl/MementoMap
A Tool to Summarize Web Archive Holdings
ghobs91/Chronicl
Decentralized web archiver that distributes archives across Nostr relays
swve/gitstorykit
Build rich Git project history discovery apps with ease; used by Gitstory
minch-dev/DownTheMoon
A continuation of the legacy XUL version of DownThemAll! ✔️ preserves web.archive.org timestamps, ✔️ advanced filters for remote directory tree mirroring, ✔️ UI tweaked for better UX
ysdn-info/ysdn.info
An archive of the York/Sheridan Design Program
bottomless-archive-project/java-warc
Read Web ARChive (WARC) files in Java.
q-m/replayweb.page-docker
Docker image for ReplayWeb.page
ArtificialOSS/WebCrawl
Crawls the web to generate a huge dataset for training
ibnesayeed/utils
Miscellaneous utility scripts
india-ultimate/the-huddle
A mirror of The Huddle magazine
laxika/java-warc
Read Web ARChive (WARC) files in Java.
thiagolopes/alexandria
Backup and save websites
grey-land/warc-browser
A CLI toolkit for working with web archives
wdhdev/web-archiver
Easily scrape, download and preview websites.
AndreMor8/wubbzy-sites
Wubbzy archived sites
paulmelnikow/wabac
A versioned cache backed by cloud storage
sergio11/retrospect
Retrospect 🔍 is a cybersecurity tool that analyzes historical web snapshots 🕒 from the Wayback Machine, uncovering vulnerabilities 🛡️, sensitive data leaks 🔓, and security misconfigurations 🛠️. It empowers security pros to predict and mitigate threats ⚠️ before they become exploitable.
shadowctrl/Palaceradio
PalaceRadio | A Next.js app built from a web archive | Freelance Project @upwork
wayback-if-down/wayback-if-down.github.io
Redirect to a live website or an archived version if it's down.
meadowingc/waybacker
Periodically crawl a set of websites and ensure that all of their pages are archived on the Wayback Machine. Mirror of https://codeberg.org/meadowingc/waybacker
s5-dev/archiver
Tool to archive websites and other content available on the Internet on the content-addressed S5 Network
extua/wacksy
An experimental library for reading and writing WACZ files
shadowctrl/Farsky
Farsky | A Next.js app built from a web archive | Freelance Project @upwork