Pinned Repositories
ArchiveBot
ArchiveBot, an IRC bot for archiving websites
grab-site
The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
IA.BAK
We back up a lot of stuff from around the web; now it's time to back up the Internet Archive, just in case.
NewsGrabber
Grabbing all news.
parler-grab
Archiving Parler.
seesaw-kit
Making a reusable toolkit for writing seesaw scripts
terroroftinytown
URLTeam's second generation of URL shortener archiving tools
warrior-dockerfile
A Dockerfile for the ArchiveTeam Warrior
wget-lua
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
wpull
Wget-compatible web downloader and crawler.
Archive Team's Repositories
ArchiveTeam/ArchiveBot
ArchiveBot, an IRC bot for archiving websites
ArchiveTeam/usgovernment-grab
Archiving parts of the US government.
ArchiveTeam/warrior4-vm
Warrior virtual machine appliance (version 4)
ArchiveTeam/urls-grab
Archiving URLs (outlinks) from a variety of sources.
ArchiveTeam/megawarc
Nondestructive warc-in-tar to warc conversion
ArchiveTeam/archiveteam-megawarc-factory
Some scripts to process ArchiveTeam uploads
ArchiveTeam/goo-gl-grab
Archiving goo.gl, the Google URL Shortener.
ArchiveTeam/warrior-hq
Warrior HQ API
ArchiveTeam/goo-gl-items
Managing items for goo-gl-grab.
ArchiveTeam/voiceofamerica-grab
Archiving Voice of America (all sites).
ArchiveTeam/voiceofamerica-items
Managing items for voiceofamerica-grab.
ArchiveTeam/zhubai-grab
Archiving 竹白 (Zhubai).
ArchiveTeam/arzon-grab
Archiving arzon.jp.
ArchiveTeam/arzon-items
Managing items for arzon-grab.
ArchiveTeam/dailymotion-grab
Archiving Dailymotion.
ArchiveTeam/dailymotion-items
Managing items for dailymotion-grab.
ArchiveTeam/grab-tools
Tools for *-grab projects.
ArchiveTeam/ipcc-items
Managing items for ipcc-grab.
ArchiveTeam/ipccdata-grab
Archiving ipcc-data.org.
ArchiveTeam/livestream-grab
Archiving livestream.com.
ArchiveTeam/mahonoiland-grab
Archiving 魔法のiらんど (Maho-no iLand).
ArchiveTeam/plimit
Redis-based parallelism limiter
ArchiveTeam/radiofreeeurope-grab
Archiving Radio Free Europe/Radio Liberty (aka RFERL aka RFE/RL)
ArchiveTeam/radiofreeeurope-items
Managing items for radiofreeeurope-grab.
ArchiveTeam/retrospring-grab
Archiving Retrospring.
ArchiveTeam/sonetblog-grab
Archiving SSブログ (So-net Blog).
ArchiveTeam/torrentgalaxy-grab
Archiving TorrentGalaxy.
ArchiveTeam/torrentgalaxy-items
Managing items for torrentgalaxy-grab.
ArchiveTeam/usgovernment-items
Managing items for usgovernment-grab.
ArchiveTeam/zhubai-items
Managing items for zhubai-grab.