wayback

A Python script that bulk uploads the pages listed in a specified sitemap to the Internet Archive

Primary language: Jupyter Notebook

Credit

Inspired by this article: https://www.holisticseo.digital/python-seo/internet-archive/

I made only a slight variation: I use advertools to pull pages from a sitemap, and to_list() rather than apply() with a lambda function to extract the URLs to iterate over. That last choice is purely a matter of preference.

Also, a shout-out to Elias Dabbas for the advertools library, which has made sitemap handling in Python so much easier.
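
Sketch

For readers who want a feel for the sitemap-to-archive flow without opening the notebook, here is a minimal sketch. It is not the notebook's code: it swaps advertools and pandas for the Python standard library so that it runs with no extra installs. The function names (extract_urls, save_request_url, archive_all), the delay value, and the use of the plain https://web.archive.org/save/ GET endpoint are all assumptions made for illustration.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Assumption: the simple GET form of the Wayback Machine "Save Page Now" endpoint.
SAVE_PREFIX = "https://web.archive.org/save/"


def extract_urls(sitemap_xml):
    """Return every <loc> value from a sitemap document as a plain list."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def save_request_url(page_url):
    """Build the (assumed) Save Page Now request URL for one page."""
    return SAVE_PREFIX + page_url


def archive_all(urls, delay_seconds=5.0):
    """Request a capture of each URL in turn, pausing between requests.

    Hypothetical helper: it makes real network calls, so it is defined
    here but not invoked.
    """
    for url in urls:
        try:
            with urllib.request.urlopen(save_request_url(url), timeout=60) as response:
                print(url, response.status)
        except OSError as error:  # URLError and HTTPError are OSError subclasses
            print(url, "failed:", error)
        time.sleep(delay_seconds)


# A tiny inline sitemap standing in for a real one fetched from a site.
SAMPLE_SITEMAP = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "<url><loc>https://example.com/about</loc></url>"
    "</urlset>"
)

urls = extract_urls(SAMPLE_SITEMAP)
for url in urls:
    print(save_request_url(url))
```

The plain list returned by extract_urls plays the same role as the list produced by to_list() in the notebook: something simple to loop over. Calling archive_all(urls) would send the actual requests; the pause between them is a guess at polite behaviour, not a documented rate limit.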