Primary language: Python. License: MIT.

Archiving Google Arts & Culture

Python script to scrape images from Google Arts & Culture. It was written in response to a request on r/DataHoarder.

Requirements

  • Python 3
  • Beautiful Soup 4.6.0

Install

pip install -r requirements.txt

Usage

The script should work on any page that defines window.INIT_data in a <script> tag.

python agac.py <url> [<output folder>]
python agac.py "https://www.google.com/culturalinstitute/beta/"
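To see in advance whether a page is likely to be supported, you can check its HTML for a <script> that mentions window.INIT_data. The following is a minimal sketch of such a check, not the code in agac.py: it uses the standard library's html.parser in place of Beautiful Soup so that it runs without extra dependencies, and the names InitDataFinder and has_init_data are hypothetical.

```python
from html.parser import HTMLParser


class InitDataFinder(HTMLParser):
    """Collects the text of <script> tags that mention window.INIT_data.

    Hypothetical helper for illustration; agac.py itself uses Beautiful Soup.
    """

    def __init__(self):
        super().__init__()
        self._buffer = None   # list of text chunks while inside a <script>
        self.matches = []     # full text of every matching <script>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            text = "".join(self._buffer)
            if "window.INIT_data" in text:
                self.matches.append(text)
            self._buffer = None


def has_init_data(html):
    """Return True if any <script> in the HTML mentions window.INIT_data."""
    finder = InitDataFinder()
    finder.feed(html)
    finder.close()
    return bool(finder.matches)
```

A page for which has_init_data returns False is probably not one the script can handle.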

How does it work?

The script loads the page's HTML and parses the window.INIT_data assignment found in a <script> tag. If an image URL does not provide a file extension, it uses imghdr to determine one from the downloaded image data.
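The extension fallback can be sketched as follows. This is an illustration under stated assumptions, not the code in agac.py: because imghdr was deprecated in Python 3.11 and removed in 3.13, the sketch checks a few well-known magic-byte signatures directly as a stand-in for imghdr. The names guess_extension and filename_for, the ".bin" default, and the "image" fallback name are all hypothetical.

```python
import os
from urllib.parse import urlparse

# Leading bytes of a few common image formats (stand-in for imghdr).
SIGNATURES = [
    (b"\xff\xd8\xff", ".jpg"),
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
]


def guess_extension(data):
    """Return an extension based on the image's leading bytes, or None."""
    for signature, ext in SIGNATURES:
        if data.startswith(signature):
            return ext
    # WebP files start with "RIFF", four size bytes, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return ".webp"
    return None


def filename_for(url, data, default_ext=".bin"):
    """Pick a file name for downloaded data, sniffing only if needed."""
    name = os.path.basename(urlparse(url).path) or "image"
    root, ext = os.path.splitext(name)
    if ext:
        # The URL already provides an extension; keep it.
        return name
    return root + (guess_extension(data) or default_ext)
```

Trusting the extension in the URL first keeps the sniffing step as a fallback only, which matches the order described above.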