Python script to scrape images from Google Arts & Culture. It was written in response to a request on r/DataHoarder.
- Python 3
- Beautiful Soup 4.6.0
pip install -r requirements.txt
It should work on pages that define window.INIT_data.
python agac.py <url> [<output folder>]
python agac.py "https://www.google.com/culturalinstitute/beta/"
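The usage line above takes a required URL and an optional output folder. The following is only a rough sketch of how such arguments could be read with argparse; it is not taken from agac.py, and the `parse_args` name and the default of the current directory are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse '<url> [<output folder>]' (hypothetical helper, not from agac.py)."""
    parser = argparse.ArgumentParser(
        prog="agac.py",
        description="Download images from a Google Arts & Culture page.",
    )
    parser.add_argument("url", help="page to read images from")
    # nargs="?" makes the positional optional; "." as default is an assumption.
    parser.add_argument(
        "output_folder",
        nargs="?",
        default=".",
        help="folder to save images to (assumed default: current directory)",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(args.url, args.output_folder)
```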
It loads the HTML and parses window.INIT_data in the <script> element. Then it uses imghdr to determine the file extension if one is not provided.
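The two steps described above could look roughly like the sketch below. It is not the code in agac.py, and it rests on several assumptions: it uses the standard library's html.parser rather than Beautiful Soup so that it runs without extra packages, it treats anything URL-shaped inside the script text as a candidate because the real layout of window.INIT_data is not documented here, and it checks magic bytes by hand as a stand-in for imghdr, which was deprecated in Python 3.11 and removed in 3.13. All function and class names are hypothetical.

```python
import re
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collects the text content of every <script> element."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script bodies may arrive in several chunks, so append.
        if self._in_script:
            self.scripts[-1] += data


def find_init_data_script(html):
    """Return the text of the first <script> mentioning window.INIT_data."""
    parser = ScriptCollector()
    parser.feed(html)
    parser.close()
    for text in parser.scripts:
        if "window.INIT_data" in text:
            return text
    return None


# Assumption: URLs appear as plain quoted strings inside the script text.
URL_RE = re.compile(r"https?://[^\s\"'\\<>]+")


def extract_urls(script_text):
    """Return URL-shaped strings in order of appearance, without duplicates."""
    urls = []
    for url in URL_RE.findall(script_text):
        if url not in urls:
            urls.append(url)
    return urls


def guess_extension(data):
    """Guess an image type from leading bytes (stand-in for imghdr.what)."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None
```

Returning None for unrecognised bytes leaves it to the caller to decide whether to skip the file or save it without an extension.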