Prado Museum's Project

Scrape data and images from the Prado Museum website to build a dataset for training a Generative Adversarial Network (GAN).

Website: https://www.museodelprado.es/coleccion/obras-de-arte

Steps:

Download the listing pages in both ascending and descending order to work around the 10,000-result limit of the pagination, then normalize (canonicalize) the URLs and remove duplicates.

sh get_pages.sh

python parse_pages.py

sh get_works.sh

python parse_works.py
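
The URL normalization and deduplication mentioned in the first step could look roughly like the sketch below. It is not taken from the scripts above: the function names `canonicalize` and `merge_unique` are hypothetical, and it assumes that the query string and fragment do not matter for identifying a work page, which may not hold for the real site.

```python
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url: str) -> str:
    """Return a normalized form of `url` so that equivalent links compare equal."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    netloc = parts.netloc.lower()
    # Treat ".../work/" and ".../work" as the same page.
    path = parts.path.rstrip("/") or "/"
    # Assumption: query string and fragment are irrelevant, so both are dropped.
    return urlunsplit((scheme, netloc, path, "", ""))


def merge_unique(ascending, descending):
    """Merge two URL lists, keeping the first occurrence of each canonical URL."""
    seen = set()
    unique = []
    for url in list(ascending) + list(descending):
        key = canonicalize(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


if __name__ == "__main__":
    # Hypothetical example URLs, for illustration only.
    asc = [
        "https://www.museodelprado.es/coleccion/obra-de-arte/a/",
        "https://www.museodelprado.es/coleccion/obra-de-arte/b",
    ]
    desc = [
        "https://WWW.museodelprado.es/coleccion/obra-de-arte/b#top",
        "https://www.museodelprado.es/coleccion/obra-de-arte/c",
    ]
    for url in merge_unique(asc, desc):
        print(url)
```

Because the ascending and descending listings overlap whenever the collection has fewer than 20,000 entries, merging through a set of canonical URLs keeps each work exactly once while preserving the order in which it was first seen.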

salvacarrion/prado-downloader