UnsplashDownload

0x00 Introduction

Unsplash

Beautiful, free images and photos that you can download and use for any project. Better than any royalty free or stock photos.

0x01 Target

Collect the links to all images on a specified Unsplash page and download the images locally.

0x02 Requirements

  • scrapy - An open source and collaborative framework for extracting the data you need from websites
  • threadpool - Easy to use object-oriented thread pool framework

0x03 Run

1. Clone the repository and enter its root directory

git clone https://github.com/Explore-Space/UnsplashDownload.git
cd UnsplashDownload/

2. Install the requirements

pip3 install -r requirements.txt

3. Run the Scrapy spider to collect picture links and store them in the database

scrapy crawl unsplash

After the run, the generated database file is saved in database.

4. Run the multithreaded downloader

python3 download.py

After the run, the downloaded pictures are saved in images.

0x04 Note

1. By default, Scrapy collects links from the first three pages of Unsplash images. To change this, modify the values on lines 21 and 22 of scrapy_unsplash/spiders/unsplash.py.

2. By default, the multithreaded downloader runs 50 threads. To change this, modify the value on line 39 of download.py.
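
To illustrate what the thread count in the second note controls, here is a minimal sketch of a thread-pool downloader. It is not the contents of download.py: it swaps the threadpool package for the standard library's concurrent.futures, and the names download_all and fetch_bytes, the workers parameter, and the numbered .jpg file naming are all assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from urllib.request import urlopen


def fetch_bytes(url, timeout=30):
    """Download one URL and return its raw bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, out_dir="images", workers=50, fetch=fetch_bytes):
    """Download every URL into out_dir using a pool of worker threads.

    Returns a list of (url, path_or_None) pairs in input order;
    a download that fails with an OSError maps to None.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    def save(indexed_url):
        index, url = indexed_url
        # Hypothetical naming scheme: 00000.jpg, 00001.jpg, ...
        target = out / f"{index:05d}.jpg"
        try:
            target.write_bytes(fetch(url))
            return url, target
        except OSError:
            # Network and file errors are reported, not raised,
            # so one bad link does not stop the other downloads.
            return url, None

    # workers caps how many downloads run at the same time.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(save, enumerate(urls)))
```

Calling download_all(links, workers=10) with a list of image links would lower the concurrency from 50 to 10. A smaller pool is gentler on the remote server; a larger one mostly helps when individual downloads are slow.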

0x05 Test

Test Environment:

  • OS: Ubuntu 18.04
  • Python version: 3.6.8
  • Scrapy version: 1.6
  • Last tested: 2019-07-08

0x06 Acknowledgements

Project reference: