Download images from Google, Bing, and Baidu.

Image Downloader

1. Introduction

Crawl and download images using Selenium + PhantomJS. Built with Python 3 and PyQt5.

2. Key features

  • Supported search engines: Google, Bing, Baidu.
  • Keywords can be typed in directly, or read from a line-separated keyword list file for batch processing.
  • Images are downloaded using a customizable number of threads.
  • Full support for conditional search operators (e.g. filetype:, site:).
  • Switch for Google safe mode.
  • Proxy configuration (socks, http).
  • Both command-line (CMD) and GUI interfaces are provided.
  • Prebuilt Windows executable available from the release page.
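
The threaded download feature listed above can be pictured with a small sketch. This is not the project's own code: `download_all` and the `fetch` callable are hypothetical names, and the standard-library `ThreadPoolExecutor` is assumed here as a stand-in for whatever threading the tool actually uses.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(urls, fetch, num_threads=4):
    """Fetch every URL with a pool of worker threads.

    `fetch` is any callable that takes a URL and returns bytes
    (hypothetical; for example a thin wrapper around urllib).
    Returns two dicts keyed by URL: (results, errors).
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Map each submitted future back to the URL it is fetching.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                # One failed image should not abort the whole batch.
                errors[url] = exc
    return results, errors
```

Raising `num_threads` mainly helps because image downloads are I/O-bound; the best value depends on the network and on how the search engine's hosts throttle requests.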

3. Install dependencies

3.1 Windows

3.1.1 Download and install Python3.5

Download the latest version of the Python 3.5 installer from here

3.1.2 Download and install PyQt5

Download the latest version of the PyQt5 installer from here

3.1.3 Download and setup phantomjs

The official PhantomJS prebuilt executable can be downloaded from here

Then copy phantomjs.exe to ${project_directory}/bin/

3.1.4 Install python packages

pip3.exe install -r requirements.txt

3.1.5 Build one-file .exe bundle

pip3.exe install pyinstaller
mkdir bin

Copy the phantomjs.exe downloaded in 3.1.3 into the ./bin folder.

pyinstaller image_downloader_gui.spec

The bundle will be built in the ./dist folder.

3.2 Linux

3.2.1 Install dependent packages

apt-get install python3-pip python3-pyqt5 pyqt5-dev-tools

3.2.2 Download and setup phantomjs

  • For PC users

The official PhantomJS prebuilt executable can be downloaded from here
[Warning]: PhantomJS installed from the Ubuntu repositories via apt-get does not work with this project.

  • For Raspberry Pi users

An unofficial PhantomJS prebuilt executable or .deb for Raspberry Pi can be downloaded from here

Add the path of phantomjs executable to $PATH, or simply copy it to /usr/local/bin/.
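
To confirm that the executable is actually reachable after this step, a quick check along these lines may help. It is only a sketch: `find_phantomjs` is a hypothetical helper name, and it relies on the standard-library `shutil.which`, which searches the directories in $PATH.

```python
import shutil


def find_phantomjs(name="phantomjs"):
    """Return the full path of the executable if it is on $PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_phantomjs()
    if path is None:
        print("phantomjs not found on $PATH; "
              "copy it to /usr/local/bin/ or extend $PATH")
    else:
        print("phantomjs found at", path)
```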

3.2.3 Install python packages

pip3 install -r requirements.txt

4. Usage

4.1 GUI

4.2 CMD

usage: image_downloader.py [-h] [--engine {Google,Bing,Baidu}]
                           [--max-number MAX_NUMBER]
                           [--num-threads NUM_THREADS] [--timeout TIMEOUT]
                           [--output OUTPUT] [--safe-mode] [--face-only]
                           [--proxy_http PROXY_HTTP]
                           [--proxy_socks5 PROXY_SOCKS5]
                           keywords
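
As an illustration of how these options combine, the sketch below assembles a command line from Python. `build_command` is a hypothetical helper, not part of the project; it uses only flags that appear in the usage text above, and it assumes the script is run from the project directory with the current Python interpreter.

```python
import sys


def build_command(keywords, engine="Google", max_number=None,
                  num_threads=None, output=None, safe_mode=False):
    """Build an argv list for image_downloader.py (flags taken from the usage text)."""
    cmd = [sys.executable, "image_downloader.py", "--engine", engine]
    if max_number is not None:
        cmd += ["--max-number", str(max_number)]
    if num_threads is not None:
        cmd += ["--num-threads", str(num_threads)]
    if output is not None:
        cmd += ["--output", output]
    if safe_mode:
        cmd.append("--safe-mode")
    # The positional keywords argument goes last, as one argv entry,
    # so conditional operators such as filetype: stay attached to it.
    cmd.append(keywords)
    return cmd


if __name__ == "__main__":
    print(build_command("cat filetype:png", engine="Bing", max_number=50))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; passing a list rather than a single string avoids shell quoting problems with keywords that contain spaces.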

License

  • MIT License
  • 996ICU License