google-images-scraper

A command-line Google Images scraper that automates downloading large, high-quality image datasets for machine learning engineers. Built using Selenium WebDriver and urllib.request.

Primary language: Python. License: MIT.

⚒️ Built using

  • Selenium WebDriver
  • urllib.request
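
The download half of that stack can be sketched with the standard library alone. This is a minimal illustration, not the project's actual implementation: `build_search_url` and `download_image` are hypothetical helper names, and the `tbm=isch` query parameter is an assumption about how a Google Images search URL is formed. In the real tool, Selenium WebDriver would presumably load the search page and collect image URLs before anything is downloaded.

```python
import shutil
import urllib.parse
import urllib.request
from pathlib import Path


def build_search_url(query: str) -> str:
    """Return a Google Images search URL for the query (assumed URL format)."""
    params = urllib.parse.urlencode({"q": query, "tbm": "isch"})
    return f"https://www.google.com/search?{params}"


def download_image(url: str, directory: str, filename: str) -> Path:
    """Fetch one URL with urllib.request and save it under directory."""
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    # Stream the response body to disk instead of loading it into memory.
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

Streaming with `shutil.copyfileobj` keeps memory use flat, which matters when a dataset contains many large images.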

⚠️ Prerequisites

⚙️ Installation

  1. Download the project

    Debian GNU/Linux Bash

    wget -O google-images-scraper-master.zip https://github.com/yussuf-codes/google-images-scraper/archive/master.zip
    sudo apt install unzip
    unzip google-images-scraper-master.zip
  2. Navigate to source directory

    cd ./google-images-scraper-master/src/
  3. Create a virtual environment

    Debian GNU/Linux Bash

    python3.10 -m venv .venv

    Windows PowerShell

    python -m venv .venv
  4. Activate the virtual environment

    Debian GNU/Linux Bash

    source ./.venv/bin/activate

    Windows PowerShell

    Set-ExecutionPolicy -ExecutionPolicy Unrestricted -Scope CurrentUser
    .\.venv\Scripts\Activate.ps1
  5. Install the requirements

    pip install -r ./google_images_scraper/requirements.txt
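
If the requirements end up installed system-wide instead of in `.venv`, the activation step probably did not take effect. A quick way to check, sketched here with a hypothetical helper name `in_virtualenv`, is to compare `sys.prefix` with `sys.base_prefix` from the interpreter you are about to use:

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```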

🚀 Run

Debian GNU/Linux Bash

python3.10 main.py -q <SEARCH_QUERY> -n <NUMBER_OF_IMAGES> -d <IMAGES_DIRECTORY>

Windows PowerShell

python main.py -q <SEARCH_QUERY> -n <NUMBER_OF_IMAGES> -d <IMAGES_DIRECTORY>
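
The three flags map naturally onto `argparse`. The sketch below shows one way `main.py` could read them. It is an assumption about the interface, not the project's actual code: the function name `parse_args` and the attribute names `query`, `count`, and `directory` are hypothetical.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the -q, -n and -d flags shown in the run commands above."""
    parser = argparse.ArgumentParser(
        description="Download images from a Google Images search."
    )
    parser.add_argument("-q", dest="query", required=True,
                        help="search query")
    parser.add_argument("-n", dest="count", type=int, required=True,
                        help="number of images to download")
    parser.add_argument("-d", dest="directory", required=True,
                        help="directory to save the images in")
    # argv=None makes argparse fall back to sys.argv[1:].
    return parser.parse_args(argv)
```

A search query containing spaces must be quoted in the shell, for example `-q "red panda"`, so that it reaches the program as a single argument.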

:octocat: Repository Structure

.
├── LICENSE
├── README.md
└── src
    ├── google_images_scraper
    │   ├── __init__.py
    │   ├── implementation.py
    │   └── requirements.txt
    └── main.py

📄 License

Distributed under the MIT License. See LICENSE for more information.

❤️ Show your support

Please ⭐️ this repository if this project helped you!