A general purpose image spider based on Danbooru API.

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

DanbooruSpider

An efficient crawler built on Python's asynchronous features. It works with multiple image websites that use Danbooru as their backend.

This document is also available in Chinese.

Advantages

Universal

  • Currently supports multiple sites
  • The program was designed from the start to accommodate other download interfaces, so support for a new site can be added with only a small amount of code
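
As an illustration of how a new site could be added with little code, here is a minimal sketch. It is not taken from this project: the `Site` class, the `SITES` registry, and the `example` entry are hypothetical names. The only assumption about the backend is that posts are listed at `/posts.json` with `tags`, `page`, and `limit` query parameters, as on Danbooru itself.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class Site:
    """Hypothetical description of one Danbooru-style site."""

    name: str
    base_url: str
    posts_path: str = "/posts.json"  # listing endpoint used by Danbooru itself

    def posts_url(self, tags: str, page: int = 1, limit: int = 20) -> str:
        """Build the URL of one page of search results."""
        query = urlencode({"tags": tags, "page": page, "limit": limit})
        return f"{self.base_url.rstrip('/')}{self.posts_path}?{query}"


# Hypothetical registry: supporting another site is one more entry here.
SITES = {
    "danbooru": Site("danbooru", "https://danbooru.donmai.us"),
    # Placeholder host, shown only to illustrate overriding the endpoint path.
    "example": Site("example", "https://booru.example.org", posts_path="/post.json"),
}

print(SITES["danbooru"].posts_url("landscape", page=2))
```

Keeping the per-site differences in data rather than in code is one way to get the "small amount of code" property described above; the project's actual mechanism may differ.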

Efficient

This program uses Python's asynchronous programming features to make the most of available resources

  • All HTTP requests are made asynchronously through httpx, which keeps the program running efficiently
  • On the author's own Visual Studio Codespace:
    • Average download speed of up to 20 MiB/s with the default configuration
    • Memory footprint of at most 200 MiB
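
The asynchronous approach described above can be sketched roughly as follows. This is an illustrative sketch, not the project's actual code: `download_all`, `download_with_httpx`, and the concurrency limit of 8 are assumptions. A semaphore caps the number of requests in flight, which is one common way to keep memory use bounded while downloads overlap.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, Tuple

# A fetch function takes a URL and returns the response body.
Fetch = Callable[[str], Awaitable[bytes]]


async def download_all(urls: Iterable[str], fetch: Fetch, limit: int = 8) -> Dict[str, bytes]:
    """Download every URL concurrently, with at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def worker(url: str) -> Tuple[str, bytes]:
        async with semaphore:  # wait here while `limit` downloads are running
            return url, await fetch(url)

    pairs = await asyncio.gather(*(worker(url) for url in urls))
    return dict(pairs)


async def download_with_httpx(urls: Iterable[str]) -> Dict[str, bytes]:
    """Hypothetical entry point that plugs httpx into download_all."""
    import httpx  # third-party; listed as a dependency of this project

    # One shared client reuses connections across all requests.
    async with httpx.AsyncClient() as client:

        async def fetch(url: str) -> bytes:
            response = await client.get(url)
            response.raise_for_status()  # treat HTTP errors as failures
            return response.content

        return await download_all(urls, fetch)
```

It could be run with `asyncio.run(download_with_httpx([...]))`, passing a list of image URLs. Holding whole files in memory is a simplification for the sketch; streaming each response to disk would suit large images better.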

Reliable

Since the project's inception, pylance and mypy have been used for type checking and code format checking, and pydantic models have been used for runtime type validation.
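
To show what runtime validation with pydantic looks like, here is a small sketch. The `Post` model and `parse_post` helper are hypothetical and not taken from this project; the field names `id`, `file_url`, and `tag_string` follow those returned in Danbooru post data, but which fields the project actually models is an assumption.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Post(BaseModel):
    """Hypothetical model for one post returned by the API."""

    id: int
    file_url: Optional[str] = None  # may be absent for some posts
    tag_string: str = ""


def parse_post(raw: dict) -> Optional[Post]:
    """Return a validated Post, or None if the data does not match the model."""
    try:
        return Post(**raw)
    except ValidationError:
        return None


post = parse_post({"id": 1, "file_url": "https://example.org/a.png"})
print(post.id if post else "invalid")
```

Validating at the boundary means malformed API responses are rejected in one place instead of causing errors deep inside the download code.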

Deployment

Deploying and using this project is simple:

Prerequisites

  • Python 3.8 or higher

  • A complete Python standard library

  • A local copy of the project code

Install dependencies

Open the project folder and run the following command:

pip install -r requirements.txt

Run

python3 main.py

Configuration

For details, see the comments in the configuration file.