A collection of pipelines for Scrapy

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

Scrapy-Pipelines

Overview

Scrapy does not provide many example pipelines for different backends or databases, so this repository provides several to demonstrate sound usage, including:

  • MongoDB
  • Redis (todo)
  • InfluxDB (todo)
  • LevelDB (todo)

These pipelines also provide multiple ways to save or update items, and they return the id created by the backend.
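As a rough illustration of the save-and-return-id idea, here is a minimal sketch that follows Scrapy's item pipeline interface (`open_spider`, `process_item`, `close_spider`). It is not this package's actual implementation: `SaveAndReturnIdPipeline`, `InMemoryCollection`, and `InsertResult` are hypothetical names, and the in-memory collection only imitates the `insert_one(...).inserted_id` shape of a MongoDB driver so the example runs without a server.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class InsertResult:
    """Hypothetical result object exposing only `inserted_id`."""
    inserted_id: int


class InMemoryCollection:
    """Hypothetical stand-in for a database collection (assumption, not a real driver)."""

    def __init__(self):
        self.docs = {}
        self._ids = count(1)

    def insert_one(self, doc):
        # Generate an id the way a backend would, then store a copy of the document.
        _id = next(self._ids)
        self.docs[_id] = {**doc, "_id": _id}
        return InsertResult(inserted_id=_id)


class SaveAndReturnIdPipeline:
    """Sketch of a pipeline that saves each item and hands back the backend id."""

    def __init__(self, collection_factory=InMemoryCollection):
        self.collection_factory = collection_factory
        self.collection = None

    def open_spider(self, spider):
        # A real pipeline would connect to its backend here.
        self.collection = self.collection_factory()

    def process_item(self, item, spider):
        result = self.collection.insert_one(dict(item))
        # Attach the backend-generated id so later pipelines can use it.
        item["_id"] = result.inserted_id
        return item

    def close_spider(self, spider):
        # A real pipeline would close its connection here.
        self.collection = None


pipeline = SaveAndReturnIdPipeline()
pipeline.open_spider(spider=None)
saved = pipeline.process_item({"title": "example"}, spider=None)
store = pipeline.collection  # keep a reference before the pipeline is closed
pipeline.close_spider(spider=None)
print(saved["_id"])
```

Passing the collection in through a factory keeps the backend swappable, so the same pipeline shape could wrap a real database client in place of the in-memory stand-in.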

Requirements

  • Python 3.6+
  • Works on Linux, Windows, and macOS

Installation

The quick way:

pip install scrapy-pipelines

For more details see the installation section in the documentation: https://scrapy-pipelines.readthedocs.io/en/latest/intro/installation.html

Documentation

Documentation is available online at https://scrapy-pipelines.readthedocs.io/en/latest/ and in the docs directory.

Community (blog, Twitter, mailing list, IRC)

This section is kept the same as Scrapy's, with the intent of giving back to Scrapy.

See https://scrapy.org/community/

Contributing

This section is kept the same as Scrapy's to make it easier to merge this repository back into Scrapy.

See https://doc.scrapy.org/en/master/contributing.html

Code of Conduct

Please note that this project is released with a Contributor Code of Conduct (see https://github.com/scrapy/scrapy/blob/master/CODE_OF_CONDUCT.md).

By participating in this project you agree to abide by its terms. Please report unacceptable behavior to opensource@scrapinghub.com.

Companies using Scrapy

This section is kept the same as Scrapy's, with the intent of giving back to Scrapy.

See https://scrapy.org/companies/

Commercial Support

This section is kept the same as Scrapy's, with the intent of giving back to Scrapy.

See https://scrapy.org/support/

TODO

  • [X] Add index creation in open_spider()
  • [X] Add item_completed method
  • [X] Add signals that return the MongoDB document's id
  • [ ] Add MongoDB document update
  • [ ] Add Docker support for Percona Server for MongoDB
  • [ ] Add Redis support
  • [ ] Add InfluxDB support
  • [ ] Add LevelDB support