asyncrawler

Asyncio-based web crawling framework

Primary language: Python
License: GNU Lesser General Public License v3.0 (LGPL-3.0)

Features

  • Asynchronous downloading using aiohttp
  • Downloads cached locally in SQLite
  • Continue an interrupted crawl
  • Proxies
  • Cookies
  • Handle redirects
  • Retry 5XX errors
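
To illustrate how a few of the features above can fit together, here is a minimal standard-library sketch of a sqlite-backed download cache with 5XX retries. It is not asyncrawler's actual API: the names SqliteCache, download, fake_fetch and demo, the table layout, and the retry count are all assumptions made for illustration, and the network call is replaced by a stand-in coroutine so the sketch runs without aiohttp.

```python
import asyncio
import sqlite3


class SqliteCache:
    """Hypothetical URL -> body cache backed by sqlite (not asyncrawler's own class)."""

    def __init__(self, path=":memory:"):
        # Pass a file path instead of ":memory:" to keep downloads between runs.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, body BLOB)"
        )

    def get(self, url):
        row = self.conn.execute(
            "SELECT body FROM pages WHERE url = ?", (url,)
        ).fetchone()
        return row[0] if row else None

    def set(self, url, body):
        self.conn.execute(
            "INSERT OR REPLACE INTO pages (url, body) VALUES (?, ?)", (url, body)
        )
        self.conn.commit()


async def download(url, fetch, cache, retries=2):
    """Return the body for url, using the cache and retrying 5XX responses.

    fetch is an assumed coroutine that returns a (status, body) tuple.
    """
    cached = cache.get(url)
    if cached is not None:
        return cached  # already downloaded, e.g. before an interrupted crawl
    for _ in range(retries + 1):
        status, body = await fetch(url)
        if status < 500:
            break  # only server errors are retried
    if status == 200:
        cache.set(url, body)
        return body
    return None


async def demo():
    calls = []

    async def fake_fetch(url):
        # Stand-in for a real HTTP request: fails once with 503, then succeeds.
        calls.append(url)
        return (503, b"") if len(calls) == 1 else (200, b"<html>ok</html>")

    cache = SqliteCache()
    first = await download("http://example.com/", fake_fetch, cache)
    second = await download("http://example.com/", fake_fetch, cache)  # cache hit
    return first, second, len(calls)
```

Running asyncio.run(demo()) exercises one retry and one cache hit. Because the cache is keyed by URL, re-running a crawl against a file-backed database skips pages that were already fetched, which is one plausible way to continue an interrupted crawl.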

Example

>>> import asyncrawler
...

Install

Install from PyPI:

pip install asyncrawler

Or check out the latest version from the repository:

git clone https://github.com/richardpenman/asyncrawler