/cfcrawler

Cloudflare scraper and cralwer written in Async, In-place library for HTTPX. Crawl website that has cloudflare enabled, easier than ever!

Primary LanguagePythonMIT LicenseMIT

cfcrawler

Release Build status codecov Commit activity License

Cloudflare scraper and cralwer written in Async, In-place library for HTTPX. Crawl website that has cloudflare enabled, easier than ever!

Getting started

To use library, simply replace your aiohttp client with ours!

from cfcrawler import AsyncClient

async def get(url):
    client = AsyncClient()
    await client.get(url)

You can also rotate user agents

from cfcrawler import AsyncClient

client = AsyncClient()
client.rotate_useragent()

You can also specify which browser you want to use

from cfcrawler.types import Browser
from cfcrawler import AsyncClient

AsyncClient(browser=Browser.CHROME)

You can also use asyncer to syncify the implementation

from cfcrawler import AsyncClient
from asyncer import syncify

def get(url):
    client = AsyncClient()
    syncify(client.get)(url)

Coming Next

  1. CF JS Challenge solver
  2. Captcha solver integration (2Captcha and etc)

Contribution

I'll work on this library in few months, I don't have free time right now, but feel free to contribute. I'll check and test the PRs myself!