The module behaves the same as the requests module and does not help against Cloudflare (see code below)

Question

Mcklmo opened this issue 2 years ago · 0 comments

Before creating an issue, first upgrade cfscrape with pip install -U cfscrape and see if you're still experiencing the problem. Please also confirm your Node version (node --version or nodejs --version) is version 10 or higher.

Make sure the website you're having issues with is actually using anti-bot protection by Cloudflare and not a competitor like Imperva Incapsula or Sucuri. And if you're using an anonymizing proxy, a VPN, or Tor, Cloudflare often flags those IPs and may block you or present you with a captcha as a result.

Please confirm the following statements and check the boxes before creating an issue:

I've upgraded cfscrape with pip install -U cfscrape
I'm using Node version 10 or higher
The site protection I'm having issues with is from Cloudflare
I'm not using Tor, a VPN, or an anonymizing proxy

Python version number

Run python --version and paste the output below:
Python 3.11.2

cfscrape version number

Run pip show cfscrape and paste the output below:
Name: cfscrape
Version: 2.1.1
Summary: A simple Python module to bypass Cloudflare's anti-bot page. See https://github.com/Anorov/cloudflare-scrape for more information.
Home-page: https://github.com/Anorov/cloudflare-scrape
Author: Anorov
Author-email: anorov.vorona@gmail.com
License: UNKNOWN
Location: C:\Users\mh98\AppData\Local\Programs\Python\Python311\Lib\site-packages
Requires: requests
Required-by:

Code snippet involved with the issue

from bs4 import BeautifulSoup
import cfscrape

valid = True
cnt = 0
url = 'https://www.cdp.net/en/responses?queries%5Bname%5D=nike'

# send requests until the scraper protection kicks in
while valid:
    cnt += 1
    print(cnt)

    # scrape
    scraper = cfscrape.create_scraper()
    res = scraper.get(url)
    soup = BeautifulSoup(res.content, 'html.parser')
    table = soup.find('table', class_='sortable_table')

    # If the protection is activated, the table will not be found, so exit the loop.
    # This takes approx. 40 requests.
    if table is None:
        valid = False
        print('scraper protection kicked in')
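
A note on the check in the snippet above: a missing table only shows that the expected markup was absent, not that Cloudflare was the cause. The following is a minimal sketch of a more explicit check, not part of cfscrape. The helper name `classify_response`, the status codes it treats as a block (403, 429, 503), and the reliance on a `Server: cloudflare` response header are all assumptions chosen for illustration and may not match how this particular site responds.

```python
def classify_response(status_code, headers, body, marker="sortable_table"):
    """Return 'ok', 'cloudflare_block', or 'unexpected' for one HTTP response.

    Hypothetical helper: the status codes and the Server-header test are
    heuristics assumed for this sketch, not behaviour documented by cfscrape.
    """
    # Normalise header names so a plain dict works as well as requests' headers.
    lowered = {str(k).lower(): str(v) for k, v in dict(headers).items()}
    served_by_cloudflare = lowered.get("server", "").lower().startswith("cloudflare")

    # An error status answered by Cloudflare itself is treated as a block.
    if status_code in (403, 429, 503) and served_by_cloudflare:
        return "cloudflare_block"
    # A normal page must contain the marker (here: the table's CSS class).
    if status_code == 200 and marker in body:
        return "ok"
    # Anything else (layout change, other error, other protection vendor).
    return "unexpected"
```

Inside the loop this could be called as `classify_response(res.status_code, res.headers, res.text)`, so that the printed message distinguishes a Cloudflare response from any other reason the table might be missing.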

Complete exception and traceback

(If the problem doesn't involve an exception being raised, leave this blank)

URL of the Cloudflare-protected page

https://www.cdp.net/en/responses?queries%5Bname%5D=nike

URL of Pastebin/Gist with HTML source of protected page

https://gist.github.com/Mcklmo/7a840a9a8c0360dd5ad04cfe4a3d1b7d