Anorov/cloudflare-scrape

Site protected with Cloudflare returns 503, cannot use cfscrape.get_tokens

marinalan opened this issue · 0 comments

  • [x] I've upgraded cfscrape with pip install -U cfscrape
  • [x] I'm using Node version 10 or higher
  • [x] The site protection I'm having issues with is from Cloudflare
  • I'm not using Tor, a VPN, or an anonymizing proxy

Python version number

Run python --version and paste the output below:
Python 3.8.10


cfscrape version number

Run pip show cfscrape and paste the output below:

Name: cfscrape
Version: 2.1.1
Summary: A simple Python module to bypass Cloudflare's anti-bot page. See https://github.com/Anorov/cloudflare-scrape for more information.
Home-page: https://github.com/Anorov/cloudflare-scrape
Author: Anorov
Author-email: anorov.vorona@gmail.com
License: UNKNOWN
Location: /home/marina/.venvs/soup/lib/python3.8/site-packages
Requires: requests
Required-by:

Code snippet involved with the issue

import cfscrape
from scrapy import Request
url = "https://www.screencountry.com/index.php?section=products&model=THINKPAD%20X13%20YOGA%2020W8002MRA&brand=Lenovo&series=CHROMEBOOK"
token, agent = cfscrape.get_tokens(url,"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible;")

Complete exception and traceback

(If the problem doesn't involve an exception being raised, leave this blank)

2022-05-19 18:21:20 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.screencountry.com:443
2022-05-19 18:21:20 [urllib3.connectionpool] DEBUG: https://www.screencountry.com:443 "GET /index.php?section=products&model=THINKPAD%20X13%20YOGA%2020W8002MRA&brand=Lenovo&series=CHROMEBOOK HTTP/1.1" 503 None
2022-05-19 18:21:20 [root] ERROR: 'https://www.screencountry.com/index.php?section=products&model=THINKPAD%20X13%20YOGA%2020W8002MRA&brand=Lenovo&series=CHROMEBOOK' returned an error. Could not collect tokens.
Traceback (most recent call last):
  File "/home/marina/.venvs/soup/lib/python3.8/site-packages/cfscrape/__init__.py", line 249, in solve_challenge
    javascript = re.search(r'\<script type\=\"text\/javascript\"\>\n(.*?)\<\/script\>',body, flags=re.S).group(1) # find javascript
AttributeError: 'NoneType' object has no attribute 'group'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/marina/.venvs/soup/lib/python3.8/site-packages/cfscrape/__init__.py", line 383, in get_tokens
    resp = scraper.get(url, **kwargs)
  File "/home/marina/.venvs/soup/lib/python3.8/site-packages/requests/sessions.py", line 542, in get
    return self.request('GET', url, **kwargs)
  File "/home/marina/.venvs/soup/lib/python3.8/site-packages/cfscrape/__init__.py", line 129, in request
    resp = self.solve_cf_challenge(resp, **kwargs)
  File "/home/marina/.venvs/soup/lib/python3.8/site-packages/cfscrape/__init__.py", line 204, in solve_cf_challenge
    answer, delay = self.solve_challenge(body, domain)
  File "/home/marina/.venvs/soup/lib/python3.8/site-packages/cfscrape/__init__.py", line 290, in solve_challenge
    raise ValueError(
ValueError: Unable to identify Cloudflare IUAM Javascript on website. Cloudflare may have changed their technique, or there may be a bug in the script.

Please read https://github.com/Anorov/cloudflare-scrape#updates, then file a bug report at https://github.com/Anorov/cloudflare-scrape/issues."
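
The first traceback suggests that the regex on line 249 found no inline challenge script in the response body, so re.search returned None and .group(1) raised the AttributeError. Below is a minimal sketch of that failure mode. The pattern is copied from the traceback, but extract_challenge_js and both sample bodies are hypothetical illustrations, not part of cfscrape and not the actual HTML served by the site.

```python
import re

# Pattern copied verbatim from the traceback (cfscrape/__init__.py, line 249).
CHALLENGE_JS = re.compile(
    r'\<script type\=\"text\/javascript\"\>\n(.*?)\<\/script\>', flags=re.S
)


def extract_challenge_js(body):
    """Hypothetical helper: return the inline script text, or None if absent."""
    match = CHALLENGE_JS.search(body)
    # Guarding on None avoids "'NoneType' object has no attribute 'group'".
    return match.group(1) if match else None


# Hypothetical page bodies, made up for illustration only.
old_style = '<script type="text/javascript">\nvar a = 1;\n</script>'
new_style = '<script src="/challenge.js"></script>'

print(extract_challenge_js(old_style))  # inline script text is found
print(extract_challenge_js(new_style))  # None: the pattern does not match
```

If the real 503 page no longer contains an inline script of the old form, this would be consistent with the "Cloudflare may have changed their technique" message in the ValueError.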

URL of the Cloudflare-protected page

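https://www.screencountry.com/index.php?section=products&model=THINKPAD%20X13%20YOGA%2020W8002MRA&brand=Lenovo&series=CHROMEBOOK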

URL of Pastebin/Gist with HTML source of protected page

https://www.screencountry.com/index.php?section=products&model=THINKPAD%20X13%20YOGA%2020W8002MRA&brand=Lenovo&series=CHROMEBOOK