Take a look at the process of bypassing CAPTCHAs when collecting public data from Amazon with Amazon Scraper API (one-week free trial). You can find the full guide on our blog.
A plain requests script like this one, even with browser-like headers, will likely encounter a CAPTCHA.
import requests

# Browser-like headers; on their own they usually do not prevent a CAPTCHA.
custom_headers = {
    "Accept-language": "en-GB,en;q=0.9",
    "Accept-Encoding": "gzip, deflate, br",
    "Cache-Control": "max-age=0",
    "Connection": "keep-alive",
    "User-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15",
}

url = "https://www.amazon.com/SAMSUNG-Border-Less-Compatible-Adjustable-LS24AG302NNXZA/dp/B096N2MV3H?ref_=Oct_DLandingS_D_fe3953dd_2"

response = requests.get(url, headers=custom_headers)

# Save the returned HTML so it can be inspected in a browser.
with open('with_captcha.html', 'w') as file:
    file.write(response.text)
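To tell whether the saved page is a CAPTCHA challenge rather than the product page, a rough text check can help. This is a minimal sketch, not a reliable detector: `looks_like_captcha` is a hypothetical helper name, and the marker strings are assumptions about what a challenge page typically contains, so they may need adjusting.

```python
# Hypothetical helper: guess whether an HTML page is a CAPTCHA challenge.
# The marker strings below are assumptions and may not match every challenge page.
CAPTCHA_MARKERS = (
    "/errors/validatecaptcha",
    "enter the characters you see below",
)


def looks_like_captcha(html: str) -> bool:
    """Return True if any assumed marker appears in the page (case-insensitive)."""
    lowered = html.lower()
    return any(marker in lowered for marker in CAPTCHA_MARKERS)


# Example usage with the response from the script above:
# print(looks_like_captcha(response.text))
```

A match only suggests a challenge page; opening with_captcha.html in a browser remains the surest check.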
Amazon Scraper API is designed to avoid CAPTCHAs. The request below sends the product URL to the API and asks for parsed results.
import requests
from pprint import pprint

# Job parameters: the target URL and a flag requesting parsed output.
payload = {
    'source': 'amazon',
    'url': 'https://www.amazon.com/dp/B096N2MV3H',
    'parse': True
}

# Replace 'username' and 'password' with your API credentials.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('username', 'password'),
    json=payload,
)

pprint(response.json())

# Save the raw JSON response for later use.
with open('without_captcha.json', 'w') as file:
    file.write(response.text)
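Once the response arrives, the parsed product data can be pulled out of the JSON. The sketch below assumes the response body holds a `results` list whose entries carry a `content` field; that layout is an assumption here, so confirm it against the technical documentation. `first_result_content` is a hypothetical helper name.

```python
from typing import Any, Optional


def first_result_content(data: dict) -> Optional[Any]:
    """Return the 'content' of the first entry in 'results', or None if absent.

    Assumes a layout like {"results": [{"content": ...}]}.
    """
    results = data.get("results") or []
    if not results:
        return None
    return results[0].get("content")


# Example usage with the response from the script above:
# content = first_result_content(response.json())
```

Returning None instead of raising keeps the calling script simple when a job comes back without results.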
Follow our technical documentation for all available API parameters.
If you run into any issues, please contact us at support@oxylabs.io.
Looking to scrape other Amazon data? See Amazon Review Scraper, Amazon ASIN Scraper, How to Scrape Amazon Prices, and Scraping Amazon Product Data.