ChrisMuir/Zillow

CAPTCHA is immediate and impossible to solve

lionhive opened this issue · 7 comments

The crawler runs for me, but the CAPTCHA comes up immediately, and it's very hard to solve. It almost seems like they're using a CAPTCHA that is designed to just waste time and not be solvable. Anyone else seeing this problem? Occasionally I can pass the CAPTCHA and get some data, but this is very hard to achieve.

Hi @lionhive

Please see issues #9 and #13. I've seen this in the past, where the CAPTCHA would just reload continuously and never disappear, but it was quite rare. Are you saying this is happening to you most/all of the times the CAPTCHA appears?

I'm seeing this immediately too. I tried solving all of them (which took a long time), but after I verified it, it immediately threw up another CAPTCHA. I think it's getting smarter, unfortunately :(

I have the same problem: once I verified it, it immediately threw up another CAPTCHA. Have you solved this problem?

@MathrewLing This is a known issue that I'm not planning on trying to fix. I've essentially walked away from this repo. The top of the README includes a note indicating this.

@ChrisMuir Thanks for your reply. I'll keep trying. Thank you anyway.

This is happening to me when I hit the site via Chrome (91.0.4472.114 (Official Build) (64-bit)). I am not running a scraper; this is regular old manual access. I was doing fine for 10-20 minutes, just poking around looking at what was available, and then it just locked up on me.

Brought up Firefox and it seems to work OK. No CAPTCHA issues. Weird.

SOLVED!!!!!!!!!

You need to make sure cookies can be saved. This got me past the CAPTCHA. The path has to be fully qualified or Chrome complains.

[Example]

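import os
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Chrome needs a fully qualified path for the profile directory.
sel_path = os.path.join(os.getcwd(), 'selenium')
chrome_options = Options()
# Reuse this profile directory so cookies are saved between sessions.
chrome_options.add_argument("--user-data-dir=" + sel_path)
driver = webdriver.Chrome(options=chrome_options)
driver.get(zillow_path)  # zillow_path: the Zillow URL you want to load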
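
In case it helps anyone, here is a rough sketch of a small helper that builds the fully qualified `user-data-dir` argument and creates the directory if it is missing, so you can check the path before launching Chrome. The function name `user_data_dir_argument` and the default folder name are my own placeholders, not something from this repo, and I have only assumed that passing an absolute path is what matters here.

```python
from pathlib import Path


def user_data_dir_argument(directory="selenium"):
    """Build a Chrome user-data-dir switch that points at an absolute path.

    Hypothetical helper: the name and default folder are illustrative only.
    """
    # resolve() turns a relative folder name into a fully qualified path.
    profile_dir = Path(directory).expanduser().resolve()
    # Create the folder up front so the profile has somewhere to be written.
    profile_dir.mkdir(parents=True, exist_ok=True)
    return "--user-data-dir=" + str(profile_dir)


if __name__ == "__main__":
    # Print the argument so the path can be inspected before starting Chrome.
    print(user_data_dir_argument("selenium"))
```

The returned string can then be passed to `chrome_options.add_argument(...)` in place of the hand-built one above.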