orangecoding/fredy

Use residential proxies for Immoscout

kami4ka opened this issue · 4 comments

I'd like to suggest an enhancement for Immoscout scraping.
From what I've observed, the standard ScrapingAnt proxies are unreliable at avoiding detection.

My suggestion is the following:

  1. Try using a standard proxy
  2. If detected, retry with a residential proxy

Alternatively, a few retries with standard proxies can be added before falling back to residential:

  1. Try using a standard proxy
  2. If detected, retry with a standard proxy up to N times
  3. If still detected, retry with a residential proxy

A residential request costs more, but it looks like it is still cheaper than repeated retries with standard proxies.
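
For illustration, the second variant could be sketched roughly like this in Python. This is only a sketch under assumptions: `fetch` and `is_blocked` are hypothetical callables supplied by the caller, and the `"standard"` / `"residential"` labels are placeholders, not actual ScrapingAnt parameter values.

```python
from typing import Callable

# Hypothetical signatures, not part of any real API:
#   fetch(url, proxy_type) -> HTML string
#   is_blocked(html) -> True if the page is a bot-detection page
Fetch = Callable[[str, str], str]
IsBlocked = Callable[[str], bool]


def fetch_with_fallback(
    url: str,
    fetch: Fetch,
    is_blocked: IsBlocked,
    standard_retries: int = 2,
) -> str:
    """Try a standard proxy first, then fall back to a residential one.

    The standard proxy is used for the initial attempt plus
    `standard_retries` retries. With standard_retries=0 this reduces
    to the first variant (one standard attempt, then residential).
    """
    for _ in range(1 + standard_retries):
        html = fetch(url, "standard")
        if not is_blocked(html):
            return html

    # All standard attempts were detected: spend credits on one residential call.
    html = fetch(url, "residential")
    if is_blocked(html):
        raise RuntimeError("Still detected, even with a residential proxy")
    return html
```

Keeping `standard_retries` configurable makes it easy to compare the credit cost of both variants before settling on one.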

That's a good idea. I'll have to add some failsafes nonetheless. However, using residential proxies eats up the credits for free users in about 4 calls, something they must be aware of.

@orangecoding Oh, not quite 4 calls :-)
It's 40 calls on the free plan, but yes, you're right, it's a costly solution.

@kami4ka is there any hint in the JSON I'm getting back from ScrapingAnt that we have hit a captcha?

As far as we can observe, yes.

  1. Currently, the blocked HTML content contains the following title tag: <title>Ich bin kein Roboter - ImmobilienScout24</title>
  2. The headline block contains the following text: Ich bin kein Roboter

I guess that either of those can be used for simple detection of the captcha.
Here is an example of the detected page: https://gist.github.com/kami4ka/efd1ed05c940c1eb549e172ca1b557fd
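
A minimal sketch of the title-based check, assuming the returned HTML is available as a string. The function name `is_captcha_page` is made up for illustration, and the marker text may change whenever ImmobilienScout24 updates its block page.

```python
import re

# Marker text taken from the blocked page's <title> as observed above.
CAPTCHA_MARKER = "Ich bin kein Roboter"

# Matches the content of the first <title> element, case-insensitively.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def is_captcha_page(html: str) -> bool:
    """Return True if the page title contains the captcha marker text."""
    match = TITLE_RE.search(html)
    return bool(match) and CAPTCHA_MARKER in match.group(1)
```

Checking only the title avoids false positives if the marker text ever shows up inside a regular listing; a plain `CAPTCHA_MARKER in html` would also cover the headline block, at the cost of being less strict.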