Athlon1600/SerpScraper

Google gives a fake 200 HTTP code

nsetiono opened this issue · 1 comment

Hello,

I'm also finding that Google sends a 200 HTTP code for a page that is not the real results page. See this result:

[1] => Array
        (
            [url] => https://www.google.co.id/search?gl=id&q=domain+murah&oq=domain+murah&start=0&client=psy-ab&gbv=1&complete=0&num=100&pws=0&nfpr=1&ie=utf-8&oe=utf-8
            [referer] => https://www.google.com/sorry/index?continue=https://www.google.co.id/search%3Fgl%3Did%26q%3Ddomain%2Bmurah%26oq%3Ddomain%2Bmurah%26start%3D0%26client%3Dpsy-ab%26gbv%3D1%26complete%3D0%26num%3D100%26pws%3D0%26nfpr%3D1%26ie%3Dutf-8%26oe%3Dutf-8&q=EgRrmMAfGLGDj_oFIhkA8aeDS5MOwTuCq_WmvtVEq7jIxMQJPHNLMgFy
            [keyword] => domain murah
            [handle] => Resource id #146
            [error] => 
            [status] => 200
            [proxy] => 
            [proxyauth] => 
            [proxytype] => CURLPROXY_HTTP
            [useragent] => Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1.1 Safari/605.1.15
            [cookies] => D:\htdocs\cookiefiles\gtc-b27198b55927f24de32de20c8c2d9e36.txt
            [body] => <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head><meta http-equiv="content-type" content="text/html; charset=utf-8"><meta name="viewport" content="initial-scale=1"><title>https://www.google.com/sorry/index</title></head>
<body style="font-family: arial, sans-serif; background-color: #fff; color: #000; padding:20px; font-size:18px;" onload="e=document.getElementById('captcha');if(e){e.focus();}">
<div style="max-width:400px;">
<hr noshade size="1" style="color:#ccc; background-color:#ccc;"><br>
<div style="font-size:13px;">
Our systems have detected unusual traffic from your computer network.  Please try your request again later.  <a href="#" onclick="document.getElementById('infoDiv0').style.display='block';">Why did this happen?</a><br><br>
<div id="infoDiv0" style="display:none; background-color:#eee; padding:10px; margin:0 0 15px 0; line-height:1.4em;">
This page appears when Google automatically detects requests coming from your computer network which appear to be in violation of the <a href="//www.google.com/policies/terms/">Terms of Service</a>. The block will expire shortly after those requests stop.<br><br>This traffic may have been sent by malicious software, a browser plug-in, or a script that sends automated requests.  If you share your network connection, ask your administrator for help &mdash; a different computer using the same IP address may be responsible.  <a href="//support.google.com/websearch/answer/86640">Learn more</a><br><br>Sometimes you may see this page if you are using advanced terms that robots are known to use, or sending requests very quickly.
</div><br>

IP address: 107.152.192.31<br>Time: 2020-08-24T13:36:13Z<br>URL: https://www.google.com/sorry/index<br>
</div>
</div>
</body>
</html>

Even though the status is 200, Google is returning the wrong page.

Where is this array coming from? $google->search() should return a SearchResponse object, which SHOULD hold the correct status code. That 200 is coming from something in your own code.
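
If the custom code that builds this array cannot rely on the status code alone, one possible workaround is to inspect the referer and body for signs of the block page. The sketch below is only an illustration: looksLikeGoogleBlockPage() is a hypothetical helper, not part of SerpScraper, and it assumes the result array has the 'referer' and 'body' keys shown in the dump above. The marker strings are taken from the page quoted in this issue and may change on Google's side.

```php
<?php
// Hypothetical helper, not part of SerpScraper. Assumes a result array
// shaped like the dump in this issue ('referer' and 'body' keys).
function looksLikeGoogleBlockPage(array $result): bool
{
    $referer = (string) ($result['referer'] ?? '');
    $body    = (string) ($result['body'] ?? '');

    // The dump shows the block page URL in the referer and in the <title>.
    if (stripos($referer, 'google.com/sorry/') !== false) {
        return true;
    }
    if (stripos($body, 'google.com/sorry/') !== false) {
        return true;
    }

    // Wording copied from the block page quoted in this issue.
    return stripos($body, 'detected unusual traffic') !== false;
}

// Sample data modelled on the dump; the values are shortened placeholders.
$blocked = [
    'status'  => 200,
    'referer' => 'https://www.google.com/sorry/index',
    'body'    => '<title>https://www.google.com/sorry/index</title>',
];
$normal = [
    'status'  => 200,
    'referer' => '',
    'body'    => '<html><body>search results</body></html>',
];

var_dump(looksLikeGoogleBlockPage($blocked)); // bool(true)
var_dump(looksLikeGoogleBlockPage($normal));  // bool(false)
```

A result flagged this way can be treated as blocked and retried later or through another proxy, regardless of what the 'status' field says.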