Skip images whose urls start with "https://encrypted-tbn0.gstatic.com/" because we are not able to download them

Question

Closed this issue 3 years ago · 0 comments

Hi, thanks for this helpful tool and the YouTube tutorial!
I am experiencing the same issue as #4.
It seems that we can't download images whose URLs start with "https://encrypted-tbn0.gstatic.com/".
(The scraper reports them as downloaded, but they aren't.)
So sometimes we request, say, 1000 images but only 5 are actually downloaded.
I found this and modified the code in GoogleImageScraper.py, lines 103-104.
It now skips the images we are not able to download, which suits my needs.

for image in images:
    # Only keep URLs that start with "http" and are not gstatic thumbnails,
    # which we are not able to download.
    src = image.get_attribute("src")
    if src and src.lower().startswith("http") and not src.startswith("https://encrypted-tbn0.gstatic.com/"):
        print("[INFO] %d. %s" % (count, src))
        image_urls.append(src)
        count += 1
        break
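
For anyone who wants to try the check outside the scraper, here is a minimal standalone sketch of the same filter. The names `BLOCKED_PREFIXES` and `is_downloadable` are hypothetical and not part of GoogleImageScraper.py, and the sample URLs are made up for illustration.

```python
# Hypothetical helper, not part of GoogleImageScraper.py.
# Prefixes of image URLs that we are not able to download.
BLOCKED_PREFIXES = ("https://encrypted-tbn0.gstatic.com/",)


def is_downloadable(src):
    """Return True if src is an http(s) URL that does not start with a blocked prefix."""
    if not src:
        # get_attribute("src") can return None when the attribute is missing.
        return False
    lowered = src.lower()
    return lowered.startswith("http") and not lowered.startswith(BLOCKED_PREFIXES)


# Made-up sample values standing in for the "src" attributes found on the page.
sample_srcs = [
    "https://encrypted-tbn0.gstatic.com/images?q=abc",
    "https://example.com/cat.jpg",
    "data:image/jpeg;base64,AAAA",
    None,
]
kept = [src for src in sample_srcs if is_downloadable(src)]
print(kept)
```

Keeping the prefixes in a tuple means more hosts can be skipped later without touching the condition, because `str.startswith` accepts a tuple of prefixes.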