Skip images whose URLs start with "https://encrypted-tbn0.gstatic.com/" because we are not able to download them
Closed this issue · 0 comments
GimChuang commented
Hi, thanks for this helpful tool and the YouTube tutorial!
I am experiencing the same issue as #4.
It seems that we can't download images whose URLs start with "https://encrypted-tbn0.gstatic.com/".
(It will say they are downloaded, but they aren't.)
So sometimes we request, say, 1000 images, but only 5 are downloaded.
I found this and modified the code in GoogleImageScraper.py, lines 103-104.
It now skips the images we are not able to download, which suits my needs.
    for image in images:
        # Only keep images whose src starts with "http", skipping the
        # encrypted-tbn0.gstatic.com thumbnails that cannot be downloaded.
        src = image.get_attribute("src")
        if src and src[:4].lower() == "http" and not src.startswith("https://encrypted-tbn0.gstatic.com/"):
            print("[INFO] %d. %s" % (count, src))
            image_urls.append(src)
            count += 1
            break
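
In case it helps anyone who wants to try the filter without running Selenium, here is a minimal standalone sketch of the same check. The names `BLOCKED_PREFIX` and `is_downloadable` and the sample URLs are my own placeholders for illustration; they are not part of GoogleImageScraper.py.

```python
# Hypothetical helper for illustration; not part of GoogleImageScraper.py.
BLOCKED_PREFIX = "https://encrypted-tbn0.gstatic.com/"


def is_downloadable(src):
    """Return True for http(s) URLs that do not start with BLOCKED_PREFIX."""
    if not src:
        # An element without a src attribute yields None; treat it as not usable.
        return False
    src_lower = src.lower()
    return src_lower.startswith("http") and not src_lower.startswith(BLOCKED_PREFIX)


# Made-up sample values standing in for image.get_attribute("src") results.
candidates = [
    "https://example.com/cat.jpg",
    "https://encrypted-tbn0.gstatic.com/images?q=tbn:abc",
    "data:image/jpeg;base64,AAAA",
    None,
]

kept = [url for url in candidates if is_downloadable(url)]
print(kept)
```

Pulling the check into one function keeps the loop body short and makes it easy to add more prefixes later if other thumbnail hosts turn out to behave the same way.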