mxrch/GHunt

Error while retrieving photos in wait.until(element_has_substring_or_substring((By.XPATH, "//body")

mahdi1234 opened this issue · 20 comments

With b1a2a56

Google Photos : XXX
Traceback (most recent call last):              
  File "hunt.py", line 114, in <module>
    gpics(gaiaID, client, cookies, cfg)
  File "/GHunt/lib/photos.py", line 97, in gpics
    out = get_source(gaiaID, client, cookies, cfg)
  File "/GHunt/lib/photos.py", line 82, in get_source
    result = wait.until(element_has_substring_or_substring((By.XPATH, "//body"), photos_trigger, no_photos_trigger))
  File "/home/xxx/.local/lib/python3.7/site-packages/selenium/webdriver/support/wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: 

mxrch commented

I'll fix it, but first I want to know: how is your internet connection? Good or not? How many MB/s? I think a 30-second timeout should be more than enough.

@mxrch I have a 100 Mbit cable connection with no internet issues. I set the timeout to 300 s and it still fails. Could it be due to a large photo gallery?

Data comes in at only 400 KiB/s (while reviews come in at 3.5 MiB/s).

[Screenshot: bmon]

mxrch commented

@mahdi1234 no, it can't be that: if you have albums, it will break out of the wait.
Either it doesn't click the "back" button properly, or it loads a language other than English. Otherwise, I don't know.
Can you add these lines:

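# Dump the page source to a file (closed automatically) for offline inspection.
with open('debug.html', 'w', encoding='utf-8') as f:
    f.write(repr(driver.page_source))
print("----START DEBUG ---")
print(driver.find_elements(By.XPATH, "//button"))
print("----END DEBUG ---")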

between these two lines, around line 63 of lib/photos.py:

    driver.get(f"https://get.google.com/albumarchive/{gaiaID}/albums/profile-photos?hl=en")
    [HERE]
    tmprinter.out('Fetching the Google Photos albums overview...')

Then upload the debug.html file somewhere and share the link, or put it in a pastebin.

Same error here.
Python: 3.8.5
Chrome: Version 83.0.4103.116 (Developer Build) built on Debian bullseye/sid, running on Debian kali-rolling (64-bit)
GHunt version: Latest (ea00a24)

The only difference is mine says the error happened on line 86, but it's probably bc of the latest commits:

File "/usr/lib/python3/dist-packages/selenium/webdriver/support/wait.py", line 86, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:

It only happens on target accounts for which I get the "Got the albums overview !" message.

mxrch commented

I don't understand, @mahdi1234: your debug.html contains one of the two "triggers" used to detect that the page has finished loading. The string "reached the end" is in the body of your file, so it should not keep waiting.
But I have a hypothesis: maybe EC.text_to_be_present_in_element requires the text to be absent from the element at the moment this code is executed.
I left this condition in to be safe, but I don't think it's necessary, since the page is already loaded once "Album Archive" is in the page. I'll try to push a fix in another branch so you can test.
(Sorry if I'm slow with the issues, pull requests, and fixes. It's the first time I've had to manage a "popular" project, and I'm learning.)
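
For readers following the hypothesis above, here is a minimal sketch of a "either trigger is present" wait condition of the kind named in the traceback. It is not the code from lib/photos.py: the class name `body_contains_any` and the `FakeTextDriver` stand-in are hypothetical, and the only assumption about Selenium is its convention that a wait condition is a callable that receives the driver and returns a falsy value to keep polling.

```python
class body_contains_any:
    """Wait condition: truthy once the located element's text contains a trigger.

    Hypothetical name, written in the callable style that Selenium wait
    conditions use (called with the driver, falsy result means "keep polling").
    """

    def __init__(self, locator, *triggers):
        self.locator = locator      # e.g. (By.XPATH, "//body")
        self.triggers = triggers    # substrings signalling the page is ready

    def __call__(self, driver):
        # The text is re-read on every call, so a trigger that is already
        # present on the very first call satisfies the condition at once.
        text = driver.find_element(*self.locator).text
        for trigger in self.triggers:
            if trigger in text:
                return trigger      # truthy: the wait can stop
        return False                # falsy: the wait polls again


class FakeElement:
    """Stand-in for a WebElement; only the .text attribute is modelled."""

    def __init__(self, text):
        self.text = text


class FakeTextDriver:
    """Hypothetical stand-in for a WebDriver, used to try the condition offline."""

    def __init__(self, body_text):
        self._body_text = body_text

    def find_element(self, by, value):
        return FakeElement(self._body_text)
```

With a real driver this would be passed to the wait, for example `WebDriverWait(driver, 30).until(body_contains_any((By.XPATH, "//body"), photos_trigger, no_photos_trigger))`. In this sketch, a page that already contains "reached the end" makes the first call return that string, so a condition written this way does not need the text to be absent beforehand.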

mxrch commented

@PinkDev1 @mahdi1234 I pushed a potential fix to the i10 branch (35173d0). Can you check it out and tell me if it fixes the issue?

Welp, it doesn't crash anymore, so I guess it's a fix!

Google Photos : https://get.google.com/albumarchive/<REDACTED>
=> Couldn't fetch the public photos.

I've noted that you can view the albumarchive, but only when logged in with the target account. From any other account, it'll just throw a 404. I'll add a disclaimer for this, since even while having the target account's password this tool can provide plenty of automated insight.

I'd also like to note that the cookies didn't change after I logged in to my spy account on another browser, outside my VM but on the same network. There's two possibilities:

  • RNG magic (extremely unlikely)
  • They didn't change bc I logged in from the same IP

So I'll add a disclaimer for this as well. Wait for my PR

mxrch commented

@PinkDev1 yes it happens sometimes, it's not happening with my emails list, but Sector035 had the same problem with his account.
I don't think it's an RNG or something to do with the logged account, but with a confidentiality setting you have set, or something else obscure..
I'll add a disclaimer about it ! Unfortunately we have to deal with it while we use this bypass, but at least it works with the great majority (I think)

mxrch commented

Don't hesitate to contact me on Twitter with the email address you are using GHunt with. I doubt I can debug it, but you never know.

...I don't think it's an RNG or something to do with the logged account, but with a confidentiality setting you have set, or something else obscure..

That may be the case. I've tweaked a lot with the settings of my google accounts

I've noted that you can view the albumarchive, but only when logged in with the target account. From any other account, it'll just throw a 404.

Update on this: It only happens when visiting the link directly, on a browser while logged-in with the target account. But on GHunt, even when authenticated with the target's google cookies, it says Couldn't fetch the public photos. ... I'll need to do further testing

I just ran the tool and I'm getting the same crash .

I just ran the tool and I'm getting the same crash .

Well that's weird. Are you sure you downloaded the correct branch?
If unsure, delete your current GHunt and redownload with git clone -b i10 https://github.com/mxrch/GHunt.git GHunt-i10

mxrch commented

@PinkDev1 I already merged the fix into the master branch!

mxrch commented

@adik69 can you show the output ?

mxrch commented

I've noted that you can view the albumarchive, but only when logged in with the target account. From any other account, it'll just throw a 404.

Update on this: It only happens when visiting the link directly, on a browser while logged-in with the target account. But on GHunt, even when authenticated with the target's google cookies, it says Couldn't fetch the public photos. ... I'll need to do further testing

@PinkDev1 Can I have the Google ID you are trying with ? Don't worry I can't get the email from this

UPDATE: I think I found the culprit:

On account A I have tweaked my settings to harden my google account as much as possible. This account has almost all personalization disabled, and uses very few google services.
When I visited account A's albumarchive in the browser, while signed in as account A, I was able to see its photo album, but no option to share it with anyone (more on this later).
GHunt said:

Unable to fetch connected Google services.
...
Google Photos : https://get.google.com/albumarchive/<REDACTED>
=> Couldn't fetch the public photos.

However, on account B the settings are pretty much default, and it's prob 8~ years old. This account uses many google services and has linked many accounts.
When I visited account B's albumarchive in the browser, while signed in as account B, I was able to see its photo album, and an option to share it with other users:

[Screenshot: notice]

Clicking on "More information" leads to:
[Screenshot: showyourphotos]
"More Information" link

GHunt said:

Activated Google services :
- Photos

Google Photos : https://get.google.com/albumarchive/<REDACTED>
=> Couldn't fetch the public photos.

It seems GHunt needs this setting to be turned on. I assume this setting has been turned on for accounts that have used Picasa, Picasa Web Albums, or the Picasa Web Albums API. My theory is that the images stored on Picasa were archived in the user's Google Photos, and the account was then automatically linked to Picasa in order to keep the service working. Therefore, GHunt will succeed in scraping these accounts' albumarchive.

It's probably extremely uncommon for a user to manually go (and somehow find) their albumarchive, and turn on that setting.

See: https://support.google.com/picasa/answer/6383491

I haven't tested this myself bc AFAIK, there's no option to turn it off, and I don't want to expose my personal accounts.
Can you verify if my theory is true?

PS: No, I won't give you any google ID, I'm way too paranoid 😂

@mxrch I confirm it fixed the issue on two accounts I had it.

Well, this issue was fixed on 35173d0, and I've already opened a PR to add a Disclaimer to the README.
@mxrch this issue can finally be closed :)