imagefap crash
Closed this issue · 6 comments
Traceback (most recent call last):
  File "gallery_get.py", line 351, in run
    self.run_internal()
  File "gallery_get.py", line 340, in run_internal
    info.path = safe_url(info.redirect, info.path)
  File "gallery_get.py", line 77, in safe_url
    if not link.lower().startswith("http"):
AttributeError: 'int' object has no attribute 'lower'
Using params: ['gallery_get.py']
This is safe_url():
def safe_url(parent, link):
    print("link: %s" % str(link)) # debug
    print("parent: %s" % str(parent)) # debug
    if not link.lower().startswith("http"):
        uri=urlparse(parent)
        root = '{uri.scheme}://{uri.netloc}/'.format(uri=uri)
        if link.startswith("//"):
            link = "%s:%s" % (uri.scheme, link)
        elif link.startswith("/") or root.strip('/').lower() == parent.strip('/').lower():
            link = root + link
        else:
            link = os.path.dirname(parent) + "/" + link
    return link.replace("&amp;","&")
How do I modify this to test whether 'link' is a string? The value passed is 60, which I assume is an integer rather than a string.
Also, it dies repeatably on one image (image 130; see below). Is there a place I can catch fatal errors and just move on to the next image?
Skipping existing file: https://cdn.imagefap.com/images/full/54/286/286565791.jpg?end=1587922289&secure=08d27343eb208234232de
link: https://cdn.imagefap.com/images/full/54/165/1653609634.jpg?end=1587922292&secure=009b663d146a5cf4c218e
parent: http://www.imagefap.com/photo/1653609634/?pgid=&gid=5242434&page=5&idx=128
Skipping existing file: https://cdn.imagefap.com/images/full/54/165/1653609634.jpg?end=1587922292&secure=009b663d146a5cf4c218e
link: https://cdn.imagefap.com/images/full/54/186/1862544636.jpg?end=1587922293&secure=002d1e42301a17af16f7b
parent: http://www.imagefap.com/photo/1862544636/?pgid=&gid=5242434&page=5&idx=129
Skipping existing file: https://cdn.imagefap.com/images/full/54/186/1862544636.jpg?end=1587922293&secure=002d1e42301a17af16f7b
link: 60
parent: http://www.imagefap.com/photo/30267529/?pgid=&gid=5242434&page=5&idx=130
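For anyone landing here with the same question: one way to test whether 'link' is a string is an isinstance() check at the top of safe_url(). The sketch below is only an illustration, not the fix that was committed. It assumes Python 3, it assumes that returning an empty string is an acceptable way to say "nothing to download" (the caller may need something else), and it assumes the final replace() is meant to unescape the HTML entity &amp;.

```python
import os
from urllib.parse import urlparse  # Python 3 location of urlparse


def safe_url(parent, link):
    # Guard: the plugin can hand us a non-string (e.g. the int 60).
    # Returning "" here is an assumption of this sketch, not project behavior.
    if not isinstance(link, str):
        return ""
    if not link.lower().startswith("http"):
        uri = urlparse(parent)
        root = '{uri.scheme}://{uri.netloc}/'.format(uri=uri)
        if link.startswith("//"):
            # Protocol-relative link: borrow the scheme from the parent page.
            link = "%s:%s" % (uri.scheme, link)
        elif link.startswith("/") or root.strip('/').lower() == parent.strip('/').lower():
            link = root + link
        else:
            # Relative link: resolve against the parent's directory.
            link = os.path.dirname(parent) + "/" + link
    return link.replace("&amp;", "&")
```

With this guard, the call that crashed (link = 60) returns an empty string instead of raising AttributeError, so the caller can decide whether to skip that image.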
Hi, can you try my changes (see above commit) and let me know if that works?
I downloaded the latest GG and ran it with python3 on CentOS, and I seem to be missing a library:
Traceback (most recent call last):
  File "gallery_get.py", line 451, in run_wrapped
    root = GalleryGet(myurl, dest or DEST_ROOT, titleAsFolder, allowGenericPlugin).run()
  File "gallery_get.py", line 442, in run
    return self.queue_jobs(page, root, subtitle)
  File "gallery_get.py", line 382, in queue_jobs
    link = safe_url(self.url, link)
  File "gallery_get.py", line 72, in safe_url
    if not (isinstance(link, unicode) or isinstance(link, str)):
NameError: name 'unicode' is not defined
Using params: ['https://www.imagefap.com/pictures/5242434/Amateur-set-337-347', '/home/user8/sets', False]
And if I use Python 2, it doesn't work (I routinely use 3, so I don't know if this is new behavior):
$ python gallery_get.py
Input URL: https://www.imagefap.com/pictures/5242434/Amateur-set-337-347
Destination (/home/user8/sets):
Using imagefap plugin...
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=0
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=1
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=2
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=3
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=4
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=5
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=6
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=7
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=8
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=9
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=10
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=11
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=12
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=13
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=14
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=15
Crawling http://www.imagefap.com/pictures/5242434/Amateur-set-337-347?page=16
ERROR: Failed to copy https://cdn.imagefap.com/images/full/54/155/1552461067.jpg?end=1588106699&secure=059d6fa8e743da599e958
ERROR: Failed to copy https://cdn.imagefap.com/images/full/54/137/1370300173.jpg?end=1588106699&secure=00bc34f55d8a5e1df8a04
ERROR: Failed to copy https://cdn.imagefap.com/images/full/54/193/193540673.jpg?end=1588106701&secure=0d593868eacfa38ad76d7
ERROR: Failed to copy https://cdn.imagefap.com/images/full/54/698/698406796.jpg?end=1588106701&secure=0a3fc97bba854e59b70ef
ERROR: Failed to copy https://cdn.imagefap.com/images/full/54/186/1868846219.jpg?end=1588106703&secure=06e483e5b0cb7b6d55b1b
ERROR: Failed to copy https://cdn.imagefap.com/images/full/54/104/1048211316.jpg?end=1588106703&secure=0f2721304c360f6985488
ERROR: Failed to copy https://cdn.imagefap.com/images/full/54/143/1431985785.jpg?end=1588106704&secure=08dd9b391c5e5c3286f42
ERROR: Failed to copy https://cdn.imagefap.com/images/full/54/172/1721515185.jpg?end=1588106705&secure=0aaab830b637608ab2d2d
ERROR: Failed to copy https://cdn.imagefap.com/images/full/54/153/1536504824.jpg?end=1588106706&secure=0b9f658556a8e247ca36a
ERROR: Failed to copy https://cdn.imagefap.com/images/full/54/126/1260285408.jpg?end=1588106706&secure=01cc1601291842d4241e7
Actually, it looks like the Python 3 issue is that str is now always Unicode and the separate unicode type no longer exists (or something), so I changed line 72 to: if not (isinstance(link, str) or isinstance(link, str)):
(I guess I could have removed the redundancy, but I don't actually know what you were getting at.) Now, using Python 3, I get an error like the earlier one, this time about a missing replace method:
Traceback (most recent call last):
  File "gallery_get.py", line 336, in run
    self.run_internal()
  File "gallery_get.py", line 322, in run_internal
    self.process_redirect_page(info, response)
  File "gallery_get.py", line 298, in process_redirect_page
    (info.path,info.subtitle) = safe_unpack(jpegs[0],info.subtitle)
  File "gallery_get.py", line 67, in safe_unpack
    return (obj[0],safe_str(obj[1]))
  File "gallery_get.py", line 58, in safe_str
    name = name.replace(":",";") # to preserve emoticons
AttributeError: 'int' object has no attribute 'replace'
Using params: ['gallery_get.py']
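On the unicode NameError: a string check that runs on both Python 2 and Python 3 can be written by choosing the accepted types once, based on the interpreter version. This is a rough sketch under assumptions. A helper called is_str() is used in safe_unpack(), but its real body isn't shown in this thread, so the version here is a guess at what such a helper could look like, and STRING_TYPES is a name made up for the example.

```python
import sys

# Python 2 has two text types (str and unicode); Python 3 has only str.
# STRING_TYPES is a hypothetical name introduced for this sketch.
if sys.version_info[0] >= 3:
    STRING_TYPES = (str,)
else:
    STRING_TYPES = (str, unicode)  # noqa: F821 (only evaluated on Python 2)


def is_str(obj):
    # isinstance() accepts a tuple of types, so one call covers both cases.
    return isinstance(obj, STRING_TYPES)
```

Line 72 could then read `if not is_str(link):` on either interpreter, without the duplicated isinstance(link, str) test.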
Ok, I don't know if this is super robust, but I can get GG to skip a file by wrapping safe_unpack() in a try .. except block. I gather that I should be catching specific errors, in which case you'd want to change except: to except AttributeError:
This seems to also close issues like #63 (i.e., with this fix, I was able to DL that URL).
This is all silent. I'm not told that the file is being skipped... I'd like to be told, but I don't see how I'd add that. Maybe in the except block?
def safe_unpack(obj, default):
    if is_str(obj):
        return (obj,safe_str(default))
    elif obj:
        try:
            return (obj[0],safe_str(obj[1]))
        except:
            return ("","")
    else:
        return ("","")
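On being told when a file is skipped: the except block is a reasonable place for that. The sketch below narrows the bare except to AttributeError and writes a short message to stderr before returning the empty pair. It is an illustration only. The is_str() and safe_str() bodies are simplified stand-ins (the real ones aren't shown in this thread), it assumes Python 3, and the message wording is made up.

```python
import sys


def is_str(obj):
    # Stand-in for the project's helper (Python 3 only in this sketch).
    return isinstance(obj, str)


def safe_str(name):
    # Simplified stand-in: only the replace() call visible in the traceback.
    return name.replace(":", ";")  # to preserve emoticons


def safe_unpack(obj, default):
    if is_str(obj):
        return (obj, safe_str(default))
    elif obj:
        try:
            return (obj[0], safe_str(obj[1]))
        except AttributeError as exc:
            # Report what is being skipped and why, then carry on.
            sys.stderr.write("Skipping %r: %s\n" % (obj, exc))
            return ("", "")
    else:
        return ("", "")
```

Catching only AttributeError means other failures (for example a TypeError from indexing a bare int) would still surface, which may or may not be what you want.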
Thanks for digging into this! After looking into it some more, I've taken your suggested try...except AttributeError approach, but strictly within safe_url(). And yes, this should resolve issue #63 as well.
Would you mind trying the latest changes?
It's been a week and I'm fairly confident this is working now, so I'm closing the issue. Feel free to reopen if needed!