fix x3vid filenames (remove colons) because Windows is fragile
GG gets images from x3vid.com "full image" pages, but the filenames I see on Windows are goofy because (I think) they have colons in them. [I don't know if this affects all filenames on x3vid. FWIW, the URL is: https://x3vid.com/gallery_pics/3424569/Public_nudity_35?page=1]
The HTML snippet is below. GG saves the image under the name the website uses, as-is, but Windows displays the file as 'HQEQHE~F.JPG'. OTOH, Chrome silently removes the colon, and then the filename is sensible. Any clues how I could hack GG to silently remove the colons too?
    <a href="/i42727683/Public_nudity_35?page=1&source=gallery">
        <figure>
            <img id="42727683" data-p="1" class="img-box thumb" alt="Public nudity 35 (1/10)" src="/images/14242/https:__ep5.xhcdn.com_000_146_605_573_1000.jpg" />
        </figure>
    </a>
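To make the source of the colon concrete, here is a small sketch that takes the last path segment of the `src` attribute above. It assumes (this is not confirmed from GG's code) that the saved filename is derived from the URL that way.

```python
import posixpath
from urllib.parse import urlsplit

# src attribute copied from the <img> tag above
src = "/images/14242/https:__ep5.xhcdn.com_000_146_605_573_1000.jpg"

# Assumption: the saved name is the last segment of the URL path.
name = posixpath.basename(urlsplit(src).path)

print(name)         # https:__ep5.xhcdn.com_000_146_605_573_1000.jpg
print(":" in name)  # True
```

The site appears to embed the original image URL in the name, with the slashes replaced by underscores but the colon after `https` left in place.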
This one-line change (below, find the line marked "added THIS LINE") seems to be working. My first attempt to modify the write_to_file() method caused copy_image() to crash. I'm still not sure why.
    def copy_image(self, info):
        info.attempts += 1
        file_name = info.destination_filename()
        file_name = re.sub(r"[:]", "", file_name)  # added THIS LINE
        try:
            file_info = urlopen_safe(info.path)
        except:
            return False
        try:
            modtimestr = file_info.headers['last-modified']
            modtime = time.strptime(modtimestr, '%a, %d %b %Y %H:%M:%S %Z')
        except:
            modtime = None
        if self.can_skip(file_name, file_info):
            print("Skipping existing file: " + info.path)
            return True
        if info.attempts == 1:
            print("%s -> %s" % (info.path, file_name))
        if not info.write_to_file(file_info, file_name):
            return False
        if modtime is not None:
            lastmod = calendar.timegm(modtime)
            os.utime(file_name, (lastmod, lastmod))
        return os.path.getsize(file_name) > 4096
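A slightly more general take on the same idea, as a sketch only: the helper name `sanitize_filename` is hypothetical and not part of GG. It removes every character Windows rejects in file names, and it touches only the last path component. That second point is a precaution: whether `destination_filename()` ever returns an absolute path is an assumption here, but if it did, stripping every colon from the whole string would also clobber a drive prefix such as `C:`.

```python
import os
import re

# Characters Windows does not allow in file names:
# < > : " / \ | ? * and the control characters 0-31.
_WINDOWS_ILLEGAL = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_filename(path):
    """Strip Windows-illegal characters from the last path component only.

    Hypothetical helper, not part of GG. It does not handle reserved
    device names (CON, NUL, ...) or trailing dots and spaces.
    """
    directory, base = os.path.split(path)
    base = _WINDOWS_ILLEGAL.sub("", base)
    return os.path.join(directory, base)


print(sanitize_filename(os.path.join("gallery", "https:__ep5.xhcdn.com_1000.jpg")))
```

With this in place, the added line in `copy_image()` could call `sanitize_filename(file_name)` instead of `re.sub(...)`.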
If x3vid.com doesn't use filenames worth preserving, I would recommend this instead:
- go to the gallery_plugins directory, make a copy of plugins_generic.py and call it plugins_x3vid.py (keep it in that same folder)
- change the last line in plugins_x3vid.py to:
      same_filename = False
Now when you re-run, it should say "Using x3vid plugin", and you should get filenames that look like 001.jpg, 002.jpg, etc.
If you're happy with the result, feel free to open a pull request with your addition of the plugin!
This is a good suggestion, but I have a ticket open about how limited that functionality is: on some sites it recycles the numbers across pages. (I posted a patch that fixes this, but it's probably kludgy; for example, if there are two threads, the numbering starts from 3 (0003, 0004, ...) and I don't know why.)
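For what it's worth, one generic way to keep numbers from being reused across pages is a single counter shared by the whole gallery rather than one per page, guarded by a lock so that download threads never receive the same number. This is a sketch under that assumption only; `SequentialNamer` is a hypothetical name, and how GG actually assigns numbers is not shown in this thread.

```python
import itertools
import threading


class SequentialNamer:
    """Hands out 0001.jpg, 0002.jpg, ... once each, even across threads.

    Hypothetical helper, not part of GG.
    """

    def __init__(self, start=1, width=4):
        self._counter = itertools.count(start)
        self._lock = threading.Lock()
        self._width = width

    def next_name(self, extension=".jpg"):
        # Take the number under the lock so no two threads can share it.
        with self._lock:
            number = next(self._counter)
        return "%0*d%s" % (self._width, number, extension)


namer = SequentialNamer()
print(namer.next_name())  # 0001.jpg
print(namer.next_name())  # 0002.jpg
```

Creating one such object per gallery (not per page or per thread) is what keeps the sequence from restarting.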
I was looking for a way to just remove the characters that Windows considers illegal in filenames, and I wish it were easier to add things to the plugins, so that I could put remove_colons = True in a plug-in and then add code to GG to handle that special feature of a website.
A significant problem is that I can code in other languages, but I'm almost completely ignorant of Python 3...
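The per-plugin switch described above could be sketched roughly like this. Everything here is an assumption for illustration: the `remove_colons` attribute and the `apply_plugin_filename_rules` function are hypothetical, and the `SimpleNamespace` objects merely stand in for plugin modules such as plugins_x3vid.py. Reading the flag with `getattr` and a default means existing plugins would not need to change.

```python
import types


def apply_plugin_filename_rules(plugin, file_name):
    """Apply optional filename tweaks that a plugin opts into.

    Hypothetical helper, not part of GG.
    """
    # Plugins that never define remove_colons fall back to False.
    if getattr(plugin, "remove_colons", False):
        file_name = file_name.replace(":", "")
    return file_name


# Stand-ins for plugin modules (assumption: plugins are plain Python modules).
x3vid_plugin = types.SimpleNamespace(remove_colons=True)
generic_plugin = types.SimpleNamespace()

print(apply_plugin_filename_rules(x3vid_plugin, "https:__ep5.jpg"))    # https__ep5.jpg
print(apply_plugin_filename_rules(generic_plugin, "https:__ep5.jpg"))  # https:__ep5.jpg
```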