Finding Epub Images
Roznoshchik opened this issue · 0 comments
Epubs are not very consistent in how they organize their internal file structure.
I haven't figured out a reliable way of finding the image folder. This is what I have so far:
    import os

    # `soup` is assumed to be the parsed chapter and `path` the root of the extracted epub.
    images = soup.find_all('img')
    if images:
        for img in images:
            img["loading"] = "lazy"
            # Try each known layout in turn until a file is found.
            filename = img['src'].replace("../", path + "/")
            if not os.path.exists(filename):
                filename = f"{path}/{img['src']}"
            if not os.path.exists(filename):
                filename = f"{path}/EPUB/media/{img['src']}"
            if not os.path.exists(filename):
                filename = f"{path}/EPUB/images/{img['src']}"
            if not os.path.exists(filename):
                filename = img['src'].replace("../", path + "/OEBPS/")
Whenever I encounter an epub whose images don't load, I have to open it, look at the folder structure, and manually add another branch for that path.
I'm sure there's a better way to search the epub and locate the image folder itself, one that would work for file paths I haven't come across yet.
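
One possible direction, offered as a rough sketch rather than a tested fix: an `src` inside an epub is normally relative to the XHTML file that contains it, so resolving it against that file's own folder should cover most layouts without hardcoding `EPUB/` or `OEBPS/`. If that fails, a search of the extracted tree for a file with the same name can serve as a fallback. The names `resolve_image`, `doc_path`, and `epub_root` are hypothetical, and the sketch assumes the epub has already been unzipped to disk, as the branching code above seems to imply.

```python
import os
from urllib.parse import unquote, urlsplit


def resolve_image(src, doc_path, epub_root):
    """Return the on-disk path for an <img> src, or None if nothing matches.

    doc_path  -- path of the XHTML file the <img> tag came from (assumed known)
    epub_root -- folder the epub was extracted into (assumed)
    """
    parts = urlsplit(src)
    if parts.scheme:
        # Remote or data: URLs do not live inside the extracted epub.
        return None
    # Drop any query string or fragment and decode %20-style escapes.
    rel = unquote(parts.path)

    # 1. Resolve the src relative to the folder of the containing document.
    candidate = os.path.normpath(os.path.join(os.path.dirname(doc_path), rel))
    if os.path.isfile(candidate):
        return candidate

    # 2. Fallback: take the first file with the same name anywhere in the epub.
    wanted = os.path.basename(rel)
    for dirpath, _dirnames, filenames in os.walk(epub_root):
        if wanted in filenames:
            return os.path.join(dirpath, wanted)
    return None
```

Inside the existing loop this would replace the chain of `os.path.exists` checks with something like `filename = resolve_image(img['src'], xhtml_path, path)`, where `xhtml_path` stands for wherever the current chapter file was read from. The name-based fallback can pick the wrong file if two folders contain images with the same name. A stricter alternative would be to read `META-INF/container.xml`, follow it to the OPF package file, and use the manifest there, which lists every resource in the book along with its media type.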