alexadam/save-as-ebook

Capturing unneeded elements

Opened this issue · 4 comments

So far this plugin usually captures some unneeded elements for me. For example, it works terribly with scribblehub. If there is even one comment in the comment section, the main content is completely ignored and only the comments are captured. When that happens, the EPUB starts with the string "Error: Parse Error:".

Even when the main content is captured, some unneeded elements are captured both before and after the necessary content. It would be nice if we could specify which elements are captured and which are not, preferably with an XPath expression for the needed elements.
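To illustrate the kind of filtering I mean, here is a minimal sketch. It is not how the extension works: it uses Python's standard `xml.etree.ElementTree`, which only supports a subset of XPath and needs well-formed markup, and the `select_by_xpath` helper, the sample page, and its `id` values are all made up for the example.

```python
import xml.etree.ElementTree as ET


def select_by_xpath(xhtml: str, xpath: str) -> list[str]:
    """Return the stripped text of every element matching `xpath`.

    Hypothetical helper: relies on ElementTree's limited XPath subset,
    so the input must be well-formed XHTML.
    """
    root = ET.fromstring(xhtml)
    return ["".join(el.itertext()).strip() for el in root.findall(xpath)]


# Made-up page: a header, the wanted chapter, and a comment section.
PAGE = """<html><body>
<div id="header">Site navigation</div>
<div id="chapter"><p>First paragraph.</p><p>Second paragraph.</p></div>
<div id="comments"><p>Nice chapter!</p></div>
</body></html>"""

# Keep only the paragraphs inside the chapter container.
chapter = select_by_xpath(PAGE, ".//div[@id='chapter']/p")
print(chapter)
```

With an expression like this, the header and the comments would simply never be matched, so they would not end up in the ebook.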

I can't reproduce it. Please send the link that's causing problems.

Screenshot 2020-11-12 at 17 26 49

OK, it seems the bug only exists in the Firefox extension. Chrome produced the EPUB as it's supposed to.

A way to actually select the HTML element to capture would be nice. It would be great in cases where a single HTML element spans multiple webpages, so that it's not possible to select all the text at once. Go to 24symbols.com and try a free book, for example. The save page option can capture a chapter almost perfectly, save for an unwanted footer at the end of each chapter (which is still great, because it's actually inside an iframe and the footers can be removed easily afterwards). But the save selection method fails spectacularly in this case (the chapter spans multiple pages even though the entire chapter gets loaded in each page).

On a side note, I'd be really grateful if you could answer this question: how is 24symbols preventing us from accessing the page source of the book pages? (What it gives is a completely different page source.)

OK, here is a webpage I saved from 24symbols (with the SingleFile plugin):

Aftershock - A Stone Braide Chronicles Story by Bonnie S. Calhoun - Read book online (11_24_2020 12_25_01 PM).zip

The book was just something I used as a guinea pig; I still have no idea what it's about!
The page source can be viewed from this file, which is not the case when I try it directly on the site.
The entire chapter is there in the page source, but only part of it is visible on the webpage, so it's impossible to select it all with the Save Selection option.