Sigil-Ebook/PageEdit

A few notes about pre-beta PageEdit

BeckyDTP opened this issue · 9 comments

  1. Custom CSS file (custom_webview_style.css) not work.

  2. In Preferences is "GeneralSettings", should be "General Settings"

  3. PageEdit inserts the nbsp entity unnecessarily after manually formatting (<b>, <i>, <u>, <strike>, <sub>, <sup>).
    Try write this sentence in PageEdit:
    Sigil is [Ctrl-I]the best[Ctrl-I] EPUB file editor.
    Save and look at Code View.
    Before EPUB is nbsp:
    <p>Sigil is <i>the best</i>_EPUB file editor.</p>

  4. PageEdit inserts is="http://www.w3.org/1999/xhtml" for headings (h1-h6). Is it normal?
    <h1 is="http://www.w3.org/1999/xhtml">TEST</h1>

  5. Try this:

  • After last paragraph insert:
    <hr class="sigil_split_marker"/>
  • Write something AFTER this <hr>
  • Save and look at Sigil --> the text you enter will be without any tags.
  • Back to PageEdit
  • Insert Enter at the end of the text and write something again.
  • Save and look at Sigil --> now a new text will be in <div>'s
  1. <div>'s are often inserted: for example after bulleted list, sometimes even randomly.
    Step by step:
    Write (don't copy/paste):
    aaa[Enter]
    [press bulleted list icon]xxx[Enter]
    yyy[Enter]
    zzz[Enter][Enter]
    bbb

Result:
<p>aaa</p>
<p></p>
<ul><li>xxx</li>
<li>yyy</li>
<li>zzz</li>
</ul>
<div>bbb</div>
<p></p>

Empty <p></p> and <div>'s I can easily remove in code (manually or by plugin), but text without any tags is a problem.

I assume that most of these problems do not concern PageEdit itself, but rather QtWebEngine shortcomings.

  1. Needs to be in PageEdit'ts preference directory not Sigil's and this did work at one time.
  2. Will fix that for the full release, okay as is for beta
  3. This in just how WebEngine Works. No special code does this in PageEdit
  4. Will modify the call to gumbo to remove those added attributes.
  5. Yes the split marker can be inserted into markup by he user and mess things up. Again post processing by gumbo can fix these. But bare text inside the body tag has always been possible in both WebKit and WebEngine. Real browsers do not care, only epubcheck cares.
    Typically you highlight a list of paragraphs and then insert list. Hitting enter twice in a row can and will sometimes insert a div.

This is just how live dom editing works. This is why we removed BookView in the first place. Any code generated by PageEdit should be carefully examined either using the Inspector or when back in CodeView and fixed.

Pt.1, Fix now in PageEdit master, local variable covered class variable bug.
Tested and injection now works. Also now removes any injected css link when saving file

Pt.2, Fixed now in master

Pt. 4, Added code to remove any of these added "is" attributes on h1-h6 tags before saving file

FWIW, both QtWebKit and QtWebEngine make a lot of use of nbsp since they are not collapsed like other whitespace. If you try you can hit the spacebar many times in a row, and PageEdit and BookView will use nbsp to prevent the whitespace from being collapsed.

It is easy to see and check this using the web inspector (hit the inspect element tool once the Inspector has come up, and then click on the word/element you are interested in PageView to see the actual html used. You can also use Inspector to edit any text in html or view mode to change nbsp to something else.

The key here is that nbsp are crucial to wysiwyg editing as they are used to format text when no css is available to do what you want. This is not something we have any control over at all.

Same for swapping div vs p tags.

The one thing we can do is look for bare text as a direct child of the body tag and wrap them in p tags when serializing the contents of PageEdit.

We could add a routine that before saving the text to a file replaces all nbsp with normal spaces but this could actually hurt pre-existing nbsp that were present for a reason.

And of course we could remove empty p or div tags in post processing as well.

Okay, I spent a lot of time last night studying this and found out that we can control how WebEngine (and many other browsers) decide to insert nbsp when contenteditable is true. According to the web spec on the official browser editing api, the css property white-space: pre-wrap should be set on whatever elements we want to edit.

In our case that would be everything but the html, head, and body tags. The reason you do not apply this to the body tag is that it then forces double-spacing.

So in your custom_webview_style.css please add the following:

:not(html):not(head):not(body) { white-space: pre-wrap; }

and then try the tests you have given above. You should not find nbsp anymore (unless they were in the page to begin with) as the editing software in WebEngine can now use normal spaces even for strings of blanks.

If you can confirm this "fixes" the bulk of the added by Qt nbsp, we can inject this stylesheet into head and then remove it when we save the file so things just work for most users.

In my testing this works quite well. So I went ahead and pushed this "fix" to master as well as the use of prettyprint to clean up the text before the file is saved.

If this causes any issues after the beta, we can make this a user configurable option, but from what I can tell, this prevents the injections by QtWebEngine of all the nbsp every place.

I have new build.
I confirm, that 1, 2, 3 and 4 are fixed. Thank you.
5 and 6 – require more tests. I understand that many of the problems can be solved by gumbo, but it's probably worth waiting for user feedback during beta tests.

As we have received no real feedback from beta testers, I am going to close this issue. Feel free to create a new specific issue when you are done with your tests.