fiduswriter/diffDOM

Problem processing the following html

nexon33 opened this issue · 7 comments

When I calculate the diff using the following code snippets I run into some trouble, for some reason multiline strings are not handled accordingly. I first calculate the diff and then apply the diff on an empty div.

when I use

<div data-interchange="[/some/link, (default)],
                                [/some/other/link, (medium)]">
</div>

I get the error

DOMException: Failed to execute 'setAttribute' on 'Element': '(default)],' is not a valid attribute name. and when I use

<div data-interchange="
        [/some/link, (default)],
                                [/some/other/link, (medium)]">
</div>

I get

DOMException: Failed to execute 'setAttribute' on 'Element': '[' is not a valid attribute name.

but when I use

<div data-interchange="[/some/link, (default)], [/some/other/link, (medium)]">
</div>

then I get no error.

Here is an example using JSfiddle, as you can see the diff contains invalid actions, more specifically you can see how it detects [ as an attribute which later on causes the previously mentioned error:

{"nodeName":"div","attributes":{"[":"","(default)],":"","data-uuid":"interchange-llv1q00ub"},"childNodes":[{"nodeName":"#text","data":"\n "}]},{"nodeName":"#text","data":"\n "}]}}

https://jsfiddle.net/pbf35dov/9/

I found some other code that doesn't get diffed correctly;

 <style id="style">
         .test {
        background: url('data:image/svg+xml;utf8,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 48 48"><path d="M17.17 32.92l9.17-9.17-9.17-9.17L20 11.75l12 12-12 12z"/><path d="M0-.25h48v48H0z" fill="none"/><\/svg>') 50% 51% no-repeat;
        -webkit-background-size: 24px 24px;
        background-size: 24px 24px;
        height: 56px;
        position: relative;
        text-align: right
    }
    </style>
produces a malformed diff because of the < > characters: 

[
    {
        "action": "modifyAttribute",
        "route": [],
        "name": "id",
        "oldValue": "style",
        "newValue": "copiedstyle"
    },
    {
        "action": "modifyTextElement",
        "route": [
            0
        ],
        "oldValue": "\n         .test {\n        background: url('data:image/svg+xml;utf8,",
        "newValue": "\n\n    "
    },
    {
        "action": "removeElement",
        "route": [
            1
        ],
        "element": {
            "nodeName": "svg",
            "attributes": {
                "xmlns": "http://www.w3.org/2000/svg",
                "viewBox": "0 0 48 48"
            },
            "childNodes": [
                {
                    "nodeName": "path",
                    "attributes": {
                        "d": "M17.17 32.92l9.17-9.17-9.17-9.17L20 11.75l12 12-12 12z"
                    }
                },
                {
                    "nodeName": "path",
                    "attributes": {
                        "d": "M0-.25h48v48H0z",
                        "fill": "none"
                    }
                }
            ]
        }
    }
]

I'll make a jsFiddle tomorrow as its pretty late

Hey @nexon33 ,
do you actually need to diff those elements? In that case, why don't you just diff document.getElementById('ifr').contentDocument.body instead? Reading HTML is difficult and there will always be issues with very complex HTML even if we get this fixed. Reading from the DOM element does not have those same issues.

Hey @nexon33 ,
do you actually need to diff those elements? In that case, why don't you just diff document.getElementById('ifr').contentDocument.body instead? Reading HTML is difficult and there will always be issues with very complex HTML even if we get this fixed. Reading from the DOM element does not have those same issues.

If I diff on documentbody.body then (assuming the style tag is inside the body) i still get the same problem. I really like the library and think its awesome. I'd help with the library if I could and in fact might look into actually doing that.

@nexon33 You mean document.body?

In general, the HTML parsing issues have all been fixable until now, so this one is probably as well. It's a matter of spending the time needed on that. I am currently very busy with conferences, etc., so if you can work on it, that would be very much appreciated!

dav-q commented

Recently I encountered an error similar to this:

Environment

diff-dom : 5.0.7

Behaviour

I have two strings and I want to compare them using the following code:

dd = new diffDOM.DiffDOM()
const elementA = document.createElement('div');
const elementB = document.createElement('div');
// The following strings are returned from the API, suppose you can't control this
elementA.innerHTML = `<span>Random text</span>`;
elementB.innerHTML =  `<ul><li><span data-random-attr='{"event":"mom\'s birthday"}'>Random text</span></li></ul>`;

After creating a diff ( diff = dd.diff(elementA, elementB) ), the node looks like this:

image

Finally, after applying the diff ( dd.apply(elementA, diff) ), the error happens:

Uncaught DOMException: Failed to execute 'setAttribute' on 'Element': 'birthday"}'' is not a valid attribute name.

So, The error happens because the node created in the elementB is wrong caused by special characters.

A partial solution could be to sanitize the strings using dompurify and the error will disappear

Maybe this can help someone