jOOQ/jOOX

Match.content() doesn't return <![CDATA[ ]]> wrapper


content() ignores CDATA wrappers.

For instance:

<ALink id="URL1"><![CDATA[http://www.google.com]]></ALink>

content() will return only the text, without the wrapper:

http://www.google.com

You're right. The behaviour is inconsistent between the following two calls:

// Doesn't print the <![CDATA[ ... ]]> wrapper:
$("<ALink id=\"URL1\"><![CDATA[http://www.google.com]]></ALink>").content()

// Prints a <![CDATA[ ... ]]> wrapper
$("<ALink id=\"URL1\"><a><![CDATA[http://www.google.com]]></a></ALink>").content()

This is because of a flawed Impl.content(Element) implementation:

        // The element contains only text
        else if (!Util.hasElementNodes(children)) { // *** should really check for exact node type
            return element.getTextContent();
        }

        // The element contains content
        else {
            // TODO: Check this code's efficiency
            String name = element.getTagName();
            return Util.toString(element).replaceAll("^<" + name + "(?:[^>]*)>(.*)</" + name + ">$", "$1");
        }
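The "exact node type" check mentioned in the comment above could look roughly like the sketch below. It uses only the standard JDK DOM API, not jOOX, and it is not the fix that was actually committed. The class name CdataContentSketch and the helpers parse and contentKeepingCdata are hypothetical names chosen for illustration. It assumes the parser is left at its default non-coalescing setting, so CDATA sections arrive as separate CDATA_SECTION_NODE children.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of jOOX.
class CdataContentSketch {

    // Parse an XML string and return its root element.
    // Checked exceptions are wrapped to keep the example short.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Rebuild the content of an element that has no child elements,
    // looking at the exact type of every child node instead of
    // calling getTextContent(), which flattens CDATA into plain text.
    // Simplification: plain text is appended unescaped, and node types
    // other than text and CDATA (comments, etc.) are skipped.
    static String contentKeepingCdata(Element element) {
        StringBuilder sb = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.CDATA_SECTION_NODE) {
                // Restore the wrapper around the CDATA payload
                sb.append("<![CDATA[").append(child.getNodeValue()).append("]]>");
            } else if (child.getNodeType() == Node.TEXT_NODE) {
                sb.append(child.getNodeValue());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Element link = parse(
            "<ALink id=\"URL1\"><![CDATA[http://www.google.com]]></ALink>");

        // getTextContent() drops the wrapper
        System.out.println(link.getTextContent());

        // The node-type-aware variant keeps it
        System.out.println(contentKeepingCdata(link));
    }
}
```

With the element from the report, getTextContent() yields the bare URL, while the node-type-aware variant yields the URL inside its CDATA wrapper, which matches what the second (nested) call already prints.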

I can confirm that the latest works as expected. Thanks!