proycon/foliapy

foliavalidator doesn't preserve CDATA ?

kosloot opened this issue · 2 comments

When running
    foliavalidator examples/gaps.2.0.0.folia.xml -o
the CDATA section inside <content> is discarded: the text itself is kept, but the CDATA wrapper around it is gone.

IN:

     <gap class="frontmatter">
        <desc>This is the cover of the book</desc>
        <content>
<![CDATA[

            SHOW WHITE AND THE SEVEN DWARFS


                by the Brothers Grimm

                    first edition


            Copyright(c) blah blah
]]>
        </content>
     </gap>

OUT:

    <gap class="frontmatter">
      <desc>This is the cover of the book</desc>
      <content>


            SHOW WHITE AND THE SEVEN DWARFS


                by the Brothers Grimm

                    first edition


            Copyright(c) blah blah

        </content>
    </gap>

This might not be a real bug, but it is quite surprising.

This is a good point, yes, and something to investigate. If I recall correctly, I have rather limited CDATA support, so there could be situations where this becomes a problem.

In the lxml binding, there is this parser parameter, which is enabled by default and causes this behaviour:

  • strip_cdata - replace CDATA sections by normal text content (on by default)
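To illustrate the effect of that option, here is a minimal sketch using lxml directly, not foliapy. It parses a small made-up <content> element once with the default parser settings and once with strip_cdata=False, then serializes both. The sample document and the roundtrip helper are assumptions for illustration only; whether and how foliapy would pass such a parser through is not established here.

```python
from lxml import etree

# Hypothetical sample, loosely modelled on the <content> element above.
SAMPLE = b"<content><![CDATA[  by the Brothers Grimm  ]]></content>"


def roundtrip(xml_bytes, strip_cdata):
    """Parse and re-serialize xml_bytes with the given strip_cdata setting."""
    parser = etree.XMLParser(strip_cdata=strip_cdata)
    root = etree.fromstring(xml_bytes, parser)
    return root, etree.tostring(root)


# Default behaviour: the CDATA section is replaced by plain text content.
default_root, default_out = roundtrip(SAMPLE, strip_cdata=True)

# With strip_cdata=False the parser keeps the CDATA section, so it
# survives serialization.
preserved_root, preserved_out = roundtrip(SAMPLE, strip_cdata=False)

print(default_out)
print(preserved_out)
```

In both cases the element's .text is the same string; only the serialized form differs. So the stripped output is equivalent at the text level, but no longer byte-for-byte faithful to the input.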