support nested CDATA #160

Closed
fommil opened this issue Aug 18, 2017 · 1 comment

fommil commented Aug 18, 2017

According to https://en.wikipedia.org/wiki/CDATA#Nesting:

A CDATA section cannot contain the string "]]>" and therefore it is not possible for a CDATA section to contain nested CDATA sections. The preferred approach to using CDATA sections for encoding text that contains the triad "]]>" is to use multiple CDATA sections by splitting each occurrence of the triad just before the ">". For example, to encode "]]>" one would write:
<![CDATA[]]]]><![CDATA[>]]>
This means that to encode "]]>" in the middle of a CDATA section, replace all occurrences of "]]>" with the following:
]]]]><![CDATA[>
This effectively stops and restarts the CDATA section.
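
The replacement rule quoted above can be sketched in a few lines of plain Scala. This is only an illustration and not scala.xml code: the object name `CDataEscape` and the methods `escape` and `wrap` are hypothetical. It uses `String.replace`, which substitutes literally rather than treating its argument as a regular expression.

```scala
object CDataEscape {
  // Split every "]]>" just before the ">" so that no single
  // CDATA section contains the terminator.
  def escape(text: String): String =
    text.replace("]]>", "]]]]><![CDATA[>")

  // Wrap arbitrary text in one or more back-to-back CDATA sections.
  def wrap(text: String): String =
    "<![CDATA[" + escape(text) + "]]>"
}
```

With this sketch, `CDataEscape.wrap("]]>")` yields `<![CDATA[]]]]><![CDATA[>]]>`, matching the example in the quote.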

It looks like we're not doing this in scala.xml, e.g.

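      import scala.xml.PCData

      // An XML document whose body already contains a CDATA section
      val top =
        """<?xml version="1.0" encoding="UTF-8"?>
          |<GRAND
          |    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="vast.xsd" version="2.0">
          |    <Foo><![CDATA[%s]]></Foo>
          |</GRAND>""".stripMargin

      // Wrapping it in PCData puts one CDATA section inside another
      println(PCData(top).toString)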

gives the following, in which the inner "]]>" on the <Foo> line ends the outer CDATA section early:

<![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<GRAND
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="vast.xsd" version="2.0">
    <Foo><![CDATA[%s]]></Foo>
</GRAND>]]>
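
To make the problem concrete, here is a small self-contained sketch of how a reader of that output would see the first CDATA section. It does not use scala.xml; `NaiveCData`, `wrapNaively`, and `firstSectionContent` are hypothetical names, and the scanner only models the rule that a section ends at the first "]]>".

```scala
object NaiveCData {
  private val Open  = "<![CDATA["
  private val Close = "]]>"

  // Naive wrapping: no handling of "]]>" inside the payload.
  def wrapNaively(text: String): String = Open + text + Close

  // Content of the first CDATA section as a parser would see it:
  // everything between the opening marker and the first "]]>".
  def firstSectionContent(xml: String): Option[String] =
    if (!xml.startsWith(Open)) None
    else {
      val end = xml.indexOf(Close, Open.length)
      if (end < 0) None else Some(xml.substring(Open.length, end))
    }
}
```

For the payload `<Foo><![CDATA[%s]]></Foo>`, `firstSectionContent(wrapNaively(payload))` returns `Some("<Foo><![CDATA[%s")`: the section stops at the inner terminator, and the rest of the payload falls outside it.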

fommil commented Aug 18, 2017

it looks like this could be supported by applying

.replaceAll("]]>", "]]]]><![CDATA[>")

in PCData.apply

The workaround is to remember to call .safe after creating a PCData:

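  import scala.xml.PCData

  // Must be declared inside an object, class, or trait (implicit classes cannot be top-level in Scala 2)
  implicit class PCDataOps(orig: PCData) {
    def safe: PCData = PCData(orig.text.replaceAll("]]>", "]]]]><![CDATA[>"))
  }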
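
One way to sanity-check the replacement used in the workaround, without depending on scala.xml, is a round trip: encode a string, then read the resulting sections back and confirm the original text is recovered. The sketch below is an assumption-laden illustration; `CDataRoundTrip`, `encode`, and `decode` are hypothetical names, and `decode` only accepts input that consists purely of back-to-back CDATA sections.

```scala
import scala.annotation.tailrec

object CDataRoundTrip {
  private val Open  = "<![CDATA["
  private val Close = "]]>"

  // Same replacement as the workaround, done with literal String.replace.
  def encode(text: String): String =
    Open + text.replace(Close, "]]]]><![CDATA[>") + Close

  // Concatenate the contents of consecutive CDATA sections.
  // Returns None if anything other than a CDATA section is encountered.
  def decode(xml: String): Option[String] = {
    @tailrec
    def loop(pos: Int, acc: StringBuilder): Option[String] =
      if (pos >= xml.length) Some(acc.toString)
      else if (!xml.startsWith(Open, pos)) None
      else {
        val start = pos + Open.length
        val end   = xml.indexOf(Close, start)
        if (end < 0) None
        else loop(end + Close.length, acc.append(xml.substring(start, end)))
      }
    loop(0, new StringBuilder)
  }
}
```

Under these assumptions, `decode(encode(s))` gives back `Some(s)` for text containing "]]>", while a naively wrapped string with an embedded "]]>" is rejected because text is left over after the first section closes.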
