alex at weej.com
Thu Sep 29 09:09:34 UTC 2016
In XHTML, CDATA allows a 'more' verbatim spelling of text node content. But
the end token has to be escaped, as discussed. Despite this escaping, the
text node can contain arbitrary strings.
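As a rough sketch of that escaping step (the helper names `toCdata` and `fromCdata` are made up for illustration and come from no library), the only sequence that needs attention is `]]>`, which is split across two adjacent CDATA sections. The decoder is only there to show that the text round-trips unchanged; a real XML parser does that job.

```javascript
// Hypothetical helper: wrap arbitrary text in CDATA section(s).
function toCdata(text) {
  // Split every "]]>" so that "]]" ends one section and ">" starts the next.
  return "<![CDATA[" + text.split("]]>").join("]]]]><![CDATA[>") + "]]>";
}

// Hypothetical minimal decoder for the output of toCdata: concatenates the
// contents of adjacent CDATA sections, as an XML parser would for a text node.
function fromCdata(xml) {
  let out = "";
  let pos = 0;
  while (pos < xml.length) {
    if (!xml.startsWith("<![CDATA[", pos)) {
      throw new Error("expected a CDATA section at offset " + pos);
    }
    const start = pos + "<![CDATA[".length;
    const end = xml.indexOf("]]>", start);
    if (end === -1) {
      throw new Error("unterminated CDATA section");
    }
    out += xml.slice(start, end);
    pos = end + "]]>".length;
  }
  return out;
}
```

Under these assumptions, `fromCdata(toCdata(code))` gives back `code` for any string, including code that contains `</script>` or `]]>`.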
In XHTML, you *can* achieve the same effect without CDATA, just by escaping
XML entities. Again, and crucially, the text node can contain arbitrary strings.
In HTML without CDATA, using HTML entities within the script tag is wrong
specifically because they are *not* interpreted. The text node in the HTML
document CANNOT contain arbitrary strings, and there is no further decode
step before the JS parser hits your code, so you're forced to take other
measures to ensure that `</script>` does not appear in your code. There are
a few places this can appear, only one of which is embedded in string
literals, so the method of avoiding this is actually sensitive to
the context and not practical to specify.
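For the much narrower case of JSON data (not arbitrary JS), one commonly used measure is to escape `<` after stringifying. That works only because, in JSON, `<` can appear nowhere except inside string literals, where `\u003c` means the same thing. A minimal sketch, assuming the value is JSON-serializable; `jsonForScriptTag` is a hypothetical name:

```javascript
// Hypothetical helper: serialize a value so the result can sit inside an
// HTML <script> element without terminating it early.
// Assumes JSON.stringify returns a string for this value (not undefined).
function jsonForScriptTag(value) {
  return JSON.stringify(value)
    // "<" only occurs inside JSON string literals, so this rewrite keeps the
    // parsed value identical while removing "</script>" and "<!--".
    .replace(/</g, "\\u003c")
    // Older JS parsers treat these two as line terminators inside string
    // literals; escaping them is harmless elsewhere.
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
}
```

This does not generalize to arbitrary script bodies, where `</script>` can also turn up in comments, regex literals, or template literals.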
I hope you can appreciate how ridiculous this problem is for HTML - I don't
believe CDATA support in HTML 5 can solve this due to forward compatibility -
which is why it's an antipattern. Just don't do it, or use XHTML. It's not
cool to hate on XML anymore. ;)
On Thursday, 29 September 2016, Mike Samuel <mikesamuel at gmail.com> wrote:
> Without CDATA you have to encode script bodies properly. With CDATA you
> have to encode script bodies properly. What problem did CDATA solve?
> On Sep 28, 2016 8:03 PM, "Alexander Jones" <alex at weej.com> wrote:
>> They do solve the problem. You encode your entire JS *before* pasting it,
>> encoding `]]>` and nothing more, and the XML document's text node contains
>> the unadulterated text, which the JS parser also sees. It's perfect layer
>> isolation. Ye olde HTML can't do that because there is no escaping
>> mechanism for `</script>` that actually allows the JS parser to see the
>> text (code) content unmodified.
>> Viva la `<xhtml:revolución />` ;)
>> On Wednesday, 28 September 2016, Mike Samuel <mikesamuel at gmail.com> wrote:
>>> I agree it's subideal which is why I work to address problems like this
>>> in template systems but ad-hoc string concatenation happens and embeddable
>>> sub-languages provide defense-in-depth without sacrificing correctness.
>>> CDATA sections solve no problems because they cannot contain any string
>>> that has "]]>" as a substring so you still have to s/\]\]>/]]]]><![CDATA[>/g.
>>> On Sep 28, 2016 2:32 PM, "Alexander Jones" <alex at weej.com> wrote:
>>>> That's awful. As you say, it's an antipattern, no further effort should
>>>> than slapping directly into a script tag unencoded, so no-one else should
>>>> have to see this. Also, there are many other producers of JSON than
>>>> Instead, use XHTML and CDATA (which has a straightforward encoding
>>>> mechanism that doesn't ruin the parseability of the code or affect it in
>>>> any way) if you really want to pull stunts like this.
>>>> On Wednesday, 28 September 2016, Michał Wadas <michalwadas at gmail.com> wrote:
>>>>> Idea: require implementations to stringify "</script>" as
>>>>> Benefits: remove XSS vulnerability when injecting JSON as content of
>>>>> <script> tag (quite common antipattern).
>>>>> Backward compatible: yes, unless binary equality is required and this
>>>>> string is used.