5 June 2014 TC39 Meeting Notes

Mark S. Miller erights at google.com
Fri Jun 13 09:31:36 PDT 2014

Does there exist any string where an old browser using old rules would
decide that a <module> is closed at one place, but a new browser following
the rules you propose would decide that the <module> is closed at a
different place?
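
To make it easier to try candidate strings, here is a minimal sketch of the termination rule quoted below (`</script`, or `</module` under Domenic's suggestion, followed by one of `[ \t\r\n\f/>]`). The helper name `findRawTextEnd` is invented for this example. It deliberately ignores the `<!--` escaped and double-escaped states, and it does not model how an old browser would parse an unknown `<module>` element, so it only covers the "new rules" half of the question.

```javascript
// Hypothetical helper (not from any spec or library): returns the index of the
// first "</tagName" (case-insensitive) that is followed by one of
// [ \t\r\n\f/>], or -1 if there is none.
// Assumes tagName contains only letters, so no regex escaping is needed.
// Ignores the "<!--" escaped and double-escaped states entirely.
function findRawTextEnd(source, tagName) {
  const pattern = new RegExp("</" + tagName + "[ \\t\\r\\n\\f/>]", "i");
  const match = pattern.exec(source);
  return match === null ? -1 : match.index;
}

// The split string literal does not terminate the block; the later,
// differently cased end tag followed by a space does.
const candidate = 'var s = "</mod" + "ule>"; </MODULE >rest';
console.log(findRawTextEnd(candidate, "module"));
```

Under this sketch, `"</mod" + "ule>"` is the workaround authors would still need if the terminator were recognized everywhere rather than only where it would otherwise be a parse error.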

On Fri, Jun 13, 2014 at 9:15 AM, Domenic Denicola <
domenic at domenicdenicola.com> wrote:

> Thanks Scott; much appreciated.
> IMO it would be a good universe where `<module>` had the following things
> `<script>` has:
> - Does not require escaping < > & ' " in any contexts.
> - Terminates when seeing `</module` + extra chars. (Possibly we could do
> this only when it would otherwise be a parsing error, to avoid `"</mod" +
> "ule>"` grossness? But that would require some intertwingling of the HTML
> and ES parsers, which I can imagine implementers disliking.)
> But removed the following things `<script>` has:
> - `<!--` escaped data mode and double-escaped mode
> - \r, \r\n, \0 special-casing
> - The two new single-line comment forms (maybe; I know these work in Node
> though, so maybe just leave them in as part of the ES6 spec).
> Although I know some people think making `<script>` and `<module>` have
> different rules would be confusing for authors, IMO this would be a nice
> authoring experience.
> ________________________________________
> From: cananian at gmail.com <cananian at gmail.com> on behalf of C. Scott
> Ananian <ecmascript at cscott.net>
> Sent: Friday, June 13, 2014 12:06
> To: Domenic Denicola
> Cc: Mark S. Miller; es-discuss; Ben Newman
> Subject: Re: 5 June 2014 TC39 Meeting Notes
> On Thu, Jun 12, 2014 at 11:11 AM, Domenic Denicola
> <domenic at domenicdenicola.com> wrote:
> > I guess part of it is clarifying which part of "<script>'s insane parsing
> > rules" we're talking about. From what I'm aware of there are quite a lot of
> > different insanities; but I am fuzzy on the details. Does anyone know which
> > rules are inherently necessary, and which are historical accidents or
> > constraints?
> I'll recap the rules for "script data state" from
> http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#script-data-state
> As a general rule, `\r` and `\r\n` are converted to `\n`, and `\0` is
> not allowed.
> The case-insensitive sequence `</script` followed by a character in
> `[ \t\r\n\f/>]` terminates the script data section.
> (These constraints would be present for HTML-embedding.)
> In addition, the exact character sequence `<!--` switches to "escaped
> data" parsing.  This is a bit hairy, and you can even end up in
> "double escaped" modes.  See
> http://stackoverflow.com/questions/23727025/script-double-escaped-state
> for an example.  Presumably these are the "insane parsing rules" under
> discussion.  You are encouraged to try to follow the logic in the
> WHATWG spec yourself. ;)
> In addition, [Web EcmaScript](http://javascript.spec.whatwg.org/)
> introduces two new single line comment forms: `<!--` must be treated
> as if it were `//`, and `-->` (with some crazy start-of-line
> restrictions) is also treated as a single line comment.
> To some degree the line between the HTML parser and Web EcmaScript is
> movable; currently the HTML parser recognizes the `<!--` etc tokens
> but pushes them into the data section of the script tag anyway; one
> could just as easily imagine the HTML parser doing all the work and
> stripping the "new comment forms" from the token stream.
>   --scott
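
For anyone who has not seen the two extra single-line comment forms in action, here is a small sketch. It assumes an engine that implements them for non-module code (Node does, as Domenic notes above). The variable names are only for illustration, and `new Function` is used so the source is parsed as ordinary script code rather than as a module.

```javascript
// "<!--" behaves like "//": the rest of the first line is ignored.
const withOpen = new Function("<!-- treated like a // comment\nreturn 1;");

// "-->" is only a comment when it begins a line (after optional whitespace
// or comments); here it directly follows a newline.
const withClose = new Function(
  "var x = 2;\n--> also ignored, because it starts the line\nreturn x;"
);

console.log(withOpen(), withClose());
```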
