Expectations around line ending behavior for U+2028 and U+2029

Richard Gibson richard.gibson at gmail.com
Thu Oct 25 04:09:21 UTC 2018


The only explicit mention of line numbers in the spec is to note that
"<CR><LF>… should be considered a single SourceCharacter for the purpose of
reporting line numbers", but it's clear from things like ASI and
termination of single-line comments that every *LineTerminatorSequence* is
equal in this sense. Editors and HTML are free to do what they want, but in
my opinion ECMAScript tooling at least should not pretend that these input
elements don't terminate lines.
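
To make the mismatch concrete, here is a minimal sketch of counting lines under the two conventions being discussed. The helper name `lineAt` and both regex constants are made up for illustration; they are not taken from the spec, from Babel, or from any engine. The first pattern follows the *LineTerminatorSequence* production, with <CR><LF> counted once, and the second is the /\r?\n/ reading mentioned in the quoted message.

```javascript
// Hypothetical helpers for illustration only.

// ECMAScript LineTerminatorSequence: <LF>, <CR>, <LS> (U+2028), <PS> (U+2029),
// with <CR><LF> tried first so that it counts as a single terminator.
const ES_LINE_TERMINATOR_SEQUENCE = /\r\n|[\n\r\u2028\u2029]/g;

// The /\r?\n/ convention: U+2028, U+2029 and a lone <CR> do not end a line.
const NEWLINE_ONLY = /\r?\n/g;

// Returns the 1-based line number that contains the given string offset.
function lineAt(source, offset, terminator) {
  let line = 1;
  for (const match of source.matchAll(terminator)) {
    // Stop once a terminator ends after the offset; earlier ones each start a new line.
    if (match.index + match[0].length > offset) break;
    line += 1;
  }
  return line;
}

const source = "let a = 1;\u2028let b = 2;\r\nthrow new Error('x');";
const offset = source.indexOf("throw");

console.log(lineAt(source, offset, ES_LINE_TERMINATOR_SEQUENCE)); // 3
console.log(lineAt(source, offset, NEWLINE_ONLY));                // 2
```

The same `throw` is on line 3 by the grammar's definition and on line 2 for anything that splits on /\r?\n/ alone, which is the disagreement between a stack trace and an editor described below.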

On Wed, Oct 24, 2018 at 3:58 PM Logan Smyth <loganfsmyth at gmail.com> wrote:

> Something I've recently realized is just how much U+2028 and U+2029 being
> newlines introduces a mismatch between different parts of a dev
> environment, and I'm curious for thoughts.
>
> Engines understandably take these two characters into account when
> defining their line number offsets in stack traces, since they are part of
> the LineTerminator grammar. Similarly, Babel's parser and I assume others
> will do the same and take them into account for their line number data. On
> the other hand, it seems like every editor that I've looked at so far will
> not render these characters as newlines, which can create confusion for
> users because error messages will not align with what they see in their
> editors. This seems like a burden for editors, since they would need to
> know the type of file in order to know how to render it. There's also a
> question of mixed content. If I have an HTML file with a <script>, would an
> editor need to be content-aware to render the newlines correctly only
> within the <script> tag, since U+2028/29 are not newline characters for
> HTML?
>
> Another case that comes to mind is that sourcemaps don't appear to specify
> what counts as a line. While mappings are defined per-line, it's not clear
> whether these should take U+2028/29 into account or not, though I'd assume
> the intention is /\r?\n/. Tooling like Babel will currently take U+2028/29
> into account because otherwise we'd need two independent concepts of
> line/column number for each location. That said, this Babel behavior is
> likely a bad idea because it means the application of a sourcemap would
> need to be aware of whether a given mapping within a file applies to JS
> content, or something else.
>
> Would it be worth exploring a definition of U+2028/29 in the spec such
> that they behave as line terminators for ASI, but otherwise do not
> increment things like line number counts and behave as whitespace characters? If
> not, what are your thoughts on the issues I've mentioned?