Expectations around line ending behavior for U+2028 and U+2029

Logan Smyth loganfsmyth at gmail.com
Thu Oct 25 16:24:00 UTC 2018


Yeah, *LineTerminatorSequence* is definitely the canonical definition of
line numbers in JS at the moment. As we explore
https://github.com/tc39/proposal-error-stacks, it would be good to clearly
specify how a line number is computed from the original source. As
currently specified, a line number in a stack trace takes U+2028/29 into
account, and thus any consumer of this source code and line number value
needs a special case for JS code. It seems unrealistic to expect every
piece of tooling that works with source code to have a special case for JS
code that takes these 2 characters into account. Given that, the choices
are:

1. Every tool that manipulates source code needs to know what type of file
it is so that it can special-case JS in order to process line-related
information
2. Every tool should consider U+2028/29 newlines, causing line numbers to
be off in other programming languages
3. Accept that tooling and the spec will never correspond and the use of
these two characters in source code will continue to cause issues
4. Diverge the definition of current source-code line from the current
*LineTerminatorSequence* lexical grammar such that source lines are
always delimited by /\r?\n/, which is what the user is realistically going
to see in their editor
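
To make the mismatch concrete, here is a rough TypeScript sketch that
computes the line number of one offset under both definitions. The names
(lineNumberAt, ecmaScriptTerminator, editorTerminator) are made up for
illustration and are not taken from any engine, from Babel, or from the
spec; the first regex is only my reading of *LineTerminatorSequence*.

```typescript
// LineTerminatorSequence as I read it: LF, CR, CRLF (counted once),
// U+2028 and U+2029. Hypothetical name, not a spec or library API.
const ecmaScriptTerminator = /\r\n|[\n\r\u2028\u2029]/g;

// The editor-style definition from the last option in the list above.
const editorTerminator = /\r?\n/g;

// Returns the 1-based line number of `offset` within `source`, counting
// only terminators that end at or before that offset.
function lineNumberAt(
  source: string,
  offset: number,
  terminator: RegExp
): number {
  // Fresh copy so the shared regex's lastIndex state is never touched.
  const matcher = new RegExp(terminator.source, "g");
  let line = 1;
  let match: RegExpExecArray | null;
  while ((match = matcher.exec(source)) !== null) {
    if (match.index + match[0].length > offset) break;
    line++;
  }
  return line;
}

// U+2028 sits between the two statements; most editors show one line.
const source = 'const a = "x";\u2028throw new Error("boom");\nconst b = 1;';
const throwOffset = source.indexOf("throw");

console.log(lineNumberAt(source, throwOffset, ecmaScriptTerminator)); // 2
console.log(lineNumberAt(source, throwOffset, editorTerminator)); // 1
```

With the spec-style count the throw is on line 2, while the /\r?\n/ count
(and what the user sees in an editor) puts it on line 1.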




On Wed, Oct 24, 2018 at 9:09 PM Richard Gibson <richard.gibson at gmail.com>
wrote:

> The only explicit mention of line numbers in the spec is to note that
> "<CR><LF>… should be considered a single SourceCharacter for the purpose of
> reporting line numbers", but it's clear from things like ASI and
> termination of single-line comments that every *LineTerminatorSequence* is
> equal in this sense. Editors and HTML are free to do what they want, but in
> my opinion ECMAScript tooling at least should not pretend that these input
> elements don't terminate lines.
>
> On Wed, Oct 24, 2018 at 3:58 PM Logan Smyth <loganfsmyth at gmail.com> wrote:
>
>> I've recently realized just how much U+2028 and U+2029 being
>> newlines introduces a mismatch between different parts of a dev
>> environment, and I'm curious for thoughts.
>>
>> Engines understandably take these two characters into account when
>> defining their line number offsets in stack traces, since they are part of
>> the LineTerminator grammar. Similarly, Babel's parser and I assume others
>> will do the same and take them into account for their line number data. On
>> the other hand, it seems like every editor that I've looked at so far will
>> not render these characters as newlines, which can create confusion for
>> users because error messages will not align with what they see in their
>> editors. This seems like a burden for editors, since they would need to
>> know the type of file in order to know how to render it. There's also a
>> question of mixed content. If I have an HTML file with a <script>, would an
>> editor need to be content-aware to render the newlines correctly only
>> within the <script> tag, since U+2028/29 are not newline characters for
>> HTML?
>>
>> Another case that comes to mind is that sourcemaps don't appear to
>> specify what counts as a line. While mappings are defined per-line, it's
>> not clear whether these should take U+2028/29 into account or not, though
>> I'd assume the intention is /\r?\n/. Tooling like Babel will currently take
>> U+2028/29 into account because otherwise we'd need two independent concepts
>> of line/column number for each location. That said, this Babel behavior is
>> likely a bad idea because it means the application of a sourcemap would
>> need to be aware of whether a given mapping within a file applies to JS
>> content, or something else.
>>
>> Would it be worth exploring a definition of U+2028/29 in the spec such
>> that they behave as line terminators for ASI, but otherwise do not
>> increment things like line number counts and behave as whitespace
>> characters? If not, what are your thoughts on the issues I've mentioned?