Expectations around line ending behavior for U+2028 and U+2029

Logan Smyth loganfsmyth at gmail.com
Fri Oct 26 21:48:47 UTC 2018


Great, thank you for that resource Allen, it's helpful to have something
concrete to consider.

What you'd prefer is that other languages should also be rendered with
U+2028/29 creating new lines, even though their specifications do not
define them as line terminators? That means that any parser for these
languages that follows the language spec would then be outputting line
numbers that potentially do not correspond with the code as rendered inside
the developer's editor, if the editor renders U+2028/29 as line separators.
That would for instance mean that Rust's single-line comments could actually
be rendered as multiple lines, even though they are a single line according
to the spec.
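
As a rough sketch of that mismatch, here is how the line number reported for
the same token can differ depending on which characters are counted as line
terminators. The helper name `lineOf` and the sample source are made up for
illustration; the second regex stands in for any definition that only
recognizes LF, CR, and CRLF, and is not taken from a particular language spec.

```javascript
// Hypothetical helper: 1-based line number of `index` in `source`,
// given a global regex that defines what counts as a line terminator.
function lineOf(source, index, terminator) {
  const before = source.slice(0, index);
  const matches = before.match(terminator);
  return (matches ? matches.length : 0) + 1;
}

// ECMAScript's LineTerminator set: LF, CR, U+2028, U+2029 (CRLF counted once).
const ecmaScriptTerminators = /\r\n|[\n\r\u2028\u2029]/g;
// Assumed alternative definition: only LF, CR, and CRLF end a line.
const asciiTerminators = /\r\n|\r|\n/g;

// A comment containing U+2028, followed by a call on the next LF-separated line.
const source = "// note\u2028continued\nfoo();";
const fooIndex = source.indexOf("foo");

console.log(lineOf(source, fooIndex, ecmaScriptTerminators)); // 3
console.log(lineOf(source, fooIndex, asciiTerminators));      // 2
```

Whichever of the two numbers the editor's gutter shows, one of the two
definitions will report a line that does not match it.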

My frustration here isn't that the characters exist, it's that their
behavior seems poorly defined in a world of explicitly defined syntactic
grammars that depend on line numbers for errors and similar diagnostics,
even if their behavior in text documents may have more meaning. For
instance, here is Xcode's rendering of 2028/2029:
[image: Screen Shot 2018-10-26 at 2.33.56 PM.png]

2028 does seem to render as a "line separator" in that visually the code is
on a new line, but it is rendered within the same line number marker as the
start of that snippet of text. That seems to satisfy the behavior defined
by Unicode, but it's not helpful from the standpoint of code looking to
process source code. Should a parser follow that definition of line
separator? 2028 suggests rendering a new line, but since it's not a
paragraph separator, what follows is conceptually part of the same
paragraph. What is a paragraph in source code? Unicode has no sense of line
numbers as far as I know, which means it seems to be up to an individual
language to define what line number a given token is on.
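
To make that distinction concrete, here is a minimal sketch that tracks the
gutter line number separately from the visual row. It only models the
behavior I described from the screenshot (2028 starts a new visual row but
keeps the line number, while 2029 and "\n" start both); the function name
`positionOf` and the shape of its result are hypothetical.

```javascript
// Hypothetical model of the rendering described above: U+2028 starts a new
// visual row but keeps the gutter line number; "\n" and U+2029 start both.
function positionOf(source, index) {
  let line = 1; // number shown in the gutter
  let row = 1;  // visual row on screen
  for (let i = 0; i < index; i++) {
    const ch = source[i];
    if (ch === "\n" || ch === "\u2029") {
      line++;
      row++;
    } else if (ch === "\u2028") {
      row++;
    }
  }
  return { line, row };
}

const text = "a\u2028b\u2029c\nd";
console.log(positionOf(text, text.indexOf("b"))); // { line: 1, row: 2 }
console.log(positionOf(text, text.indexOf("c"))); // { line: 2, row: 3 }
console.log(positionOf(text, text.indexOf("d"))); // { line: 3, row: 4 }
```

A parser that reports only one number per token has to pick either `line` or
`row`, and nothing in Unicode says which one it should be.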


> All of them recognise both characters as newlines (and increment the line
> number for those that display it).

Revisiting my tests on my OSX machine, it seems like there is a difference
in treatment of 2028 and 2029 that threw off at least some of my tests.
* VSCode: 2028 is rendered as a Unicode placeholder and 2029 seems to be
  rendered zero-width, no new lines
* Sublime 3: 2028/29 rendered zero-width, no new lines
* TextEdit: 2028 is a newline, 2029 is zero-width, no new lines
* Xcode: Per above screenshot, 2028 creates a line but renders within the
  same line number, 2029 creates a new line number
* Firefox, Chrome, and Safari, with text in a <pre> or <textarea>, render
  them all on one line zero-width, no new lines (though how HTML renders may
  just be a whole separate question)


On Fri, Oct 26, 2018 at 7:42 AM Claude Pache <claude.pache at gmail.com> wrote:

>
> > Would it be worth exploring a definition of U+2028/29 in the spec such
> > that they behave as line terminators for ASI, but otherwise do not
> > increment things like line number counts and behave as whitespace
> > characters?
>
> Diverging the definition of line terminator for the purpose of line
> counting on one side, and ASI and single-line comment on the other side, is
> adding yet another complication in a matter that is already messy. And I
> suspect that most tools that have issues with the former case, have issues
> as well with the latter case, so that half-fixing is equivalent to not
> fixing.
>
> If we want to ”fix” the definition of line terminator somewhere, we should
> ”fix” it everywhere.
>
> (Note that the recent addition of U+2028 and U+2029 inside string literals
> does not constitute a modification of the definition of line terminator in
> that context; it is rather allowing string literals to span multiple lines
> in some specific cases.)
>
> —Claude
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2018-10-26 at 2.33.56 PM.png
Type: image/png
Size: 14073 bytes
Desc: not available
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20181026/5c6506fc/attachment-0001.png>

