Expectations around line ending behavior for U+2028 and U+2029

Isiah Meadows isiahmeadows at gmail.com
Fri Oct 26 23:55:12 UTC 2018


So in other words, all these IDEs are broken and in violation of the
Unicode spec. BTW, VSCode depends on Chrome, so it'll likely have most
of the same behavior if it doesn't correctly account for them.
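
To make the line-number mismatch concrete, here is a minimal sketch (my own illustration, not taken from any of the tools mentioned). `lineNumberAt` is a hypothetical helper. The first regex mirrors ECMAScript's LineTerminator set (LF, CR, U+2028, U+2029); the second assumes a tool that only recognizes CR, LF, and CRLF.

```javascript
// ECMAScript's LineTerminator set: LF, CR, U+2028 (LS), U+2029 (PS).
// CRLF is counted as a single line break.
const ES_LINE_BREAK = /\r\n|[\n\r\u2028\u2029]/g;

// Assumed narrower definition: only CR, LF, and CRLF count as line breaks.
const ASCII_LINE_BREAK = /\r\n|[\n\r]/g;

// Hypothetical helper: 1-based line number of the character at `offset`,
// given a regex describing what counts as a line break.
function lineNumberAt(source, offset, lineBreak) {
  const before = source.slice(0, offset);
  const matches = before.match(lineBreak);
  return (matches ? matches.length : 0) + 1;
}

// One U+2028 and one LF precede the third statement.
const source = "let a = 1;\u2028let b = 2;\nlet c = 3;";
const offsetOfC = source.indexOf("let c");

console.log(lineNumberAt(source, offsetOfC, ES_LINE_BREAK));    // 3
console.log(lineNumberAt(source, offsetOfC, ASCII_LINE_BREAK)); // 2
```

A parser using the first definition and an editor using the second would report different line numbers for the same token.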

-----

Isiah Meadows
contact at isiahmeadows.com
www.isiahmeadows.com

On Fri, Oct 26, 2018 at 5:49 PM Logan Smyth <loganfsmyth at gmail.com> wrote:
>
> Great, thank you for that resource Allen, it's helpful to have something concrete to consider.
>
> What you'd prefer is that other languages should also be rendered with U+2028/29 creating new lines, even though their specifications do not define them as line breaks? That means that any parser for these languages that follows the language spec would then be outputting line numbers that would potentially not correspond with the code as rendered inside the developer's editor, if the editor renders U+2028/29 as line separators. That would, for instance, mean that Rust's single-line comments could actually be rendered as multiple lines, even though they are a single line according to the spec.
>
> My frustration here isn't that the characters exist; it's that in a world of explicitly defined syntactic grammars that depend on line numbers for errors and the like, their behavior seems poorly defined, even if it may have more meaning in text documents. For instance, here is XCode's rendering of 2028/2029:
>
>
> 2028 does seem to render as a "line separator" in that visually the code is on a new line, but it is rendered within the same line number marker as the start of that snippet of text. That seems to satisfy the behavior defined by Unicode, but it's not helpful from the standpoint of code looking to process source code. Should a parser follow that definition of line separator, where 2028 suggests rendering a new line but, since it does not start a new paragraph, the text is conceptually still part of the same paragraph? What is a paragraph in source code? Unicode has no sense of line numbers as far as I know, which means it seems to be up to an individual language to define what line number a given token is on.
>
>
> > All of them recognise both characters as newlines (and increment the line number for those that display it).
>
> Revisiting my tests on my OSX machine, it seems like there is a difference in treatment of 2028 and 2029 that threw off at least some of my tests.
> * VSCode: 2028 is a Unicode placeholder and 2029 seems to be rendered zero-width, no new lines
> * Sublime 3: 2028/29 rendered zero-width, no new lines
> * TextEdit: 2028 is a newline, 2029 is zero-width, no new lines
> * XCode: Per above screenshot, 2028 creates a line but renders within the same line number, 2029 creates a new line number
> * Firefox, Chrome, and Safari: with text in a <pre> or <textarea>, all render them on one line as zero-width, no new lines (though how HTML renders may just be a whole separate question)
>
>
> On Fri, Oct 26, 2018 at 7:42 AM Claude Pache <claude.pache at gmail.com> wrote:
>>
>>
>>
>> >
>> > Would it be worth exploring a definition of U+2028/29 in the spec such that they behave as line terminators for ASI, but otherwise do not increment things like line number counts and behave as whitespace characters?
>>
>> Diverging the definition of line terminator for the purpose of line counting on one side, and ASI and single-line comment on the other side, is adding yet another complication in a matter that is already messy. And I suspect that most tools that have issues with the former case, have issues as well with the latter case, so that half-fixing is equivalent to not fixing.
>>
>> If we want to "fix" the definition of line terminator somewhere, we should "fix" it everywhere.
>>
>> (Note that the recent addition of U+2028 and U+2029 inside string literals does not constitute a modification of the definition of line terminator in that context; it is rather allowing string literals to span multiple lines in some specific cases.)
>>
>> —Claude
>
> _______________________________________________
> es-discuss mailing list
> es-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es-discuss

