Last call for consensus on format-control char. issues

John Cowan cowan at ccil.org
Wed Jun 17 12:36:31 PDT 2009


Allen Wirfs-Brock scripsit:

> My sense from the F2F is that the consensus was more in the direction
> of my simple solution above (<ZWNJ> and <ZWJ> in identifiers, <BOM>
> is whitespace) rather than David-Sarah's more comprehensive treatment
> of <BOM>.

The two solutions are consistent on the subject of ZWJ and ZWNJ.  I am
concerned about treating BOM as whitespace, primarily because it's not,
well, rendered with white space.  Treating BOM as whitespace means that
"foo<BOM>bar" tokenizes the same as "foo bar" and differently from
"foobar", although it looks exactly like "foobar"; the same goes for
"foo<BOM>123".
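
To make the concern concrete, here is a minimal sketch in Python of a
toy lexer that splits on whitespace and counts U+FEFF as whitespace.
The function name and the whitespace set are made up for illustration;
this is not the ES5 lexical grammar, only the one rule under discussion.

```python
BOM = "\ufeff"  # U+FEFF, written <BOM> in this message

def tokens_bom_as_whitespace(source):
    """Toy lexer (hypothetical): split on whitespace, with BOM counted as whitespace."""
    whitespace = " \t\n" + BOM
    tokens, current = [], ""
    for ch in source:
        if ch in whitespace:
            # Any whitespace character, including BOM, ends the current token.
            if current:
                tokens.append(current)
                current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens

# The invisible BOM splits what looks like a single identifier.
print(tokens_bom_as_whitespace("foo" + BOM + "bar"))
print(tokens_bom_as_whitespace("foobar"))
```

Under this rule the first call yields two tokens and the second yields
one, even though both inputs display identically.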

David-Sarah proposes treating BOM in an identifier, string literal,
or other token as an error, since the intent is unclear (non-initial
BOMs shouldn't appear in properly formatted text any more, now that the
"zero-width no-break space" semantics has been taken over by U+2060 WORD
JOINER), and ignoring all other instances of BOM.  That seems like TRT
to me.
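
A rough sketch of that rule in Python, for comparison.  It assumes a toy
language whose tokens are just runs of non-whitespace characters, and it
reads "in a token" as "a BOM with token characters on both sides"; both
are my simplifications, not David-Sarah's wording, and the names are
hypothetical.

```python
BOM = "\ufeff"  # U+FEFF, written <BOM> in this message
WHITESPACE = " \t\n"

def tokens_bom_error_in_token(source):
    """Toy lexer (hypothetical): BOM inside a token is an error, other BOMs are ignored."""
    tokens, current = [], ""
    pending_bom = False  # True when a BOM has followed a partially read token
    for ch in source:
        if ch == BOM:
            # Harmless unless more token characters follow the current token.
            pending_bom = bool(current)
        elif ch in WHITESPACE:
            if current:
                tokens.append(current)
                current = ""
            pending_bom = False
        else:
            if pending_bom:
                raise ValueError("BOM inside token after %r" % current)
            current += ch
    if current:
        tokens.append(current)
    return tokens

# A leading BOM, or one next to whitespace, is silently dropped.
print(tokens_bom_error_in_token(BOM + "foo " + BOM + "bar"))
```

With this rule "foo<BOM>bar" is rejected outright rather than being
read as either one identifier or two, which is the point of the
proposal as I understand it.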

-- 
If I read "upcoming" in [the newspaper]              John Cowan
once more, I will be downcoming                      http://www.ccil.org/~cowan
and somebody will be outgoing.                       cowan at ccil.org


More information about the es5-discuss mailing list