<NEL> considered harmful, especially as whitespace

Thu May 28 11:26:26 PDT 2009

David-Sarah Hopwood scripsit:

> public class Test {
>   public static void main(String[] args) {
>     System.out.println(Character.isUnicodeIdentifierPart(0x85));
>   }
> }
>
> prints true in JDK/JRE 1.6. (OK, this is not an authoritative source, and
> I probably should be going to the Unicode data files.) Java bug?

Has to be.
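
For anyone who wants to reproduce this, here is a minimal sketch that
probes U+0085 with the related java.lang.Character predicates. The class
name NelProbe and the helper describe() are made up for illustration;
the Character methods themselves are standard. One caveat worth checking
before filing it as a bug: the Java API documentation counts any character
for which isIdentifierIgnorable() returns true as a valid identifier part,
and it lists 0x7F through 0x9F as ignorable, so the result may be
intentional on Java's side even though it does not match the Unicode
identifier properties.

```java
public class NelProbe {
    // U+0085 NEXT LINE (NEL), the character under discussion.
    public static final int NEL = 0x85;

    // Hypothetical helper: summarizes how java.lang.Character
    // classifies a single code point.
    public static String describe(int cp) {
        return String.format(
            "U+%04X identifierPart=%b identifierIgnorable=%b isoControl=%b whitespace=%b",
            cp,
            Character.isUnicodeIdentifierPart(cp),
            Character.isIdentifierIgnorable(cp),
            Character.isISOControl(cp),
            Character.isWhitespace(cp));
    }

    public static void main(String[] args) {
        System.out.println(describe(NEL));
    }
}
```

Note that Character.isWhitespace() does not treat NEL as whitespace either,
so Java's classification is of limited use as a guide to what a lexer
should accept here.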

> Thought so. These are zero-width characters, so they are not suitable
> as token separators. Perhaps <BOM> (a.k.a. <ZWNBSP>) should be allowed,
> but only at what would already be a token boundary.

U+FEFF is deprecated anyhow as ZWNBSP (U+2060 WORD JOINER has taken over
that function) and survives only in its role as BOM, though the point
about raw concatenation of BOMful files generating embedded ZWNBSPs is
a good one.
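
The concatenation problem can be sketched as follows. This is only an
illustration under stated assumptions: the class name BomConcat, its
helpers, and the two one-line "files" are hypothetical, and the files are
assumed to be UTF-8 with a leading BOM. Joining them byte for byte leaves
the second file's BOM in the middle of the decoded text as a stray U+FEFF.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class BomConcat {
    // UTF-8 encoding of U+FEFF, as found at the start of a "BOMful" file.
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Hypothetical helper: the bytes of a UTF-8 file that starts with a BOM.
    public static byte[] fileWithBom(String text) {
        return concat(UTF8_BOM, text.getBytes(StandardCharsets.UTF_8));
    }

    // Raw byte-level concatenation, with no knowledge of encodings.
    public static byte[] concat(byte[]... parts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] p : parts) {
            out.write(p, 0, p.length);
        }
        return out.toByteArray();
    }

    // Counts U+FEFF occurrences that are not at the very start of the text.
    public static int embeddedBoms(String text) {
        int n = 0;
        for (int i = 1; i < text.length(); i++) {
            if (text.charAt(i) == '\uFEFF') {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        byte[] joined = concat(fileWithBom("var a = 1;\n"),
                               fileWithBom("var b = 2;\n"));
        String decoded = new String(joined, StandardCharsets.UTF_8);
        System.out.println("embedded U+FEFF count: " + embeddedBoms(decoded));
    }
}
```

Java's UTF-8 decoder leaves the BOM in the decoded string, which is why
the helper skips index 0 and counts only the occurrences that a tokenizer
would meet mid-stream.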

Is not a patron, my Lord [Chesterfield],        John Cowan
one who looks with unconcern on a man           http://www.ccil.org/~cowan
struggling for life in the water, and when      cowan at ccil.org
he has reached ground encumbers him with help?
        --Samuel Johnson

