Unicode normalization problem

Boris Zbarsky bzbarsky at mit.edu
Wed Apr 1 23:04:01 UTC 2015


On 4/1/15 6:56 PM, Andrea Giammarchi wrote:
> Why is that

Because those are different things.  The first is a single Unicode 
character that happens to be represented by 2 UTF-16 code units.  The 
second is a pair of Unicode characters that are each represented by one 
UTF-16 code unit, but also happen to form a single grapheme cluster 
(because one of them is a combining character).  To complicate things 
further, there is also a single Unicode character that represents that 
same grapheme cluster....
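
To make the three cases concrete, here is a minimal sketch. The sample strings are assumptions chosen for illustration, not the ones from the original message, and codePoints is a hypothetical helper. The normalize("NFC") call at the end only shows that the two spellings are canonically equivalent; whether normalizing is the right fix depends on the problem being solved.

```javascript
// Hypothetical sample strings (assumptions, not the ones from the original message).
const astral = "\u{1D306}";    // U+1D306, outside the BMP: 1 code point, 2 UTF-16 code units
const decomposed = "n\u0303";  // "n" followed by U+0303 COMBINING TILDE: 2 code points
const precomposed = "\u00F1";  // U+00F1 LATIN SMALL LETTER N WITH TILDE: 1 code point

// Hypothetical helper: list a string's code points in hex.
function codePoints(s) {
  return Array.from(s, (ch) => ch.codePointAt(0).toString(16));
}

console.log(codePoints(astral));      // [ '1d306' ]
console.log(codePoints(decomposed));  // [ '6e', '303' ]
console.log(codePoints(precomposed)); // [ 'f1' ]

// The two spellings are different strings, but canonically equivalent.
console.log(decomposed === precomposed);                  // false
console.log(decomposed.normalize("NFC") === precomposed); // true
```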

String length is the number of UTF-16 code units.

Array.from iterates over Unicode characters (code points).  That explains 
the foo.length and Array.from(foo).length results.
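
A short sketch of the two counts side by side. The strings below are assumed examples, not the ones from the original message, and counts is a hypothetical helper:

```javascript
// Assumed example strings; the originals from the thread are not reproduced here.
const twoUnitChar = "\u{1F600}";  // one character outside the BMP (a surrogate pair)
const combiningPair = "e\u0301";  // "e" + U+0301 COMBINING ACUTE ACCENT

// Hypothetical helper: report both counts for a string.
function counts(s) {
  return {
    codeUnits: s.length,              // UTF-16 code units
    codePoints: Array.from(s).length, // Unicode characters (code points)
  };
}

console.log(counts(twoUnitChar));   // { codeUnits: 2, codePoints: 1 }
console.log(counts(combiningPair)); // { codeUnits: 2, codePoints: 2 }
```

Neither count is the number of grapheme clusters: the second string displays as a single accented letter but is two code points.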

> and how to solve?

Can you clearly explain what problem you are trying to solve?

-Boris

