Unicode normalization problem

Andrea Giammarchi andrea.giammarchi at gmail.com
Wed Apr 1 23:07:51 UTC 2015


```js
// (the definitions of foo and bar are elided in this message;
//  assuming, for illustration, something like:)
var foo = '\u{1D306}'; // one astral code point, two UTF-16 code units
var bar = 'e\u0301';   // two code points forming one grapheme cluster

foo.length;             // 2
Array.from(foo).length; // 1

bar.length;             // 2
Array.from(bar).length; // 2
```

I already know everything you wrote ... now, how do we explain this to JS
users out there, and how do we solve it?
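For the combining-character case, one possible answer is `String.prototype.normalize`, which canonically composes the base letter and the accent into a single code point. A minimal sketch (the value of `bar` here is an assumption, since the original definitions are not in this message):

```js
// "é" written as a base letter plus a combining acute accent
var bar = 'e\u0301';

// NFC (canonical composition) folds the pair into U+00E9
var nfc = bar.normalize('NFC');

bar.length;             // 2
nfc.length;             // 1
Array.from(nfc).length; // 1
```

Note this only helps where a precomposed character exists; for astral characters like `foo` above, counting code points via `Array.from(str).length` (or `[...str].length`) is the relevant tool.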

On Thu, Apr 2, 2015 at 1:04 AM, Boris Zbarsky <bzbarsky at mit.edu> wrote:

> On 4/1/15 6:56 PM, Andrea Giammarchi wrote:
>
>> Why is that
>>
>
> Because those are different things.  The first is a single Unicode
> character that happens to be represented by 2 UTF-16 code units.  The
> second is a pair of Unicode characters that are each represented by one
> UTF-16 code unit, but also happen to form a single grapheme cluster
> (because one of them is a combining character).  To complicate things
> further, there is also a single Unicode character that represents that same
> grapheme cluster....
>
> String length shows the number of UTF-16 code units.
>
> Array.from works on Unicode characters.  That explains the foo.length and
> Array.from(foo).length results.
>
>  and how to solve?
>>
>
> Can you clearly explain what problem you are trying to solve?
>
> -Boris
>
> _______________________________________________
> es-discuss mailing list
> es-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es-discuss
>
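To make the distinction Boris draws concrete, a small sketch (the specific strings are assumptions, not taken from the thread):

```js
// One Unicode code point that needs two UTF-16 code units (a surrogate pair)
var single = '\u{1D306}';

// Two code points that happen to form one grapheme cluster ("é")
var cluster = 'e\u0301';

single.length;              // 2 -- .length counts UTF-16 code units
Array.from(single).length;  // 1 -- Array.from iterates code points

cluster.length;             // 2
Array.from(cluster).length; // 2 -- still two code points, one visible glyph
```

Neither `.length` nor `Array.from` counts grapheme clusters; that requires a segmentation step on top of code-point iteration.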