Unicode normalization problem

Alexander Guinness monolithed at gmail.com
Wed Apr 1 19:17:18 UTC 2015


My reasoning is based on the following example:

```js
var text = '𝐀';

text.length; // 2 (UTF-16 code units)

Array.from(text).length; // 1 (code points)
```
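
To spell out what I mean, here is a minimal sketch. The variable names are mine, and the strings are written as escapes on the assumption that the input is decomposed (a base letter followed by a combining mark), so that an editor cannot silently recompose it. It compares splitting by code unit, iterating by code point, and normalizing first:

```js
// U+0418 (И) followed by U+0306 (combining breve): a decomposed "Й".
var decomposed = '\u0418\u0306';
// U+1D400 (𝐀) lies outside the BMP, so it is stored as a surrogate pair.
var astral = '\u{1D400}';

// split('') cuts between UTF-16 code units and tears the surrogate pair apart.
var unitsAstral = astral.split('').length; // 2

// Array.from (like spread) uses the string iterator, which walks code points.
var pointsAstral = Array.from(astral).length; // 1

// Iterating by code point still does not merge a letter with its combining mark.
var pointsDecomposed = Array.from(decomposed).length; // 2

// normalize() defaults to NFC, which composes the pair into U+0419 (Й).
var pointsComposed = Array.from(decomposed.normalize()).length; // 1
var isComposed = decomposed.normalize() === '\u0419'; // true
```

So `Array.from` already goes one step beyond `split('')` by keeping surrogate pairs together, but it stops at code points and leaves combining sequences split unless the string is normalized first.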

2015-04-01 22:05 GMT+03:00 Rick Waldron <waldron.rick at gmail.com>:

>
>
> On Wed, Apr 1, 2015 at 2:59 PM monolithed <monolithed at gmail.com> wrote:
>
>> ```js
>> var text = 'ЙйЁё';
>>
>> text.split(''); // ["И", "̆", "и", "̆", "Е", "̈", "е", "̈"]
>> ```
>>
>> Possible solutions:
>>
>> 1.
>>
>> ```js
>> text.normalize().split('') // ["Й", "й", "Ё", "ё"]
>> ```
>>
>> I like it, but it is not very convenient.
>>
>> 2.
>>
>> ```js
>> Array.from(text) // ["И", "̆", "и", "̆", "Е", "̈", "е", "̈"]
>> ```
>>
>> 3.
>>
>> ```js
>> [...text] // ["И", "̆", "и", "̆", "Е", "̈", "е", "̈"]
>> ```
>>
>>
>> Should `Array.from(text)` and `[...text]` behave like the first example, and if so, why?
>>
>
> Why would they imply calling `normalize()`? What if that wasn't desired?
>
> Since #1 calls normalize before split(), the actual equivalents would look
> like this:
>
>   Array.from(text.normalize()) // [ "Й", "й", "Ё", "ё" ]
>   [...text.normalize()] // [ "Й", "й", "Ё", "ё" ]
>
> Rick
>