Code points vs Unicode scalar values
mathias at qiwi.be
Wed Sep 4 09:41:14 PDT 2013
On 4 Sep 2013, at 18:34, Brendan Eich <brendan at mozilla.com> wrote:
> Here, from the latest ES6 draft, is String.fromCodePoint ( ...codePoints):
> The String.fromCodePoint function may be called with a variable number of arguments which form the
> rest parameter codePoints. The following steps are taken:
> 1. Assert: codePoints is a well-formed rest parameter object.
> 2. Let length be the result of Get(codePoints, "length").
> 3. Let elements be a new List.
> 4. Let nextIndex be 0.
> 5. Repeat while nextIndex < length
> a. Let next be the result of Get(codePoints, ToString(nextIndex)).
> b. Let nextCP be ToNumber(next).
> c. ReturnIfAbrupt(nextCP).
> d. If SameValue(nextCP, ToInteger(nextCP)) is false, then throw a RangeError exception.
> e. If nextCP < 0 or nextCP > 0x10FFFF, then throw a RangeError exception.
> f. Append the elements of the UTF-16 Encoding (clause 6) of nextCP to the end of elements.
> g. Let nextIndex be nextIndex + 1.
> 6. Return the String value whose elements are, in order, the elements in the List elements. If length is 0, the
> empty string is returned.
> No exposed surrogates here!
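For reference, here is a rough JavaScript sketch of the quoted steps. The name `fromCodePointSketch` is hypothetical and only for illustration, and it assumes `Number.isInteger` is an acceptable stand-in for the `SameValue(nextCP, ToInteger(nextCP))` check (non-finite values get rejected either way). Note that the only checks are "is an integer" and "is in the range 0 to 0x10FFFF"; nothing in the steps looks at the surrogate range.

```javascript
// Hypothetical helper mirroring the quoted draft steps; not the spec text itself.
function fromCodePointSketch(...codePoints) {
  const elements = []; // the List of UTF-16 code units (step 3)
  for (const next of codePoints) {
    const nextCP = Number(next); // step 5.b: ToNumber(next)
    // Step 5.d: reject non-integers (NaN and fractions fail here).
    if (!Number.isInteger(nextCP)) {
      throw new RangeError("Invalid code point " + String(next));
    }
    // Step 5.e: reject anything outside 0..0x10FFFF.
    if (nextCP < 0 || nextCP > 0x10FFFF) {
      throw new RangeError("Invalid code point " + String(next));
    }
    // Step 5.f: append the UTF-16 encoding of nextCP.
    if (nextCP <= 0xFFFF) {
      elements.push(nextCP); // BMP code point: a single code unit
    } else {
      const offset = nextCP - 0x10000;
      elements.push(0xD800 + (offset >> 10));   // lead surrogate
      elements.push(0xDC00 + (offset & 0x3FF)); // trail surrogate
    }
  }
  // Step 6: build the String from the code units ("" if there are none).
  return String.fromCharCode(...elements);
}
```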
I think what Anne means is that, per that algorithm, `String.fromCodePoint(0xD800)` returns `'\uD800'`, which is a lone surrogate (and therefore not a Unicode scalar value).
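A quick way to see the difference, assuming an engine that already implements `String.fromCodePoint`. The helper `isScalarValue` is hypothetical; it just encodes the Unicode definition of a scalar value as any code point outside the surrogate range U+D800 to U+DFFF.

```javascript
// Hypothetical helper: true only for Unicode scalar values,
// i.e. code points in 0..0x10FFFF that are not surrogates.
function isScalarValue(codePoint) {
  return Number.isInteger(codePoint) &&
    codePoint >= 0 && codePoint <= 0x10FFFF &&
    !(codePoint >= 0xD800 && codePoint <= 0xDFFF);
}

// 0xD800 is a valid code point, so no RangeError is thrown...
const lone = String.fromCodePoint(0xD800);
console.log(lone.length);            // 1
console.log(lone === "\uD800");      // true

// ...but it is a surrogate code point, not a scalar value.
console.log(isScalarValue(0xD800));  // false
console.log(isScalarValue(0x1F4A9)); // true
```

So the function accepts code points, not just scalar values, and a caller who wants to rule out lone surrogates would have to do a check like the one above first.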