Code points vs Unicode scalar values

Mathias Bynens mathias at qiwi.be
Wed Sep 11 03:50:07 PDT 2013


On 10 Sep 2013, at 18:30, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

> On Sep 10, 2013, at 12:14 AM, Mathias Bynens wrote:
> 
>> FWIW, here’s a real-world example of a case where this behavior is annoying/unexpected to developers: http://cirw.in/blog/node-unicode
> 
> This suggests to me that the problem is in JSON.stringify's Quote operation.  I can see an argument that Quote should convert all unpaired surrogates to \uXXXX escapes.  I wonder if changing Quote to do this would break anything…

*If* this turns out to be a non-breaking change, it would make sense to have `JSON.stringify` escape all non-ASCII symbols as well as all non-printable ASCII symbols, similar to `jsesc`’s `json` option [1]. This would improve the portability of the serialized data in case it is saved to a misconfigured database, saved to a file with a non-UTF-8 encoding, served to a browser without `charset=utf-8` in the `Content-Type` header, and so on.
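
To make the idea concrete, here is a rough userland sketch of what such ASCII-only output could look like. The name `stringifyAscii` is hypothetical: it is not part of any spec or proposal, and it is not how `jsesc` is implemented. It relies on the assumption that `JSON.stringify`, when called without an indentation argument, only ever emits non-ASCII or non-printable characters inside string literals, so post-processing the whole output is safe.

```javascript
// Hypothetical helper (an illustration only, not a proposed API):
// post-process JSON.stringify output so that it contains nothing but
// printable ASCII.
function stringifyAscii(value) {
  const json = JSON.stringify(value);
  // JSON.stringify returns undefined for e.g. functions or undefined itself.
  if (json === undefined) {
    return undefined;
  }
  // Without the `u` flag the regex matches individual UTF-16 code units,
  // so astral symbols become two \uXXXX escapes, and an unpaired
  // surrogate becomes a single \uXXXX escape.
  return json.replace(/[^\x20-\x7E]/g, function (codeUnit) {
    const hex = codeUnit.charCodeAt(0).toString(16);
    return '\\u' + hex.padStart(4, '0');
  });
}

// Usage: the result is plain ASCII and still round-trips via JSON.parse.
const escaped = stringifyAscii({ 'café': 'I \u2665 JSON' });
console.log(escaped); // {"caf\u00e9":"I \u2665 JSON"}
console.log(JSON.parse(escaped)['café'] === 'I \u2665 JSON'); // true
```

Because every escape is an ordinary JSON `\uXXXX` sequence, any conforming parser reads the output back to the original string, regardless of the encoding the bytes were stored or served in.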

[1] http://mths.be/jsesc#json

