Q: Lonely surrogates and unicode regexps

Allen Wirfs-Brock allen at wirfs-brock.com
Wed Jan 28 08:14:13 PST 2015


On Jan 28, 2015, at 5:26 AM, Mark Davis ☕️ <mark at macchiato.com> wrote:

> I think the cleanest mental model is where UTF-16 or UTF-8 strings are interpreted as if they were transformed into UTF-32.

This is exactly the approach used in the ES6 spec (except that it doesn’t deal with UTF-8).
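
To make that model concrete, here is a minimal sketch of my own, not text from the spec. The helper name toCodePoints is hypothetical. It decodes UTF-16 code units into code points and, as an assumption matching how unicode-mode regexps behave, lets a lonely surrogate stand for itself as a code point:

```javascript
// Hypothetical helper (not a built-in): view a string's UTF-16 code units
// as a sequence of code points. A lead surrogate followed by a trail
// surrogate becomes one supplementary code point; a lonely surrogate is
// kept as a code point with its own value.
function toCodePoints(str) {
  const out = [];
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    const next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
    const isLead = unit >= 0xD800 && unit <= 0xDBFF;
    const nextIsTrail = next >= 0xDC00 && next <= 0xDFFF;
    if (isLead && nextIsTrail) {
      out.push((unit - 0xD800) * 0x400 + (next - 0xDC00) + 0x10000);
      i++; // the trail unit has been consumed
    } else {
      out.push(unit);
    }
  }
  return out;
}

const paired = "\uD83D\uDE00"; // one code point, U+1F600
const lonely = "\uD83D";       // a lead surrogate with no trail

// With the u flag, "." matches one code point; without it, one code unit.
const dotUnicode = /^.$/u.test(paired);
const dotLegacy = /^.$/.test(paired);

// With the u flag, a lonely surrogate in the pattern does not match
// half of a well-formed pair in the input.
const halfUnicode = /\uD83D/u.test(paired);
const halfLegacy = /\uD83D/.test(paired);
```

Under this model the pattern and the input are both sequences of code points, which is why the lonely lead surrogate cannot match inside the pair when the u flag is set.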

> 
> While that is generally feasible, it often represents a cost in performance which is not acceptable in practice. So you see various approaches that involve some deviation from that mental model.

While ES6 uses this approach in its specification, implementations are free to use any technique that produces the same result.
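
As a rough illustration of that freedom, the sketch below counts code points by scanning UTF-16 code units in place, without building a UTF-32 copy, and compares the result with the code point view given by the built-in string iterator. The helper name countCodePoints and the sample strings are hypothetical; this is not how any particular engine is known to work:

```javascript
// Hypothetical helper: count code points directly on UTF-16 code units.
// No intermediate array of code points is allocated.
function countCodePoints(str) {
  let count = 0;
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit >= 0xD800 && unit <= 0xDBFF && i + 1 < str.length) {
      const next = str.charCodeAt(i + 1);
      // Skip the trail unit of a well-formed surrogate pair.
      if (next >= 0xDC00 && next <= 0xDFFF) i++;
    }
    count++; // pairs and lonely surrogates each count once
  }
  return count;
}

// Sample inputs, including lonely surrogates in both orders.
const samples = ["abc", "\uD83D\uDE00", "\uD83D", "\uDE00\uD83D", "a\uD83D\uDE00b"];

// Array.from iterates a string by code point, yielding a lonely
// surrogate as its own element, so it serves as the reference model.
const agrees = samples.every(s => countCodePoints(s) === Array.from(s).length);
```

The two approaches differ only in technique; the observable result is the same for every sample.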

Allen

