Full Unicode strings strawman

Mike Samuel mikesamuel at gmail.com
Mon May 16 12:29:40 PDT 2011

2011/5/16 Shawn Steele <Shawn.Steele at microsoft.com>:
>> > myString.replace( /[\ud800-\udbff](?![\udc00-\uffff])/g, "\ufffd")
>> >    .replace( /(^|[^\ud800-\udbff])([\udc00-\udffff])/g, "\ufffd")

My example code has typos.  It should have read

    myString.replace( /[\ud800-\udbff](?![\udc00-\udfff])/g, "\ufffd")
        .replace( /(^|[^\ud800-\udbff])([\udc00-\udfff])/g, "$1\ufffd")
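
As a rough sketch of how these two passes behave on today's UTF-16 strings, the snippet below wraps them in a function. The name replaceLoneSurrogates is hypothetical and only for illustration. The second replacement writes "$1\ufffd" so that the code unit captured just before an unpaired trail surrogate is kept rather than swallowed by the match.

```javascript
// Hypothetical helper, not part of any proposal: replaces unpaired
// surrogate code units with U+FFFD using two regular-expression passes.
function replaceLoneSurrogates(myString) {
  return myString
    // Pass 1: a lead surrogate that is NOT followed by a trail surrogate.
    .replace(/[\ud800-\udbff](?![\udc00-\udfff])/g, "\ufffd")
    // Pass 2: a trail surrogate that is NOT preceded by a lead surrogate.
    // Group 1 is the preceding code unit (or start of string); "$1" keeps it.
    .replace(/(^|[^\ud800-\udbff])([\udc00-\udfff])/g, "$1\ufffd");
}

// A well-formed pair (U+1F600) is left alone.
console.log(replaceLoneSurrogates("a\ud83d\ude00b") === "a\ud83d\ude00b"); // true
// A lone lead surrogate is replaced.
console.log(replaceLoneSurrogates("a\ud83db") === "a\ufffdb"); // true
// A lone trail surrogate is replaced, and the "a" before it survives.
console.log(replaceLoneSurrogates("a\ude00b") === "a\ufffdb"); // true
```

One limitation of this approach: because each match in the second pass consumes the code unit before the trail surrogate, a run of two or more consecutive unpaired trail surrogates is only partly replaced.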

>> Exactly as it currently does, assuming it was applied to a string that didn't contain any codepoints greater than \uffff.
>> If the string contained any codepoints > \uffff, those characters would not match the pattern and would not be replaced.
> Isn't that breaking?  I'm not sure how you can treat these characters distinctly as some code point from d800-dfff sometimes and as a codepoint > 0xffff at other times.
> -Shawn
