How to count the number of symbols in a string?

David Bruant bruant.d at gmail.com
Tue Dec 4 11:43:29 PST 2012


On 04/12/2012 20:25, Jason Orendorff wrote:
> On Sat, Dec 1, 2012 at 2:09 AM, Mathias Bynens <mathias at qiwi.be>
> wrote:
>
>     On 30 Nov 2012, at 22:50, Norbert Lindenberg
>     <ecmascript at norbertlindenberg.com> wrote:
>
>     > There's nothing in the proposal yet because I intentionally kept
>     > it small. It's always possible to add functionality, but we need
>     > some evidence that it will be widely used.
>
>     My guess would be that in 99% of all cases where
>     `String.prototype.length` is used the intention is to count the
>     code points, not the UCS-2/UTF-16 code units.
>
>
> I don't think this is right. My guess is that in most cases where it 
> matters either way, the intention is to get a count that's consistent 
> with .charAt(), .indexOf(), .slice(), RegExp match.index, and every 
> other place where string indexes are used.
I think Twitter has a bug, as mentioned earlier in the thread, and it is 
unrelated to consistency with the methods you mention.
I agree, however, that if something is added to get the actual length, a 
whole set of matching methods needs to be added too.

> That said, of course this is a sensible feature to add; but calling it 
> ".realLength" wouldn't help anyone understand the rather fine 
> distinction at issue.
Maybe the solution lies in finding the right prefix to define .*length, 
.*charAt(), .*indexOf(), etc. Maybe "CP" for "code points": .CPlength? 
Or .cpLength/.cpCharAt/.cpIndexOf...?
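
To make the idea concrete, here is a rough sketch of what such 
code-point-based counterparts could look like, written as standalone 
functions. The names cpLength, cpCharAt and cpIndexOf are hypothetical 
(they only follow the prefix floated above), and the helper 
isSurrogatePairAt is likewise made up for illustration. None of this is 
part of any proposal.

```javascript
// Hypothetical helpers: the "cp" names are illustrative only and are not
// part of any standard or proposal.

// True if the code units at index i and i + 1 form a surrogate pair.
function isSurrogatePairAt(str, i) {
  var hi = str.charCodeAt(i);
  var lo = str.charCodeAt(i + 1); // NaN past the end, so the test fails
  return hi >= 0xD800 && hi <= 0xDBFF && lo >= 0xDC00 && lo <= 0xDFFF;
}

// Number of code points: a surrogate pair counts once, and a lone
// surrogate counts as one code point.
function cpLength(str) {
  var count = 0;
  for (var i = 0; i < str.length; i++) {
    if (isSurrogatePairAt(str, i)) i++; // skip the low surrogate
    count++;
  }
  return count;
}

// The symbol at a code point index, or "" if the index is out of range.
function cpCharAt(str, cpIndex) {
  var count = 0;
  for (var i = 0; i < str.length; i++) {
    var width = isSurrogatePairAt(str, i) ? 2 : 1;
    if (count === cpIndex) return str.slice(i, i + width);
    i += width - 1;
    count++;
  }
  return "";
}

// Code point index of the first match, or -1. Converts the code unit
// index from indexOf by counting the code points that precede it.
function cpIndexOf(str, search) {
  var unitIndex = str.indexOf(search);
  return unitIndex < 0 ? -1 : cpLength(str.slice(0, unitIndex));
}
```

With s = "a\uD83D\uDCA9b" (a, U+1F4A9, b), s.length is 4 and 
s.indexOf("b") is 3, while cpLength(s) is 3 and cpIndexOf(s, "b") is 2. 
The two families of indexes are each self-consistent but cannot be 
mixed, which is why the whole set would have to come together.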

While you're talking about regexps, I think there is an issue with 
current RegExps. Mathias will know better. Could a new flag solve the issue?
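
As a small illustration of the kind of problem I have in mind 
(assuming the issue is that patterns operate on UTF-16 code units, 
which Mathias may correct), here is how a flagless regexp treats a 
symbol outside the BMP. The variable names are only for the example.

```javascript
// U+1F4A9 is one code point but two UTF-16 code units.
var astral = "\uD83D\uDCA9";

// "." matches a single code unit, so it cannot cover the whole symbol.
var oneDotMatches = /^.$/.test(astral);   // false
var twoDotsMatch = /^..$/.test(astral);   // true

// Inside a character class the two surrogates are separate members, so
// the class also matches a lone low surrogate.
var classMatchesHalf = /^[\uD83D\uDCA9]$/.test("\uDCA9"); // true
```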

David