RegExp.prototype.count

Isiah Meadows isiahmeadows at gmail.com
Sun Jan 20 06:22:30 UTC 2019


Nit: you should use `.split(/\n/g)` to get all parts.

I like the benchmarks here. That's much appreciated, and after further
investigation, I found a *giant* WTF:
https://jsperf.com/regexp-counting-2/8

TL;DR: for string character counting, prefer `indexOf`.
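
A minimal sketch of what I mean by the `indexOf` approach, assuming the
needle is a fixed string such as `"\n"`. `countOccurrences` is a
hypothetical helper name for illustration, not something from the
benchmark page:

```js
// Hypothetical helper: counts non-overlapping occurrences of a fixed
// substring by calling indexOf again from the end of the previous hit.
function countOccurrences(str, needle) {
  if (needle.length === 0) return 0 // an empty needle would never advance
  var count = 0
  var pos = str.indexOf(needle)
  while (pos !== -1) {
    count++
    pos = str.indexOf(needle, pos + needle.length)
  }
  return count
}

console.log(countOccurrences("a\nb\nc\n", "\n")) // 3
```

This only covers fixed strings; anything that needs a real pattern (like
`\r\n?`) still has to go through a regexp.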

For reasons similar to that JSPerf result, I'd like it to be on the
String prototype rather than the RegExp prototype, as in
`str.count(/\n/)`.
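
Roughly, and only as a sketch of the semantics I have in mind (nothing
here is specified anywhere), it could behave like the standalone
function below. `countMatches` is a hypothetical stand-in for the
proposed method, and copying the regexp with the `g` flag is my own
assumption about how a non-global pattern should be treated:

```js
// Hypothetical stand-in for the proposed `str.count(re)`; not a real API.
function countMatches(str, re) {
  // Copy the regexp with the global flag so the caller's lastIndex is untouched.
  var flags = re.flags.indexOf("g") === -1 ? re.flags + "g" : re.flags
  var copy = new RegExp(re.source, flags)
  var count = 0
  var match
  while ((match = copy.exec(str)) !== null) {
    count++
    // Step past zero-length matches so the loop always terminates.
    if (match[0] === "") copy.lastIndex++
  }
  return count
}

console.log(countMatches("a\nb\r\nc\r", /\n|\r\n?/)) // 3
```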

-----

Isiah Meadows
contact at isiahmeadows.com
www.isiahmeadows.com

On Sun, Jan 20, 2019 at 12:33 AM kai zhu <kaizhu256 at gmail.com> wrote:
>
> benchmarked @isiah’s while-loop test-case vs str.split vs str.replace for regexp counting on jsperf.com [1], and the results were surprising (for me).
>
> benchmarks using 1mb random ascii-string from fastest to slowest.
> 1. (fastest - 1,700 runs/sec) regexp-counting with ```largeCode.split(/\n/).length - 1```
> 2. (40% slower - 1,000 runs/sec) regexp-counting with ```while-loop (/\n/g)```
> 3. (60% slower - 700 runs/sec) regexp-counting with ```largeCode.replace((/[^\n]+/g), "").length```
>
> looks like the go-to design-pattern for counting-regexp is ```str.split(<regexp>).length - 1```
>
> [1] regexp counting 2
> https://jsperf.com/regexp-counting-2
>
> On 13 Jan 2019, at 9:15 PM, Isiah Meadows <isiahmeadows at gmail.com> wrote:
>
> If performance is an issue, regular expressions are likely to be too slow to begin with. But you could always do this to count the number of lines in a particular string:
>
> ```js
> var count = 0
> var re = /\n|\r\n?/g
> while (re.test(str)) count++
> console.log(count)
> ```
>
> Given it's already this easy to iterate something with a regexp, I'm not convinced it's necessary to add this property/method.
>
> On Sat, Jan 12, 2019 at 17:29 kai zhu <kaizhu256 at gmail.com> wrote:
>>
>> a common use-case i have is counting newlines in largish (> 200kb) embedded-js files, like this real-world example [1].  ultimately meant for line-number-preservation purposes in auto-lint/auto-prettify tasks (which have been getting slower due to complexity).
>>
>> would a new RegExp count-method like ```(/\n/g).count(largeCode)``` be significantly more efficient than existing ```largeCode.split("\n").length - 1``` or ```largeCode.replace((/[^\n]+/g), "").length```?
>>
>> -kai
>>
>> [1] calculating and reproducing line-number offsets when linting/autofixing files
>> https://github.com/kaizhu256/node-utility2/blob/2018.12.30/lib.jslint.js#L7377
>> https://github.com/kaizhu256/node-utility2/blob/2018.12.30/lib.jslint.js#L7586
>>
>> _______________________________________________
>> es-discuss mailing list
>> es-discuss at mozilla.org
>> https://mail.mozilla.org/listinfo/es-discuss
>
>

