Regex on substrings

Peter van der Zee ecma at qfox.nl
Thu Jun 2 04:29:58 PDT 2011


A problem I faced recently is the inability to apply a regular
expression to a substring of a string without explicitly taking the
substring first. So I'm wondering how much trouble it would be to
extend the RegExp API like this:

RegExp.prototype.test = function(string[, start=0[, stop=string.length]]){ };
RegExp.prototype.exec = function(string[, start=0[, stop=string.length]]){ };

The regular expression would only be applied to that part of the
string. It'd be (almost) the same as
regex.test(string.substring(start, stop)), except the substringing is
handled internally.
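
For illustration, here is a minimal sketch of what the proposed
behaviour could look like if emulated in today's ES. The names
testRange and execRange are hypothetical helpers, not part of any
existing or proposed API, and the sketch still takes the substring
explicitly, which is exactly the step the proposal wants the engine
to handle internally.

```javascript
// Hypothetical helpers that emulate the proposed extra arguments.
// They assume a regex without the global flag, so that lastIndex
// does not carry state between calls.
function testRange(regex, string, start = 0, stop = string.length) {
  // Only the characters in [start, stop) are visible to the regex.
  return regex.test(string.substring(start, stop));
}

function execRange(regex, string, start = 0, stop = string.length) {
  // Note: the index on the result is relative to the substring,
  // not to the original string.
  return regex.exec(string.substring(start, stop));
}

// Usage
testRange(/^foo/, "xxfoo", 2);        // true: "foo" starts at position 2
testRange(/^foo/, "xxfoo");           // false: the full string starts with "xx"
execRange(/o+/, "xxfoo bar", 2, 5);   // matches "oo" within "foo"
```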

I can't think of any backward compatibility issues for this change.

My use case is that I have a set of words I want to find in a certain
input string, but almost always starting at pos>0. Right now I have to
take a substring as long as the longest possible match (or the
remainder of the input) and check the result of exec/match to see the
length of the match, if any. The alternative is to compile a regular
expression that "skips" the first n characters.
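
The two workarounds described above can be sketched roughly as
follows. The word list, the function names and the maxLen parameter
are all made up for the example; this is only one way to write it.

```javascript
// Hypothetical word set; any alternation would do.
const WORDS = "function|var|if";
const wordsAtStart = new RegExp("^(?:" + WORDS + ")");

// Workaround 1: take a substring no longer than the longest word,
// then read the match length from the exec result.
function matchLengthBySubstring(input, pos, maxLen) {
  // substring clamps to the end of the input, which covers the
  // "remainder of the input" case.
  const slice = input.substring(pos, pos + maxLen);
  const m = wordsAtStart.exec(slice);
  return m ? m[0].length : 0;
}

// Workaround 2: compile a regex that skips the first n characters.
// A new regex is needed for every distinct position.
function matchLengthBySkipping(input, n) {
  const skipping = new RegExp("^[\\s\\S]{" + n + "}(?:" + WORDS + ")");
  const m = skipping.exec(input);
  // The match includes the skipped prefix, so subtract it.
  return m ? m[0].length - n : 0;
}

// Usage
const input = "x = function(){}";
matchLengthBySubstring(input, 4, 8);  // 8, the length of "function"
matchLengthBySkipping(input, 4);      // 8
```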

Optionally, it might be handy to have .test return a number
indicating the length of the (first) match. If zero, there was no
match. This would, however, break scripts that explicitly check for
=== false.
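
A rough sketch of that optional variant, written as a separate
hypothetical function (matchLength) rather than as a change to .test
itself, shows both the convenience and the compatibility problem:

```javascript
// Hypothetical: returns the length of the first match, or 0 if none.
// Assumes a regex without the global flag.
function matchLength(regex, string) {
  const m = regex.exec(string);
  return m ? m[0].length : 0;
}

matchLength(/o+/, "foo");            // 2
matchLength(/x/, "foo");             // 0, meaning "no match"
matchLength(/x/, "foo") === false;   // false: a strict check no longer works

// One caveat with this scheme: a successful zero-length match is
// indistinguishable from no match at all.
matchLength(/a*/, "b");              // 0, although /a*/.test("b") is true
```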

It seems to me like applying regular expressions to substrings could
be optimized internally much better (using pointers) than having to do
a substring in ES every time. This should speed up the parsing of
input, for instance.

- peter

