Regexp APIs and capturing group positions

Mark Macdonald mamacdon at gmail.com
Thu Jul 12 11:26:04 PDT 2012


In ES 5.1, the regular expression APIs do not expose the index at which a
capturing group matched. The RegExp.prototype.exec(string) function returns
an Array giving (among other things) the *text* matched by capturing
groups, but does not give the *positions* of the captured text within the
input string.

For example, consider this code using the current regex APIs:

var match = /(fox).*(dog)/.exec("The quick brown fox jumps over the lazy dog");
match[1]; // "fox"
match[2]; // "dog"

We want to get this:
"fox" at index 16
"dog" at index 40

But there is no way to obtain the indices 16, 40 from the match object (or
any other API I'm aware of). This makes it hard to write something like a
regex coach, which takes an arbitrary regular expression and input string,
and outputs a highlighted version of the input string showing where the
capturing groups matched.
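
To illustrate why guessing the positions does not work, here is a minimal
sketch of one tempting workaround: searching for each group's text with
indexOf, starting from match.index. The "banana" input is just an
illustrative choice. The workaround goes wrong as soon as the captured text
also occurs earlier in the input than where the group actually matched.

```javascript
var input = "banana";
var match = /(a).*(a)/.exec(input);

match[0];    // "anana"
match.index; // 1 (start of the whole match only)

// Workaround: look up each group's text, starting at the match start.
var guess1 = input.indexOf(match[1], match.index); // 1, happens to be right
var guess2 = input.indexOf(match[2], match.index); // 1, but group 2 matched at 5
```

Both groups captured the same text, "a", so the search cannot tell them
apart; only the regex engine knows where each group really matched.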

Proposal: When RegExp.prototype.exec(string) returns a non-null value, the
returned object shall have a property named "captures", which is an Array.
The value of captures[n] is the index at which the n'th capturing group's
match begins. As usual, groups are numbered from 1. The captures array does
not have a "0" property (it would always be equal to the "index" property
of the match object, and thus redundant).

Proposed code:

var match = /(fox).*(dog)/.exec("The quick brown fox jumps over the lazy dog");
match.captures[1]; // 16
match.captures[2]; // 40

This (combined with the group text from the match object) gives you enough
information to enumerate the captured regions of the input string.
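
As a rough sketch of how a regex coach might use this, the helper below
lists the region covered by each group. The name capturedRegions is
hypothetical, and since no engine provides "captures" today, the property is
filled in by hand with the indices from the example. Skipping groups that
did not participate in the match is an assumption; the proposal does not say
what captures[n] would hold in that case.

```javascript
// Hypothetical helper: list the input regions covered by each capturing group.
// Assumes match.captures exists as described in the proposal.
function capturedRegions(match) {
  var regions = [];
  for (var n = 1; n < match.length; n++) {
    // Assumption: skip groups that did not participate in the match.
    if (match[n] === undefined) continue;
    var start = match.captures[n];
    regions.push({
      group: n,
      text: match[n],
      start: start,                  // index of the first captured character
      end: start + match[n].length   // index just past the last one
    });
  }
  return regions;
}

var input = "The quick brown fox jumps over the lazy dog";
var match = /(fox).*(dog)/.exec(input);

// Simulate the proposed property by hand (no "0" entry, as in the proposal).
match.captures = [];
match.captures[1] = 16;
match.captures[2] = 40;

var regions = capturedRegions(match);
// regions[0]: group 1, "fox", start 16, end 19
// regions[1]: group 2, "dog", start 40, end 43
```

Each region's start and end can be passed to input.slice(start, end) to
recover the text to highlight.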

Prior art: Java's java.util.regex.Matcher.start() [1], Python's
re.MatchObject.start() [2].

Comments, suggestions?

Mark

[1] http://docs.oracle.com/javase/1.5.0/docs/api/java/util/regex/Matcher.html#start%28int%29
[2] http://docs.python.org/library/re.html#match-objects