Regexp capturing groups.
marjar-4 at student.ltu.se
Thu Sep 4 07:02:19 PDT 2008
When I first noticed this in Firefox I thought it was a bug. After some
investigation it turned out that the problem is in the specification.
What I am talking about is that ES resets the capturing groups inside a
quantified group each time the group repeats. I don't know of any Regexp
engine not based on the ECMA-262 standard that behaves like this.
A simple example:
/(?:(a)|(b))*/.exec("ababa") -> ["ababa", "a", ""]
It recognizes each letter in turn, but when it is time to match the next
one it discards the result of the last repetition.
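A minimal sketch of the same behavior, for anyone who wants to try it in a console. The variable names m and n are only for illustration. Note that an engine following ECMA-262 reports a group that did not participate in the final repetition as undefined rather than as an empty string.

```javascript
// Each repetition of the quantified group resets the captures inside it,
// so only the groups set by the final repetition survive.
var m = /(?:(a)|(b))*/.exec("ababa");

console.log(m[0]); // "ababa"
console.log(m[1]); // "a" (set by the last repetition)
console.log(m[2]); // undefined (group 2 took no part in the last repetition)

// With an input that ends in "b" the roles swap.
var n = /(?:(a)|(b))*/.exec("abab");

console.log(n[1]); // undefined
console.log(n[2]); // "b"
```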
A little more practical example, URL query key/value matching:
var match =
var value1 = match[1];
var value2 = match[2];
On most other Regexp engines this would store the value after key1 in
group 1 and the value after key2 in group 2, independent of their order
in the input string. On ECMA-262 based engines, only the value captured
by the last repetition is kept.
The same technique could be applied to attributes in HTML-tags.
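To make the effect concrete, here is a sketch of the single-Regexp approach. The pattern and the URLs are hypothetical, written for this illustration and not taken from the original post; they only assume a query string made up of key1 and key2 pairs.

```javascript
// Hypothetical pattern: one quantified group that tries key1, then key2,
// for every "&"-separated pair in the query string.
var pattern = /\/thread\.php\?(?:(?:key1=([^&#]*)|key2=([^&#]*))&?)*/;

// Hypothetical input with key1 first and key2 last.
var match = pattern.exec("/thread.php?key1=foo&key2=bar");
var value1 = match[1];
var value2 = match[2];

console.log(value1); // undefined (reset when the group repeated for key2)
console.log(value2); // "bar"

// Swapping the order of the keys swaps which value survives.
var swapped = pattern.exec("/thread.php?key2=bar&key1=foo");

console.log(swapped[1]); // "foo"
console.log(swapped[2]); // undefined
```

Which value is lost depends only on which key comes last in the input, which is what makes the single-Regexp approach unusable here.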
To get this to work with ECMA-262 based engines, you could first pick
out the query-string with one Regexp, and then look for each key in turn.
var query = /\/thread\.php(\?[^#]*)/.exec(url)[1];
var value1 = /[&?]key1=([^&#]*)/.exec(query)[1];
var value2 = /[&?]key2=([^&#]*)/.exec(query)[1];
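The two-step lookup can also be wrapped in a small helper. This is only a sketch: getQueryValue is a hypothetical name, and returning null for a missing key or a missing query string is an assumption made here, since exec returns null when nothing matches and indexing that result directly would throw.

```javascript
// Hypothetical helper: returns the value stored under `key` in the query
// string of `url`, or null if there is no query string or no such key.
function getQueryValue(url, key) {
  // Step 1: isolate the query string (from "?" up to, not including, "#").
  var query = /\?[^#]*/.exec(url);
  if (query === null) return null;

  // Step 2: look for this one key. Regexp metacharacters in the key are
  // escaped so that it is matched literally.
  var escaped = key.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  var m = new RegExp("[&?]" + escaped + "=([^&#]*)").exec(query[0]);
  return m === null ? null : m[1];
}

console.log(getQueryValue("/thread.php?key2=b&key1=a#top", "key1")); // "a"
console.log(getQueryValue("/thread.php?key2=b&key1=a#top", "key2")); // "b"
console.log(getQueryValue("/thread.php?key2=b&key1=a#top", "key3")); // null
```

Because each key gets its own Regexp with no repetition around the capturing group, the result does not depend on the order of the keys.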
I don't know if any web application depends on this behavior, but I
wouldn't write any code that did.
This problem, and that of back-references to non-participating groups,
have been discussed on this list before, but nothing seems to have come
out of it.