RegExp lookbehind

Lasse Reichstein reichsteinatwork at gmail.com
Sat Mar 17 16:32:44 PDT 2012


On Sat, Mar 17, 2012 at 2:28 PM, Andrea Giammarchi
<andrea.giammarchi at gmail.com> wrote:
> I would simply apply same logic we have already for the look ahead ... or
> you think that would cause problems?

I'm not sure it even makes sense.

ES RegExps are backtracking based, and it makes a difference in which
order alternatives are tried. Greedy matching is defined in terms of
number of repetitions, not length of the match. All of these are
defined in a way that assumes left-to-right matching.
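
A small sketch of those two points, runnable in any ES engine (the variable names are only illustrative): the first alternative that works wins even when a later one would match more, and a greedy quantifier maximizes repetitions in the order they are tried, not the total length.

```javascript
// Alternatives are tried left to right; "a" succeeds first, so "ab" is never tried.
const firstAlternative = /a|ab/.exec("ab")[0];

// Greedy + repeats as often as it can, trying "aa" before "aaa" each time.
// On five a's that gives aa, aa and then stops, although aa + aaa would be longer.
const greedyRun = /(?:aa|aaa)+/.exec("aaaaa")[0];

console.log(firstAlternative);  // "a"
console.log(greedyRun.length);  // 4
```

Nothing forces the engine to backtrack here, since the overall match already succeeded with four characters.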

Example:
 Take the RegExp  /(?<((?:aa|aaa)+))b/, where (?< ... ) delimits the
 look-behind, and try matching it against the string "xaaaaaaaaab".
 Then tell me how many a's are captured by the capturing group, and why :)
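
For reference, ES2018 later standardized look-behind with the syntax (?<= ... ) and defined it by matching backwards from the current position. Assuming an engine with that support (this is not the (?< ... ) notation used above), the example can be tried directly:

```javascript
// Assumes ES2018 lookbehind support; (?<= ... ) replaces the (?< ... ) notation above.
const subject = "x" + "a".repeat(9) + "b";
const behind = /(?<=((?:aa|aaa)+))b/.exec(subject);

// Matching backwards from the "b", "aa" is tried first and succeeds four times.
// That leaves "xa" to the left, which neither alternative matches, so the
// group stops there with eight a's instead of nine.
console.log(behind.index);      // 10
console.log(behind[1].length);  // 8
```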

The most "intuitive" interpretation would be a reverse implementation
of the normal matching algorithm, i.e., "backwards matching", but that
would likely duplicate the entire RegExp semantics (or parameterize it
by a direction).
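
As a rough illustration of "parameterize it by a direction", here is a toy backtracking matcher for just the pattern shape (?:alt1|alt2|...)+ over literal alternatives. It is a sketch under my own assumptions, not the spec algorithm; matchLiteral, plusOfAlternatives and the continuation k are hypothetical names.

```javascript
// dir is +1 for forward matching, -1 for backward matching.
// Returns the new position, or -1 if the literal does not match there.
function matchLiteral(str, pos, lit, dir) {
  const start = dir === 1 ? pos : pos - lit.length;
  if (start < 0 || start + lit.length > str.length) return -1;
  return str.startsWith(lit, start) ? pos + dir * lit.length : -1;
}

// Greedy (?:alts)+ followed by the continuation k(pos), which returns
// a final position or null. Alternatives are tried in order; after each
// repetition another repetition is tried before the continuation.
function plusOfAlternatives(str, pos, alts, dir, k) {
  for (const alt of alts) {
    const next = matchLiteral(str, pos, alt, dir);
    if (next === -1) continue;
    const more = plusOfAlternatives(str, next, alts, dir, k);
    if (more !== null) return more;
    const done = k(next);
    if (done !== null) return done;
  }
  return null;
}

const text = "xaaaaaaaaab";
const alts = ["aa", "aaa"];

// Forward from index 1, the continuation requires a "b" at the end position.
const forwardEnd = plusOfAlternatives(text, 1, alts, 1,
  (p) => (text[p] === "b" ? p : null));

// Backward from index 10 (just before the "b"), any stopping point is accepted.
const backwardEnd = plusOfAlternatives(text, 10, alts, -1, (p) => p);

console.log(forwardEnd - 1);    // 9 a's consumed going forward
console.log(10 - backwardEnd);  // 8 a's consumed going backward
```

The same rules run in the two directions cover different substrings, because the forward run is forced to backtrack by the trailing "b" while the backward run is not.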

Any attempt to use the normal (forward) semantics and then try to find
an earlier point to start it at is likely to be either flawed or
effectively unpredictable to users.

And you will probably never get /(<re>)$/ and /(?<(<re>))$/ to
always capture the same substring :)
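
A quick check of that claim, again assuming an engine with the later ES2018 (?<= ... ) look-behind, which matches backwards, and using (?:aa|aaa)+ as the sample <re>:

```javascript
// Assumes ES2018 lookbehind support; the sample pattern is only an illustration.
const tail = "x" + "a".repeat(9);

// Forward: four times "aa" stops one short of the end, so the engine
// backtracks and swaps the last "aa" for "aaa" to reach $.
const viaForward = /((?:aa|aaa)+)$/.exec(tail)[1];

// Backward from the end: four times "aa" already succeeds, so nothing backtracks.
const viaBehind = /(?<=((?:aa|aaa)+))$/.exec(tail)[1];

console.log(viaForward.length);  // 9
console.log(viaBehind.length);   // 8
```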

/L

