Backward running version look-behinds (Re: Look-behind proposal in trouble)

Claude Pache claude.pache at gmail.com
Thu Nov 19 19:58:41 UTC 2015


> Le 19 nov. 2015 à 20:17, Nozomu Katō <noz.ka at akenotsuki.com> a écrit :
> 
> On Thu, 19 Nov 2015 09:04:41 +0100, Yang Guo wrote:
>> This implementation supports variable length lookbehind similar to .NET's
>> semantics. It does so by emitting code to read backwards inside the
>> lookbehind. The size of the change without platform ports and tests is
>> about 600 lines.
> 
> When regular expressions inside a look-behind are evaluated backwards,
> the automaton will not have a chance to know what has been captured in
> $1 in /(?<=(.)$1)/, since the automaton reaches $1 prior to (.).
> 
> This can be fixed by defining that a backreference referring to a
> capture inside the same look-behind has to be put leftwards contrary to
> the usual rule, but saying so would impose reading regular expressions
> backwards or prefetching forward expressions inside a look-behind upon
> the parser, to know if the corresponding capturing parentheses exist.
> 
> I am not sure if it is easy or there is a simpler solution, anyhow, if
> ECMAScript supports .NET-like look-behinds this matter would need to be
> fixed.
> 
> Nozomu

There is always the coward solution: forbid backreferences inside lookbehinds.

If we are going to ask programmers to write /(?<\1(.))/, it may be a good idea to prevent footguns like /(?<(.)\1)/ by throwing a syntax error (for it can be statically determined that it will not work as intended).

—Claude



More information about the es-discuss mailing list