Backward running version look-behinds (Re: Look-behind proposal in trouble)
claude.pache at gmail.com
Thu Nov 19 19:58:41 UTC 2015
> Le 19 nov. 2015 à 20:17, Nozomu Katō <noz.ka at akenotsuki.com> a écrit :
> On Thu, 19 Nov 2015 09:04:41 +0100, Yang Guo wrote:
>> This implementation supports variable length lookbehind similar to .NET's
>> semantics. It does so by emitting code to read backwards inside the
>> lookbehind. The size of the change without platform ports and tests is
>> about 600 lines.
> When regular expressions inside a look-behind are evaluated backwards,
> the automaton will not have a chance to know what has been captured in
> $1 in /(?<=(.)$1)/, since the automaton reaches $1 prior to (.).
> This can be fixed by defining that a backreference referring to a
> capture inside the same look-behind has to be put leftwards contrary to
> the usual rule, but saying so would impose reading regular expressions
> backwards or prefetching forward expressions inside a look-behind upon
> the parser, to know if the corresponding capturing parentheses exist.
> I am not sure if it is easy or there is a simpler solution, anyhow, if
> ECMAScript supports .NET-like look-behinds this matter would need to be
There is always the coward solution: forbid backreferences inside lookbehinds.
If we are going to ask programmers to write /(?<\1(.))/, it may be a good idea to prevent footguns like /(?<(.)\1)/ by throwing a syntax error (for it can be statically determined that it will not work as intended).
More information about the es-discuss