Look-behind proposal in trouble

Erik Corry erik.corry at gmail.com
Wed Oct 7 09:16:54 UTC 2015

The proposal needs to be clarified to explain that you are stepping back a
number of code points, not code units.  This implies that you are inspecting
the input string as you step backwards.  It should also explain what to do
if there are unpaired surrogates in the input string or inside the
lookbehind expression source.
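
A minimal sketch of what stepping back by code points, rather than code
units, could look like in JavaScript. The helper name stepBackCodePoints and
the choice to count an unpaired surrogate as a single code point are
assumptions made for illustration; neither comes from the proposal text.

```javascript
// Hypothetical helpers, not part of the proposal.
const isLead = (u) => u >= 0xd800 && u <= 0xdbff;   // lead (high) surrogate
const isTrail = (u) => u >= 0xdc00 && u <= 0xdfff;  // trail (low) surrogate

// Step back `count` code points from code unit index `index` in `input`.
// Returns the new code unit index, or -1 if the start of the string is
// reached before `count` code points have been consumed.
// Assumption: an unpaired surrogate counts as one code point.
function stepBackCodePoints(input, index, count) {
  let i = index;
  for (let n = 0; n < count; n++) {
    if (i <= 0) return -1;
    i--;
    // A trail surrogate preceded by a lead surrogate is the second half of
    // one code point, so step over the lead unit as well.
    if (i > 0 && isTrail(input.charCodeAt(i)) && isLead(input.charCodeAt(i - 1))) {
      i--;
    }
  }
  return i;
}

// The example from this thread: x, a, U+1F639 (a surrogate pair).
const input = "xa\u{1F639}";
const end = input.length;                          // 4 code units
const naive = end - 3;                             // 1, which is the 'a'
const start = stepBackCodePoints(input, end, 3);   // 0, which is the 'x'
console.log(input[naive], input[start]);
```

Note that the helper has to read the input at each step, which is the point
made above: the distance in code units cannot be known from the pattern alone.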

I think the proposal would benefit from a pointer to an implementation or
two.  Of course the implementations should also fully support /u.

On Wed, Oct 7, 2015 at 11:10 AM, Claude Pache <claude.pache at gmail.com> wrote:

> This should not be a problem: with the /u flag, you work with code points,
> not code units. In particular, `.` always matches a sequence of length 1
> (counted in code points with /u, or in code units otherwise).
> —Claude
> On 7 Oct 2015, at 10:08, Erik Corry <erik.corry at gmail.com> wrote:
> Oops, I forgot the /u on the regexp in the example.
> On Wed, Oct 7, 2015 at 10:06 AM, Erik Corry <erik.corry at gmail.com> wrote:
>> Your proposal for look-behind relies on being able to count the match
>> length of the look-behind in order to step back that far.  This presupposes
>> that atoms like . and character classes have a fixed length.
>> However, with the /u flag, . and some character classes can match either
>> one or two code units.  This means you don't know how far to step back.  This
>> needs to be fixed in a way that is not incompatible with the "correct" .NET
>> way of doing things.
>> E.g. matching /a.(?<!x..)/ against "xa😹"  (x, a,
>> cat-face-with-tears-of-joy, which is a surrogate pair).  The look-behind
>> has an apparent width of 3, so we step back 3 code units, but that hits the
>> 'a', not the 'x', and so the look-behind fails to spot the 'x'.
>> On Sun, Oct 4, 2015 at 1:52 PM, Nozomu Katō <noz.ka at akenotsuki.com>
>> wrote:
>>> Apparently my proposal for adding look-behind assertions to RegExp
>>> has run into trouble. I would like to ask for help.
>>> The following story is what I know about the proposal after my previous
>>> post:
>>> I created a pull request for the proposal in July and sent an email to
>>> Brendan Eich asking if I can put his name as a champion:
>>> https://github.com/tc39/ecma262/pull/48
>>> I have not received a reply to my email, but in September I received a
>>> notification email for a reply on the pull request saying that the proposal
>>> had been moved to stage 0. Today, however, I noticed that the proposal had
>>> been dropped from stage 0, with the note "RegExp lookbehind has no champion".
>>> https://github.com/tc39/ecma262/commits/master/stage0.md (Oct 4, 2015)
>>> I am uncertain about what happened. Does this mean that Brendan Eich is
>>> no longer a champion, or that he never took on the role in the first
>>> place, or ...?
>>> Regards,
>>>   Nozomu
>>> _______________________________________________
>>> es-discuss mailing list
>>> es-discuss at mozilla.org
>>> https://mail.mozilla.org/listinfo/es-discuss