Look-behind proposal in trouble

Nozomu Katō noz.ka at akenotsuki.com
Wed Oct 7 10:59:17 UTC 2015

What Claude mentioned is already part of the specification: "Input is a
List consisting of all of the characters" and "Each character is either
a code unit or a code point, depending upon the kind of pattern
involved" (

Even so, two days ago I added a Note section to my proposal page to
clarify this, because I had been asked a similar question.
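
As a small illustration of the code unit / code point distinction in the
quoted specification text, here is a minimal sketch. The sample string and
variable names are only for illustration, and the last two lines assume an
engine that implements look-behind with the `(?<=...)` syntax; it is not
taken from the proposal text itself.

```javascript
// U+1F600 lies outside the BMP, so it occupies two UTF-16 code units.
const input = "a\u{1F600}b";

// Without /u each character of Input is a code unit,
// so `.` matches only the lead surrogate.
const unitMatch = /a(.)/.exec(input)[1];

// With /u each character of Input is a code point,
// so `.` matches the whole surrogate pair.
const pointMatch = /a(.)/u.exec(input)[1];

console.log(unitMatch.length);  // 1
console.log(pointMatch.length); // 2

// Look-behind (assumed syntax): stepping back over `.` moves one
// character, which is one code point with /u and one code unit without.
const behindUnicode = /(?<=a.)b/u.test(input); // true
const behindLegacy = /(?<=a.)b/.test(input);   // false
```

Without /u the look-behind fails here because "a" sits two code units
before "b", while `a.` only reaches back over two characters in total.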

Incidentally, in the initial version of the proposal I used the term
"code point" but later changed it to "character" since Allen pointed


Erik Corry wrote on Wed, 7 Oct 2015, at 11:16:54 +0200:
> The proposal needs to be clarified to explain that you are stepping back a
> number of code points, not units.  This implies that you are inspecting the
> input string as you step backwards.  Also it should be explained what to do
> if there are unpaired surrogates in the input string and inside the
> lookbehind expression source.
> I think the proposal would benefit from a pointer to an implementation or
> two.  Of course the implementations should also fully support /u.
> On Wed, Oct 7, 2015 at 11:10 AM, Claude Pache wrote:
>> This should not be a problem: With the /u flag, you work with code points,
>> not code units. In particular, `.` always matches a sequence (of code
>> points with /u, or code units otherwise) of length 1.
>> —Claude
