Suggested RegExp Improvements
erik.corry at gmail.com
Mon Nov 15 05:06:24 PST 2010
Your proposal seems to allow variable length lookbehind. This isn't
allowed in perl as far as I know. I just tried the following:
perl -e '"foobarbaz" =~ /a(?<=(ob|bab))/;'
which gives an error on perl5. I think if we are going to allow
variable length lookbehind we should first find out why they don't
have it in perl. I think the implementation is a little tricky if you
want to support the full regexp language in lookbehinds.
Is there an example of a language that supports the full regexp power
in lookbehinds, so we can look at their experiences with implementing it?
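
To make the question concrete, here is a minimal sketch of the same test written as JavaScript. It assumes an engine that accepts the (?<=...) syntax with alternatives of different lengths, as standardized in ECMAScript 2018; it does not describe any engine available at the time of this thread. The function names and the second pattern, which moves the lookbehind in front of the "a" so that it has something to match, are illustrative only.

```javascript
// Sketch only: assumes a JavaScript engine that accepts (?<=...) lookbehind
// (ECMAScript 2018 or later). Both helper names are hypothetical.

// The pattern from the Perl one-liner: match "a", then require that the text
// ending right after that "a" is "ob" or "bab". The two alternatives have
// different lengths, which is what makes the lookbehind variable length.
function runOriginalPattern(input) {
  return /a(?<=(ob|bab))/.exec(input);
}

// A variant with the lookbehind placed before the "a", so that "ob" or "bab"
// must immediately precede the matched character.
function matchAfterVariablePrefix(input) {
  const m = /(?<=(ob|bab))a/.exec(input);
  if (m === null) return null;
  return { index: m.index, matched: m[0], prefix: m[1] };
}

const original = runOriginalPattern("foobarbaz");
const variant = matchAfterVariablePrefix("foobarbaz");

console.log(original); // null: neither "a" is followed backwards by "ob" or "bab"
console.log(variant);  // { index: 4, matched: 'a', prefix: 'ob' }
```

In such an engine the first pattern compiles but finds no match in "foobarbaz", while the variant matches the first "a" and captures "ob" inside the lookbehind.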
2010/11/15 Marc Harter <wavded at gmail.com>:
> Brendan et al.,
> I have created a proposal for look-behind provided at this link:
> I hope it is a format that will be helpful for discussion with TC39.
> Admittedly, I have never written one of these before so am completely open
> to any feedback or ways to improve the document from yourself or anyone else
> on this list.
> On Sat, 2010-11-13 at 09:32 -0600, Marc Harter wrote:
> I would be game to write up a proposal for this. When would you need
> this by to discuss w/ TC39?
> Thanks for your consideration,
> On Nov 12, 2010, at 5:04 PM, Brendan Eich <brendan at mozilla.com> wrote:
>> On Nov 12, 2010, at 2:52 PM, Marc Harter wrote:
>>> After considering all the breadth this discussion could take maybe it
>>> would be wise to just focus on one issue at a time. For me, the biggest
>>> missing feature is lookbehind. It's common to most languages
>>> implementing the Perl-RegExp-syntax, it is very useful when looking for
>>> patterns that follow or don't follow a particular pattern. I guess I'm
>>> confused why lookahead made it in but not lookbehind.
>> This was 1998. The Netscape 4 work I did in '97 was based on Perl 4(!), but we
>> proposed to ECMA TC39 TG1 (the JS group -- things were different then,
>> including capitalization) something based on Perl 5. We didn't get
>> everything, and we had to rationalize some obvious quirks.
>> I don't remember lookbehind (which emerged in Perl 5.005 in July '98)
>> being left out on purpose. Waldemar may recall more, I'd handed him the JS
>> keys inside netscape.com to go do mozilla.org.
>> If you are game to write a proposal or mini-spec (in the style of ES5
>> even), let me know. I'll chat with other TC39'ers next week about this.
>>> What do people
>>> think about including this feature?
>>> On Fri, 2010-11-12 at 16:20 -0600, Marc Harter wrote:
>>>> I will start out with a disclaimer. I have not read both ECMAScript
>>>> specifications for 3 and now 5, so I admit that I am not an expert in
>>>> expert discussion over this topic as proposed enhancements to the
>>>> RegExp engine for Harmony.
>>>> I will start with a list of lacking features in JS as compared to Perl
>>>> * No \A or \Z anchors to match the start or end of the string.
>>>> Use a caret or dollar instead.
>>>> * Lookbehind is not supported at all. Lookahead is fully supported.
>>>> * No atomic grouping or possessive quantifiers
>>>> * No Unicode support, except for matching single characters with
>>>> * No named capturing groups. Use numbered capturing groups instead.
>>>> * No mode modifiers to set matching options within the regular expression.
>>>> * No conditionals.
>>>> * No regular expression comments. Describe your regular
>>>> regular expression string.
>>>> I don't know if all of these "need" to be in the language but there
>>>> have been some that I have personally wanted to use:
>>>> * Lookbehind! ECMAScript fully supports lookahead, why not
>>>> lookbehind? Seems like a big hole to me.
>>>> * Named capturing groups and comments (e.g.
>>>> http://xregexp.com/syntax/). Mostly I argue for this because
>>>> it makes RegExp matches more self-documenting. Regular
>>>> Expressions are already cryptic as it is.
>>>> I do like some of the new flags proposed at
>>>> http://xregexp.com/flags/, although I haven't used them personally;
>>>> maybe that is also something for discussion.
>>>> Marc Harter
>>> es-discuss mailing list
>>> es-discuss at mozilla.org