RegExp pet peeves (was: should calling RegExp constructor as function without arguments throw?)

Brendan Eich brendan at mozilla.com
Wed Jan 14 14:56:56 PST 2009


This is really a separate thread -- please change the subject  
accordingly.

See also past messages here, which linked to

http://web-graphics.com/2007/11/26/ecmascript-3-regular-expressions-a-specification-that-doesnt-make-sense/
http://blog.stevenlevithan.com/archives/npcg-javascript

If you want access, I will add you to http://bugs.ecmascript.org/ so  
you can file tickets on your peeves.

/be

On Jan 14, 2009, at 2:01 PM, Lasse R.H. Nielsen wrote:

> On Wed, 14 Jan 2009 14:13:13 +0100, Hallvord R. M. Steen
> <hallvord at opera.com> wrote:
>
>> Apologies if this has already been covered; I tried
>> googling but found only tangentially related stuff about "/regexp/()"
>> syntax.
>
> There are a few parts of the regexp syntax that wouldn't mind a
> look-over.
>
> My two primary peeves are that look-aheads are Atoms, not Assertions,
> and that back-references to captures occurring later in the source are
> valid.
>
> The only difference between an Atom and an Assertion is that the former
> can have a quantifier attached. There is absolutely no reason to put a
> quantifier on a look-ahead, and look-aheads are zero-width matches just
> like all assertions, so they would fit much better as assertions.
> Changing the grammar to make look-aheads actual assertions wouldn't even
> require implementations to change. It would just change quantified
> look-aheads from being standard to being an extension, like so many
> other things in regexps already are. (The feature was only added to
> JSC recently - I'm guessing nobody had needed it).
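
To make the point about quantified look-aheads concrete, here is a minimal
sketch. It assumes a current ECMAScript engine (ES2015 or later, which
postdates this message); the helper `compiles` and the sample pattern are
hypothetical names chosen for illustration. In such engines the `u` flag
should reject a quantifier on a look-ahead, while patterns without the flag
should still accept it, and the quantifier should not change what matches.

```javascript
// Hypothetical helper (not a built-in): reports whether the engine
// accepts `source` as a regexp pattern under the given flags.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // anything else is unexpected
  }
}

// A look-ahead with a quantifier attached, which the Atom grammar permits.
const quantified = "(?=a)+ab";

// Assumption: legacy (non-Unicode) mode accepts it for compatibility.
const acceptedLegacy = compiles(quantified, "");

// Assumption: Unicode mode treats look-aheads as plain assertions
// and rejects the quantifier.
const acceptedUnicode = compiles(quantified, "u");

// A look-ahead consumes no input, so repeating it should match
// exactly what a single look-ahead matches.
const inputs = ["ab", "b", "aab", "xab", ""];
const sameResults = inputs.every(
  (s) => new RegExp(quantified).test(s) === /(?=a)ab/.test(s)
);

console.log({ acceptedLegacy, acceptedUnicode, sameResults });
```

If the assumptions hold, only `acceptedUnicode` comes out false, which is
the "extension rather than standard" split suggested above.
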
>
> The problem with back-references is that the requirement prevents
> a one-pass parser, because you need to scan the entire regexp to
> know whether a decimal escape is valid. Well, actually it wouldn't
> be a problem if you didn't want to be compatible with all the
> current implementations that treat invalid decimal escapes as
> octal escapes - so you need to know whether a given decimal sequence
> is a valid back-reference in order to parse it as octal if it isn't
> valid.
> At least IE6 actually limits the valid back-references to the
> captures that were started prior to the back-reference in the
> source. That's a reasonable approach from a parsing perspective
> (I'd be happy if that were what was required), but really you only
> need to be able to reference captures that can be completed at the
> point where the back-reference occurs, i.e., where both the start and
> end parentheses of the capture being referenced occur prior to the
> back-reference in the source.
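
The parsing dependency described above can be sketched as follows. This
assumes a current engine in legacy (non-Unicode) mode that keeps the
web-compatible octal fallback; the variable names are illustrative only.
The same `\2` is read two different ways depending on how many captures
appear later in the pattern, which is why a parser cannot classify it
without seeing the whole source.

```javascript
// Assumption: legacy (non-Unicode) mode with web-compatible escapes.
// Patterns are built from strings so `\2` reaches the regexp parser intact.

// One capture in the whole pattern: `\2` cannot be a back-reference,
// so it is read as the octal escape for U+0002.
const oneCapture = new RegExp("^\\2(a)$");

// Two captures: the same `\2` is now a back-reference to a capture that
// starts later in the source. That capture is still unset when `\2` is
// reached, so the back-reference matches the empty string.
const twoCaptures = new RegExp("^\\2(a)(b)$");

const results = {
  octalMatchesControlChar: oneCapture.test("\u0002a"),
  octalRejectsPlainA: oneCapture.test("a"),
  forwardRefMatchesEmpty: twoCaptures.test("ab"),
  forwardRefRejectsControlChar: twoCaptures.test("\u0002ab"),
};

console.log(results);
```

Under the IE6-style rule mentioned above, `\2` in the second pattern would
presumably not count as a back-reference, since neither capture has started
at that point.
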
>
>
> /L
> -- 
> Lasse R.H. Nielsen
> Speaking only for myself ... if even that.
> 'Faith without judgement merely degrades the spirit divine'
