Oddly accepted RegExps

Claude Pache claude.pache at gmail.com
Fri Jun 3 20:41:36 UTC 2016


> On 3 June 2016 at 10:20, Isiah Meadows <isiahmeadows at gmail.com> wrote:
> 
> These three RegExps don't appear valid, even after reading the Annex B, but they do behave consistently in both Chrome and Firefox. They are listed here with equivalent regexps:
> 
> - `/[[]/` -> `/\[\[\]/`
> - `/[]]/` -> `/(?!)/` (i.e. nothing)
> - `/a{,,/` -> `/a\{,,+/`
> 
> Is this a spec bug or an implementation bug in the parsing? 

The first pattern conforms to the syntax and semantics given in the main part of the spec. The most relevant rule in the grammar at https://tc39.github.io/ecma262/#sec-regular-expressions-patterns is:

    ClassAtomNoDash ::
        SourceCharacter  but not one of \ or ] or -
    
In particular an unescaped `[` is an acceptable atom inside a character class.
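
As a quick check of that rule, here is a minimal sketch that can be pasted into any engine's console. The variable names are mine and purely illustrative; the sketch only compares the class `[[]` against the explicitly escaped `\[`.

```javascript
// A character class whose only member is an unescaped `[`.
const openBracketClass = /[[]/;

// The class should behave like a single escaped `[`.
const matchesLoneBracket = openBracketClass.test("[");
const matchesEmbeddedBracket = openBracketClass.test("a[b");
const matchesClosingBracket = openBracketClass.test("]");

// Replacing with the class and with the escaped form should agree.
const viaClass = "x[y[z".replace(/[[]/g, "_");
const viaEscape = "x[y[z".replace(/\[/g, "_");

console.log({
  matchesLoneBracket,
  matchesEmbeddedBracket,
  matchesClosingBracket,
  viaClass,
  viaEscape,
});
```

Both replacements should give the same string, which suggests the class matches exactly one `[` and nothing else.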

The last two are well specified by the main part of the spec as modified by Annex B. The second pattern starts with an empty class, which is a valid way to match nothing. The most relevant rule in the Annex B grammar at https://tc39.github.io/ecma262/#sec-regular-expressions-patterns is:

    ExtendedPatternCharacter ::
        SourceCharacter  but not one of  ^  $  .  *  +  ?  (  )  [  |

In particular, `]` and `{` may appear unescaped outside a character class (with the restriction that `{` must not start a sequence that resembles a quantifier; that case is handled by the `InvalidBracedQuantifier` production).
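
To illustrate, here is a small sketch of how the last two patterns behave. The helper `compiles` is a hypothetical name of mine, not something from the spec. The `u` flag comparison rests on my understanding that the Annex B extensions do not apply to Unicode patterns, so treat it as an assumption to verify in your own engine.

```javascript
// Hypothetical helper: reports whether a pattern source compiles
// with the given flags instead of throwing a SyntaxError.
function compiles(source, flags = "") {
  try {
    new RegExp(source, flags);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Empty class followed by a literal `]`: the empty class can never match.
const emptyClassThenBracket = /[]]/;
const emptyClassMatches = ["]", "[]]", ""].some((s) =>
  emptyClassThenBracket.test(s)
);

// `{,,` does not resemble a quantifier, so `{` is an ordinary character.
const literalBrace = /a{,,/;
const braceMatchesLiteral = literalBrace.test("a{,,");
const braceMatchesShorter = literalBrace.test("a{,");

// A well-formed quantifier with nothing to repeat is rejected,
// and the lenient forms are rejected once the `u` flag is set.
const loneQuantifierCompiles = compiles("{1}");
const bracketCompilesUnicode = compiles("[]]", "u");
const braceCompilesUnicode = compiles("a{,,", "u");

console.log({
  emptyClassMatches,
  braceMatchesLiteral,
  braceMatchesShorter,
  loneQuantifierCompiles,
  bracketCompilesUnicode,
  braceCompilesUnicode,
});
```

The sketch builds the stricter cases with `new RegExp` rather than literals so that a rejected pattern shows up as a caught exception instead of preventing the whole script from parsing.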

—Claude

