Disabling nonstandard RegExp functionalities for proper subclasses of RegExp

Claude Pache claude.pache at gmail.com
Wed Apr 27 13:34:25 UTC 2016


> On 27 Apr 2016, at 06:50, Andrea Giammarchi <andrea.giammarchi at gmail.com> wrote:
> 
> My 2 cents.
> 
> I always had the feeling people complaining about `RegExp.$1` and friends never really used them.
> 
> For instance, your example:
> 
> ```js
> /(a)/.exec('a')
> Object.keys(bar)
> RegExp.$1
> ```
> 
> might have side effects but it's also made up and, I believe, not a real-world concern.
> If you use `re.exec`, you work with its result directly; otherwise you use `re.test`, and in that case you (or at least *I*) are aware of the possible side effects and will read `RegExp.$1` or another such property *immediately* afterwards.
> 
> Here is a simple example:
> 
> ```js
> if (/^item-(\d+)$/.test(id)) {
>   let num = parseInt(RegExp.$1, 10);
>   // the rest of the code
> }
> ```
> 
> I'm not saying these properties are cool, always safe, or anything like that; I'm just saying there are a few useful cases for them.

An important point in my made-up example with `Object.keys()` is that such an apparently innocuous function call might be hidden deep in the subclass implementation. It was an illustration of one way (among others) in which a user-defined subclass could make the built-in `RegExp.$1` feature brittle or buggy. (And no, I haven't taken the time to find a realistic example, and I won't.)
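
To make the point concrete, here is a minimal made-up sketch. The class name `LoggingRegExp` and its internal bookkeeping step are hypothetical, and the exact behaviour of `RegExp.$1` depends on the engine, since these properties are nonstandard. The sketch only shows how a match performed inside a subclass method can change what the caller reads right after `test()`:

```js
// Hypothetical subclass: exec() performs an extra, internal match
// after delegating to the built-in implementation.
class LoggingRegExp extends RegExp {
  exec(str) {
    const result = super.exec(str);
    // Innocuous-looking internal step that runs a plain regexp
    // and therefore overwrites the legacy static properties.
    /(\w+)/.exec('internal');
    return result;
  }
}

const re = new LoggingRegExp('^item-(\\d+)$');

// RegExp.prototype.test calls the (overridden) exec method.
const matched = re.test('item-42');

// Read "immediately after", as in the quoted usage pattern.
// The value reflects the internal match, not the digits of 'item-42'.
const legacy = RegExp.$1;
```

The caller did nothing wrong here; the surprise comes entirely from code it cannot see.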

> 
> However, while I agree the problem is that these modify the globally shared constructor, I also think that having these magic properties available only in a subclass would probably be the key to solving pretty much all the problems described on that page.
> 
> ```js
> reg.exp = class extends RegExp {};
> function reg(source, ...flags) {
>   return typeof source === 'string' ?
>     new reg.exp(...[source, ...flags]) :
>     new reg.exp(source.source, source.flags);
> }
> 
> if (reg(/^item-(\d+)$/).test(id)) {
>   let num = parseInt(reg.exp.$1, 10);
>   // the rest of the code
> 
>   RegExp.$1 === reg.exp.$1; // false
>   // the extended RegExp didn't modify
>   // the global RegExp
> }
> ```
> 
> Of course this would still suffer from the same, or very similar, problems if `re.class` is exported as a module and consumed by many different libraries, but I would be surprised if subclassing `RegExp` produced a limited subset; at least, I wouldn't call that extending.

Of course, subclasses of RegExp are free to define their own `reg.exp.$1` static properties, inasmuch as they are free to implement features that other people judge horrific.

In any case, it might be good for `reg.exp.$1` not to inherit from `RegExp.$1` by default (as it currently does), in case the semantics and implementation of `reg.exp` make the value of the inherited property misleading.
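
For reference, a short sketch of why the inheritance happens at all: a class constructor has its parent constructor as its prototype, so static properties of `RegExp` are found through ordinary property lookup. The names `SubRegExp` and `IsolatedRegExp` are hypothetical, and shadowing the property is only one conceivable way for a subclass to opt out, not something proposed in this thread:

```js
class SubRegExp extends RegExp {}

// The subclass constructor inherits statics from RegExp itself.
const parentIsRegExp = Object.getPrototypeOf(SubRegExp) === RegExp;

// The subclass has no '$1' of its own ...
const hasOwnDollar1 = Object.prototype.hasOwnProperty.call(SubRegExp, '$1');

// ... but the name is still reachable wherever the engine defines RegExp.$1.
const reachable = '$1' in SubRegExp;

// Assumption for illustration: a subclass shadows the inherited accessor
// with its own static getter, so the parent's value is never exposed.
class IsolatedRegExp extends RegExp {
  static get $1() {
    return undefined;
  }
}

const isolatedValue = IsolatedRegExp.$1;
```

What reading `SubRegExp.$1` actually returns (or whether it throws) is left out on purpose, because it depends on the engine.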


> 
> After all, if "safe" is the main concern, writing `class SRegExp extends RegExp {}` at the top of a module that uses RegExp in various ways doesn't look like a big piece of extra code to write. It's like a function declaration, and it will make your code immune to global RegExp gotchas.
> 
> Or ... doesn't it?

I think that safety is not the main concern here. Safety would be better served by removing `RegExp.$1` and friends completely, including for plain regexps. Although that cannot be the default, it can be made possible by giving users the ability to remove the API entirely (e.g., by making `RegExp.$1` configurable and deletable) and by limiting cross-realm leaks; that is, by making the entire environment safe instead of leaving that task to individual modules.
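
As a rough sketch of what "configurable and deletable" would allow, the helper below tries to delete a list of static properties from a constructor and reports what happened. The function name `removeLegacyStatics` and the list of names are assumptions made for illustration, and whether the real `RegExp.$1` is configurable depends on the engine, so the demonstration runs on a stand-in object:

```js
// A few of the legacy static property names; not an exhaustive list.
const LEGACY_NAMES = ['$1', '$2', '$3', 'lastMatch', 'input'];

// Hypothetical helper: delete each named own property if it is configurable.
function removeLegacyStatics(ctor, names = LEGACY_NAMES) {
  const report = {};
  for (const name of names) {
    const desc = Object.getOwnPropertyDescriptor(ctor, name);
    if (desc === undefined) {
      report[name] = 'absent';
    } else if (desc.configurable && Reflect.deleteProperty(ctor, name)) {
      report[name] = 'removed';
    } else {
      report[name] = 'kept'; // non-configurable: cannot be deleted
    }
  }
  return report;
}

// Stand-in object with one deletable and one non-deletable accessor.
const standIn = {};
Object.defineProperty(standIn, '$1', { get() { return 'a'; }, configurable: true });
Object.defineProperty(standIn, '$2', { get() { return 'b'; }, configurable: false });

const report = removeLegacyStatics(standIn, ['$1', '$2', '$3']);
// report is { $1: 'removed', $2: 'kept', $3: 'absent' }
```

Calling `removeLegacyStatics(RegExp)` once, before any other code runs, would be the environment-wide variant; it only helps where the engine makes those properties configurable.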

—Claude


