RegExp.escape()

Mark S. Miller erights at google.com
Sat Jun 13 18:07:03 UTC 2015


On Sat, Jun 13, 2015 at 9:17 AM, Domenic Denicola <d at domenic.me> wrote:

>  All of these should be building on top of RegExp.escape :P
>

I am not yet agreeing or disagreeing with this. Were both to become
standard, clearly they should be consistent with each other. At the time I
wrote this, it had not occurred to me that the tag itself might be
standardized at the same time as RegExp.escape. Now that this possibility
has been proposed, I am realizing lots of flaws with my polyfill. It's funny
how, by considering it as leading to a proposal, I quickly saw deep flaws
that I was previously missing.

* The big one is that the literal template parts that are taken to
represent the regexp pattern fragments being expressed should be
syntactically valid *fragments*, in the sense that it makes semantic sense
to inject data between these fragments. Escaping the data + validating the
overall result does not do this. For example:

    const data = ':x';
    const rebad = RegExp.tag`(?${data})`;
    console.log(rebad.test('x')); // true

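is nonsense: the escaped data ':x' lands right after '(?', so the result is
the non-capturing group (?:x) and the data is read as pattern syntax rather
than matched literally. Since the RegExp grammar can be extended per
platform, the same argument that says we should have the platform provide
RegExp.escape says we should have the platform provide RegExp.tag -- so that
they can consistently reflect these platform extensions.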

* Now that we have modules, I would like to see us stop having each
proposal for new functionality come at the price of further global
namespace pollution. I would like to see us transition towards having most
new standard library entry points be provided by standard modules. I
understand why we haven't yet, but something needs to go first.

* ES6 made RegExp subclassable with most methods delegating to a common
exec method, so that a subclass only needs to consistently override a
small number of things to stay consistent. Neither RegExpSubclass.escape
nor RegExpSubclass.tag can be derived from aRegExpSubclass.exec. Because
of the first bullet, RegExpSubclass.tag also cannot be derived from
RegExpSubclass.escape. But having RegExpSubclass.escape delegating to
RegExpSubclass.tag seems weird.

* The instanceof below prevents this polyfill from working cross-frame.
Also, when doing RegExpSubclass1.tag`xx${aRegExpSubclass2}yy`, where
aRegExpSubclass2.source produces a regexp grammar string that
RegExpSubclass1 does not understand, I have no idea what the composition
should do other than reject with an error. But what if the strings happen
to be mutually valid but with conflicting meaning between these subclasses?
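
To make the first bullet concrete, here is a minimal sketch of the failure.
The names escapeForRegExp and naiveTag are hypothetical stand-ins for the
polyfill's escaping and assembly steps, not proposed API, and the character
class is assumed to be the one used in the quoted polyfill.

```javascript
// Hypothetical helper: escapes the same characters as the quoted polyfill.
function escapeForRegExp(str) {
  return String(str).replace(/[\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Hypothetical stand-in for the tag: escape each substitution, join the
// pieces, and let the RegExp constructor validate the overall result.
function naiveTag(template, ...subs) {
  let source = template.raw[0];
  subs.forEach((sub, i) => {
    source += escapeForRegExp(sub) + template.raw[i + 1];
  });
  return new RegExp(source);
}

const data = ':x';
const rebad = naiveTag`(?${data})`;

// Neither ':' nor 'x' needs escaping, so the data completes the '(?:' syntax.
console.log(rebad.source);    // (?:x)
console.log(rebad.test('x')); // true
```

The overall pattern is valid, so validation passes, yet the literal part
'(?' was never a meaningful fragment on its own.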




>
>
> *From:* es-discuss [mailto:es-discuss-bounces at mozilla.org] *On Behalf Of *Mark
> S. Miller
> *Sent:* Saturday, June 13, 2015 02:39
> *To:* C. Scott Ananian
> *Cc:* Benjamin Gruenbaum; es-discuss
> *Subject:* Re: RegExp.escape()
>
>
>
> The point of this last variant is that data gets escaped but RegExp
> objects do not -- allowing you to compose RegExps:
> re`${re1}|${re2}*|${data}`
> But this requires one more adjustment:
>
>
> >
> >   function re(first, ...args) {
> >     let flags = first;
> >     function tag(template, ...subs) {
> >       const parts = [];
> >       const numSubs = subs.length;
> >       for (let i = 0; i < numSubs; i++) {
> >         parts.push(template.raw[i]);
> >         const subst = subs[i] instanceof RegExp ?
> >             `(?:${subs[i].source})` :
> >             subs[i].replace(/[\/\\^$*+?.()|[\]{}]/g, '\\$&');
> >         parts.push(subst);
> >       }
> >       parts.push(template.raw[numSubs]);
> >       return RegExp(parts.join(''), flags);
> >     }
> >     if (typeof first === 'string') {
> >       return tag;
> >     } else {
> >       flags = void 0;  // Should this be '' ?
> >       return tag(first, ...args);
> >     }
> >   }
>
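
For reference, a consolidated sketch of the tag quoted above, followed by a
usage example. It assumes the replacement string is '\\$&' (a backslash
followed by the matched character); the sample patterns and inputs are made
up for illustration.

```javascript
function re(first, ...args) {
  let flags = first;
  function tag(template, ...subs) {
    const parts = [];
    const numSubs = subs.length;
    for (let i = 0; i < numSubs; i++) {
      parts.push(template.raw[i]);
      // RegExp substitutions are wrapped in a non-capturing group;
      // anything else is treated as a string and escaped.
      const subst = subs[i] instanceof RegExp ?
          `(?:${subs[i].source})` :
          subs[i].replace(/[\/\\^$*+?.()|[\]{}]/g, '\\$&');
      parts.push(subst);
    }
    parts.push(template.raw[numSubs]);
    return RegExp(parts.join(''), flags);
  }
  if (typeof first === 'string') {
    return tag;                  // re('i')`...` form: first is the flags
  } else {
    flags = void 0;
    return tag(first, ...args);  // re`...` form: first is the template
  }
}

// Illustrative inputs only.
const re1 = /ab+/;
const re2 = /c/;
const data = 'a.b';

const composed = re`${re1}|${re2}*|${data}`;
console.log(composed.source);    // (?:ab+)|(?:c)*|a\.b

const ci = re('i')`^${data}$`;
console.log(ci.test('A.B'));     // true
console.log(ci.test('AxB'));     // false
```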



-- 
    Cheers,
    --MarkM