RegExp.escape()

Mark Miller erights at gmail.com
Sat Jun 13 18:59:36 UTC 2015


Perhaps. I encourage you to draft a possible concrete proposal.


On Sat, Jun 13, 2015 at 11:30 AM, Jordan Harband <ljharb at gmail.com> wrote:

> Would it help subclassing to have the list of syntax characters/code
> points be on a well-known-symbol property? Like
> `RegExp.prototype[@@syntaxCharacters] =
> Object.freeze('^$\\.*+?()[]{}|'.split(''));` or something? Then @exec could
> reference that, and similarly `RegExp.escape` and `RegExpSubclass.escape`
> could reference it as well?
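
A minimal sketch of how an `escape` could consult a per-class list of syntax characters, along the lines suggested above. The `syntaxCharacters` symbol here is an ordinary `Symbol`, a hypothetical stand-in for the proposed well-known symbol (which does not exist), and `BaseRegExp` and `SlashRegExp` are illustrative names that are not part of any proposal.

```javascript
// Hypothetical stand-in for the proposed @@syntaxCharacters well-known symbol.
const syntaxCharacters = Symbol('syntaxCharacters');

class BaseRegExp extends RegExp {
  // Escape every character that the receiving class lists as syntax.
  static escape(text) {
    const chars = this.prototype[syntaxCharacters];
    let out = '';
    for (const ch of String(text)) {
      out += chars.includes(ch) ? '\\' + ch : ch;
    }
    return out;
  }
}

// Base list, copied from the suggestion above.
BaseRegExp.prototype[syntaxCharacters] =
  Object.freeze('^$\\.*+?()[]{}|'.split(''));

// An illustrative subclass whose grammar also treats '/' as syntax.
class SlashRegExp extends BaseRegExp {}
SlashRegExp.prototype[syntaxCharacters] =
  Object.freeze([...BaseRegExp.prototype[syntaxCharacters], '/']);

console.log(BaseRegExp.escape('a.b/c'));  // a\.b/c
console.log(SlashRegExp.escape('a.b/c')); // a\.b\/c
```

Because `escape` reads the list through `this.prototype`, a subclass that extends the list would get a consistent `escape` without overriding the method itself.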
>
> On Sat, Jun 13, 2015 at 11:07 AM, Mark S. Miller <erights at google.com>
> wrote:
>
>> On Sat, Jun 13, 2015 at 9:17 AM, Domenic Denicola <d at domenic.me> wrote:
>>
>>>  All of these should be building on top of RegExp.escape :P
>>>
>>
>> I am not yet agreeing or disagreeing with this. Were both to become
>> standard, clearly they should be consistent with each other. At the time I
>> wrote this, it had not occurred to me that the tag itself might be
>> standardized at the same time as RegExp.escape. Now that this possibility
>> has been proposed, I am realizing lots of flaws with my polyfill. It's
>> funny how, by considering it as leading to a proposal, I quickly saw deep
>> flaws that I was previously missing.
>>
>> * The big one is that the literal template parts that are taken to
>> represent the regexp pattern fragments being expressed should be
>> syntactically valid *fragments*, in the sense that it makes semantic sense
>> to inject data between these fragments. Escaping the data + validating the
>> overall result does not do this. For example:
>>
>>     const data = ':x';
>>     const rebad = RegExp.tag`(?${data})`;
>>     console.log(rebad.test('x')); // true
>>
>> is nonsense. Since the RegExp grammar can be extended per platform, the
>> same argument that says we should have the platform provide RegExp.escape
>> says we should have the platform provide RegExp.tag -- so that they can
>> consistently reflect these platform extensions.
>>
>> * Now that we have modules, I would like to see us stop having each
>> proposal for new functionality come at the price of further global
>> namespace pollution. I would like to see us transition towards having most
>> new standard library entry points be provided by standard modules. I understand why
>> we haven't yet, but something needs to go first.
>>
>> * ES6 made RegExp subclassable with most methods delegating to a common
>> @exec method, so that a subclass only needs to consistently override a
>> small number of things to stay consistent. Neither RegExpSubclass.escape
>> nor RegExpSubclass.tag can be derived from aRegExpSubclass[@exec]. Because
>> of the first bullet, RegExpSubclass.tag also cannot be derived from
>> RegExpSubclass.escape. But having RegExpSubclass.escape delegating to
>> RegExpSubclass.tag seems weird.
>>
>> * The instanceof below prevents this polyfill from working cross-frame.
>> Also, when doing RegExpSubclass1.tag`xx${aRegExpSubclass2}yy`, where
>> aRegExpSubclass2.source produces a regexp grammar string that
>> RegExpSubclass1 does not understand, I have no idea what the composition
>> should do other than reject with an error. But what if the strings happen
>> to be mutually valid but with conflicting meaning between these subclasses?
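
The first bullet above can be illustrated without any tag function. This is a sketch under stated assumptions: `escapeData` and `isValidPattern` are hypothetical helper names, and the escaping simply mirrors the character class used by the polyfill quoted further down.

```javascript
// Hypothetical helper mirroring the escaping used by the quoted polyfill.
function escapeData(text) {
  return String(text).replace(/[\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Hypothetical helper: does the engine accept this pattern source?
function isValidPattern(source) {
  try {
    new RegExp(source);
    return true;
  } catch (e) {
    return false;
  }
}

const data = ':x';
const source = '(?' + escapeData(data) + ')';

// The literal parts '(?' and ')' are not a valid pattern when joined alone.
console.log(isValidPattern('(?' + ')'));   // false

// ':' is not a syntax character, so the data passes through unescaped
// and completes a non-capturing group.
console.log(source);                       // (?:x)
console.log(isValidPattern(source));       // true
console.log(new RegExp(source).test('x')); // true
```

Validating only the final joined string accepts this case, because the injected data turned an incomplete fragment into legal syntax. Rejecting it would require checking the literal parts as fragments before any data is substituted.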
>>
>>
>>
>>
>>>
>>>
>>> *From:* es-discuss [mailto:es-discuss-bounces at mozilla.org] *On Behalf
>>> Of *Mark S. Miller
>>> *Sent:* Saturday, June 13, 2015 02:39
>>> *To:* C. Scott Ananian
>>> *Cc:* Benjamin Gruenbaum; es-discuss
>>> *Subject:* Re: RegExp.escape()
>>>
>>>
>>>
>>> The point of this last variant is that data gets escaped but RegExp
>>> objects do not -- allowing you to compose RegExps:
>>> re`${re1}|${re2}*|${data}`
>>> But this requires one more adjustment:
>>>
>>>
>>> >
>>> >   function re(first, ...args) {
>>> >     let flags = first;
>>> >     function tag(template, ...subs) {
>>> >       const parts = [];
>>> >       const numSubs = subs.length;
>>> >       for (let i = 0; i < numSubs; i++) {
>>> >         parts.push(template.raw[i]);
>>> >         const subst = subs[i] instanceof RegExp ?
>>>                `(?:${subs[i].source})` :
>>> >             subs[i].replace(/[\/\\^$*+?.()|[\]{}]/g, '\\$&');
>>> >         parts.push(subst);
>>> >       }
>>> >       parts.push(template.raw[numSubs]);
>>> >       return RegExp(parts.join(''), flags);
>>> >     }
>>> >     if (typeof first === 'string') {
>>> >       return tag;
>>> >     } else {
>>> >       flags = void 0;  // Should this be '' ?
>>> >       return tag(first, ...args);
>>> >     }
>>> >   }
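
A usage sketch for the tag above, restated compactly so that it runs on its own. The variable names (`digits`, `letters`, `data`, `plain`, `insensitive`) are illustrative only, and the sketch coerces non-RegExp substitutions with `String()`, which the quoted polyfill does not do.

```javascript
function re(first, ...args) {
  let flags = first;
  function tag(template, ...subs) {
    const parts = [];
    subs.forEach((sub, i) => {
      parts.push(template.raw[i]);
      // RegExp substitutions are wrapped and composed; other data is escaped.
      parts.push(sub instanceof RegExp
        ? `(?:${sub.source})`
        : String(sub).replace(/[\/\\^$*+?.()|[\]{}]/g, '\\$&'));
    });
    parts.push(template.raw[subs.length]);
    return new RegExp(parts.join(''), flags);
  }
  if (typeof first === 'string') {
    return tag;            // called as re('flags'), returns the tag itself
  }
  flags = undefined;       // called directly as a tag: no flags
  return tag(first, ...args);
}

const digits = /\d+/;
const letters = /[a-z]+/;
const data = 'a.b';

// Direct use as a tag: one RegExp substitution, one data substitution.
const plain = re`^(?:${digits}|${data})$`;
console.log(plain.source);      // ^(?:(?:\d+)|a\.b)$
console.log(plain.test('123')); // true
console.log(plain.test('a.b')); // true
console.log(plain.test('axb')); // false

// Use with flags: re('i') returns a tag that applies the 'i' flag.
const insensitive = re('i')`^${letters}$`;
console.log(insensitive.test('HELLO')); // true
```

Wrapping each RegExp substitution in `(?:...)` keeps a following quantifier or a neighboring alternation from binding to only part of the substituted pattern.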
>>>
>>
>>
>>
>> --
>>     Cheers,
>>     --MarkM
>>
>> _______________________________________________
>> es-discuss mailing list
>> es-discuss at mozilla.org
>> https://mail.mozilla.org/listinfo/es-discuss
>>
>>


-- 
Text by me above is hereby placed in the public domain

  Cheers,
  --MarkM