RegExp.escape()

Mark S. Miller erights at google.com
Sat Jun 13 18:57:56 UTC 2015


On Sat, Jun 13, 2015 at 11:39 AM, Benjamin Gruenbaum <benjamingr at gmail.com>
wrote:

> On Sat, Jun 13, 2015 at 9:07 PM, Mark S. Miller <erights at google.com>
> wrote:
>
>> On Sat, Jun 13, 2015 at 9:17 AM, Domenic Denicola <d at domenic.me> wrote:
>>
>>>  All of these should be building on top of RegExp.escape :P
>>>
>>
>> It's funny how, by considering it as leading to a proposal, I quickly saw
>> deep flaws that I was previously missing.
>>
>>
> That was a big part of making a proposal out of it - to find these things
> :)
>

Indeed! Much appreciated.



>
>
>> the overall result does not do this. For example:
>>
>>     const data = ':x';
>>     const rebad = RegExp.tag`(?${data})`;
>>     console.log(rebad.test('x')); // true
>>
>> is nonsense. Since the RegExp grammar can be extended per platform, the
>> same argument that says we should have the platform provide RegExp.escape
>> says we should have the platform provide RegExp.tag -- so that they can
>> consistently reflect these platform extensions.
>>
>>
> This is a good point, I considered whether or not `-` should be included
> for a similar reason. I think it is reasonable to only include syntax
> identifiers and expect users to deal with parts of patterns of more than
> one character themselves (by wrapping the string with `()` in the
> constructor). This is what every other language does practically.
>
> That said - I'm very open to allowing implementations to escape _more_
> than `SyntaxCharacter` in their implementations and to even recommend that
> they do so in such a way that is consistent with their regular expressions.
> What do you think about doing that?
>
> I'm also open to `.tag` wrapping with `()` to avoid these issues but I'm
> not sure if we have a way in JavaScript to not make a capturing group out
> of it.
>

Better or different escaping is not the issue in this first bullet. The
issue is validating that a fragment is a valid fragment for that regexp
grammar. For the std grammar, "(?" is not a valid fragment, and the tag
should have rejected the template with an error on that basis alone.
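
To make the failure mode concrete, here is a minimal sketch. The names
escapeForRegExp, naiveTag and checkedTag are hypothetical, not proposed API.
The checked variant is only one possible approximation of fragment
validation: it wraps each substitution in a non-capturing group `(?:...)`,
so a literal fragment that ends mid-token, such as "(?", no longer compiles.

```javascript
// Hypothetical helper (not a built-in): escapes the SyntaxCharacter set
// of the standard grammar.
function escapeForRegExp(text) {
  return String(text).replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Naive tag: escapes each substitution and concatenates, with no validation.
function naiveTag(strings, ...values) {
  let source = strings.raw[0];
  values.forEach((value, i) => {
    source += escapeForRegExp(value) + strings.raw[i + 1];
  });
  return new RegExp(source);
}

// Assumption for illustration: treat every substitution as one opaque atom
// by wrapping it in (?:...). A template whose literal part stops inside a
// token then fails to compile instead of silently changing meaning.
function checkedTag(strings, ...values) {
  let source = strings.raw[0];
  values.forEach((value, i) => {
    source += `(?:${escapeForRegExp(value)})` + strings.raw[i + 1];
  });
  return new RegExp(source); // throws SyntaxError on an invalid result
}

const data = ':x';
console.log(naiveTag`(?${data})`.test('x')); // true, the nonsense result

try {
  checkedTag`(?${data})`;
} catch (err) {
  console.log(err instanceof SyntaxError); // true
}
```

This sketch does not handle a substitution inside a character class, where
a group has a different meaning, so it is not a complete validator.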




>
>
>> * Now that we have modules, I would like to see us stop having each
>> proposal for new functionality come at the price of further global
>> namespace pollution. I would like to see us transition towards having most
>> new std library entry points be provided by std modules. I understand why
>> we haven't yet, but something needs to go first.
>>
>>
> I think that doing this should be an eventual target but I don't think
> adding a single much-asked-for static function to the RegExp function would
> be a good place to start. I think the committee first needs to agree about
> how this form of modularisation should be done - there are much bigger
> targets first and I would not like to see this proposal tied and held back
> by that (useful) goal.
>

I agree, but this will be true for any individual proposal.

Perhaps we need a sacrificial "first penguin through the ice" proposal
whose *only* purpose is to arrive as a std import rather than a std
primordial.
(Just kidding.)


>
>
>> * ES6 made RegExp subclassable with most methods delegating to a common
>> @exec method, so that a subclass only needs to consistently override a
>> small number of things to stay consistent. Neither RegExpSubclass.escape
>> nor RegExpSubclass.tag can be derived from aRegExpSubclass[@exec]. Because
>> of the first bullet, RegExpSubclass.tag also cannot be derived from
>> RegExpSubclass.escape. But having RegExpSubclass.escape delegating to
>> RegExpSubclass.tag seems weird.
>>
>>
> Right but it makes sense that `escape` does not play in this game since it
> is a static method that takes a string argument - I'm not sure how it could
> use @exec.
>

I agree that defining a class-side method to delegate to an instance-side
method is unpleasant. But because we have class-side inheritance, static
methods should be designed with this larger game in mind.
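
As a rough illustration of class-side inheritance, the sketch below uses
hypothetical classes BaseRegExp and DashRegExp (neither is proposed API).
It shows only the mechanics: a static method is inherited by the subclass
constructor and sees the subclass as `this`. It covers escaping only and
does not address the validation problem from the first bullet.

```javascript
// Hypothetical base class with class-side (static) escape and tag.
class BaseRegExp extends RegExp {
  static escape(text) {
    return String(text).replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  }

  // `this` is whichever constructor the tag was looked up on, so a
  // subclass that overrides escape changes what tag produces.
  static tag(strings, ...values) {
    let source = strings.raw[0];
    values.forEach((value, i) => {
      source += this.escape(value) + strings.raw[i + 1];
    });
    return new this(source);
  }
}

// Hypothetical subclass that also escapes `-`, inheriting tag unchanged.
class DashRegExp extends BaseRegExp {
  static escape(text) {
    return super.escape(text).replace(/-/g, '\\-');
  }
}

const dash = '-';
console.log(BaseRegExp.tag`^[a${dash}z]$`.test('m')); // true: "-" formed a range
console.log(DashRegExp.tag`^[a${dash}z]$`.test('m')); // false: "-" is literal
console.log(DashRegExp.tag`x` instanceof DashRegExp); // true
```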



>
>
>> * The instanceof below prevents this polyfill from working cross-frame.
>> Also, when doing RegExpSubclass1.tag`xx${aRegExpSubclass2}yy`, where
>> RegExpSubclass2.source produces a regexp grammar string that
>> RegExpSubclass1 does not understand, I have no idea what the composition
>> should do other than reject with an error. But what if the strings happen
>> to be mutually valid but with conflicting meaning between these subclasses?
>>
>>
> This is hacky, but in my code I just did `argument.exec ? treatAsRegExp :
> treatAsString`.
>

Yes, as with instanceof, that's the difference between the quality needed
in a polyfill for personal use vs a proposed std.
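
For comparison, a small sketch of three ways to decide "is this a regexp".
The function names are hypothetical. The duck-typing check accepts any
object that has an exec property, and instanceof rejects regexps created
in another frame. The third check, via Object.prototype.toString, is shown
only as a common alternative; it can be fooled through Symbol.toStringTag,
so it is not std quality either.

```javascript
// Three hypothetical checks, for illustration only.
const byInstanceof = (x) => x instanceof RegExp;
const byDuckTyping = (x) => Boolean(x && x.exec);
const byBrand = (x) =>
  Object.prototype.toString.call(x) === '[object RegExp]';

// An ordinary object that merely has an exec method.
const impostor = { exec() { return null; } };

console.log(byInstanceof(/x/), byDuckTyping(/x/), byBrand(/x/));
// true true true
console.log(byInstanceof(impostor), byDuckTyping(impostor), byBrand(impostor));
// false true false
```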



-- 
    Cheers,
    --MarkM