ES RegExp parser

Dmitry Soshnikov dmitry.soshnikov at gmail.com
Mon Mar 20 23:40:45 UTC 2017


On Mon, Mar 20, 2017 at 9:19 AM, Dmitry Soshnikov <
dmitry.soshnikov at gmail.com> wrote:

>
> On Mon, Mar 20, 2017 at 8:36 AM Jason Orendorff <jason.orendorff at gmail.com>
> wrote:
>
>> The second approach, hands down.
>>
>> 1. With the first approach, you're setting up a situation where it's very
>> easy to write buggy analysis code: if you forget to check `re.quantifier`
>> anywhere, your code will run, but you have a bug. Much easier to only have
>> to check `re.type`.
>>
>> 2. If you have a regexp `re` and you want to programmatically build a
>> regexp that matches one or more repetitions of it, it's much easier to
>> write `{type: '+', expression: re}` than to have to examine `re.quantifier`
>> and (if it's already present) figure out how to modify it.
>>
>> 3. With the first approach, you don't have to represent `(?:)` group in
>> the AST at all (rather like how Esprima drops redundant parentheses). With
>> the latter, I think you have to, because it's possible for a single regexp
>> "node" to have multiple quantifiers: `/(?:\d{4,6})?/`
>>
>> To me this is not even a question.
>>
>
> Jason, thanks; all good points! I came to similar conclusions while
> experimenting, and wanted to double-check. Thanks for confirming.
>
>
OK, I added docs and specs for the AST node types, which can also serve as
good learning material:
https://www.npmjs.com/package/regexp-tree#ast-nodes-specification

Any feedback is welcome!
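
For anyone skimming the thread, here is a minimal sketch of the "second
approach" from the quoted discussion, where a quantifier is its own node
wrapping an expression. The node shapes and helper names (`char`,
`oneOrMore`, `optional`, `minLength`) are hypothetical and chosen only for
illustration; they are not necessarily what regexp-tree emits.

```javascript
// Hypothetical node constructors, for illustration only.
const char = value => ({type: 'Char', value});
const oneOrMore = expression => ({type: '+', expression});
const optional = expression => ({type: '?', expression});

// An analysis pass only has to dispatch on `type`; there is no separate
// `quantifier` property that could be forgotten.
function minLength(node) {
  switch (node.type) {
    case 'Char':
      return 1;
    case '+':
      return minLength(node.expression);
    case '?':
      return 0;
    default:
      throw new Error(`Unknown node type: ${node.type}`);
  }
}

const re = oneOrMore(char('a')); // corresponds to a+
const wrapped = optional(re);    // corresponds to (?:a+)?

console.log(minLength(re));      // 1
console.log(minLength(wrapped)); // 0
```

Stacking a second quantifier is just another wrapper, so nothing on the
inner node has to be inspected or modified.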

Parsing regexes is fun :) Along the way you realize that these are all
completely valid regexp patterns:

```
/^^^^$$$/.test(''); // true

/$^/.test(''); // true

/[---]/.test('-'); // true, a range from '-' to '-'!
/[-]/.test('-'); // true
```
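
A few non-empty inputs may make it clearer why these work. This is only a
quick sketch; the variable names are mine:

```javascript
// ^ and $ are zero-width assertions: they consume no characters, so any
// number of them can be stacked at the same position.
const stacked = /^^^^$$$/;
console.log(stacked.test(''));  // true
console.log(stacked.test('a')); // false

// Without the `m` flag, $ then ^ can only both hold on the empty string.
const endThenStart = /$^/;
console.log(endThenStart.test(''));  // true
console.log(endThenStart.test('a')); // false

// [---] is a range whose two endpoints are both '-', so it matches only '-'.
const dashRange = /[---]/;
console.log(dashRange.test('-')); // true
console.log(dashRange.test(',')); // false
console.log(dashRange.test('.')); // false
```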

Dmitry