ES RegExp parser
dmitry.soshnikov at gmail.com
Sun Mar 19 21:52:13 UTC 2017
I started working on an ECMAScript regular expressions parser with an AST
format similar to Mozilla's Parser API. This might later be extended to
support more powerful constructs, such as lookbehind assertions, multiline
regexes, named groups, comments, etc.
While this is mostly an FYI post (someone may find it useful for regex
analysis in source transformation tools or source code editors), I'd
appreciate any feedback on the specification of the AST nodes (currently
made up entirely by me).
For example, the ES spec's RegExp grammar defines a quantifier, but it
doesn't say (and of course shouldn't) which AST node this quantifier node
should belong to.
This leaves open questions such as whether a quantifier should be part of
the parsed expression or, conversely, be a main node itself with the
expression as a sub-node.
In other words, which format is more appropriate, given that AST traversal
tools will later be used to implement an NFA/DFA engine on top of it:
Char is main, quantifier is a sub-node:
The quantifier is main (creating a `Repetition` AST node), char is the
`expression` sub-node:
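
To make the two options concrete, here is a rough sketch of how the regex
`/a*/` might look under each format. Only the `Repetition` node and the
`expression` and `quantifier` properties come from this post; the other
names and fields (`QuantifiedChar`, `PlainChar`, `kind`, `greedy`, `value`)
are assumptions made up for illustration, not the parser's actual output.

```typescript
// Hypothetical node shapes for /a*/. Field names other than `expression`
// and `quantifier` are illustrative assumptions.

interface Quantifier {
  type: 'Quantifier';
  kind: '*' | '+' | '?';
  greedy: boolean;
}

// Option 1: the char is the main node and carries an optional quantifier.
interface QuantifiedChar {
  type: 'Char';
  value: string;
  quantifier: Quantifier | null;
}

// Option 2: a Repetition node is the main node and wraps the char.
interface PlainChar {
  type: 'Char';
  value: string;
}

interface Repetition {
  type: 'Repetition';
  expression: PlainChar;
  quantifier: Quantifier;
}

const star: Quantifier = { type: 'Quantifier', kind: '*', greedy: true };

// /a*/ with the char as the main node.
const option1: QuantifiedChar = { type: 'Char', value: 'a', quantifier: star };

// /a*/ with the quantifier producing the main node.
const option2: Repetition = {
  type: 'Repetition',
  expression: { type: 'Char', value: 'a' },
  quantifier: star,
};
```

In the first shape every node type that can be quantified needs its own
`quantifier` slot; in the second, quantification lives in one node type.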
For now I have chosen the second approach (with the `Repetition` node) as
more practical when building an AST traversal: a generic `onRepetition`
handler can call `onChar` internally for its `expression`, instead of
making `onChar` (or any other node handler) check for and handle its own
`quantifier` and do the repetition.
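
A minimal sketch of such a traversal, assuming the `Repetition` format. The
handler names `onChar` and `onRepetition` are from this post; the `traverse`
function, the `Handlers` interface, the `visit` callback and the node type
names are hypothetical and only show how the dispatch could work.

```typescript
// Hypothetical node types; only `expression` and `quantifier` are from the post.
type QuantifierNode = { type: 'Quantifier'; kind: '*' | '+' | '?' };
type CharNode = { type: 'Char'; value: string };
type RepetitionNode = {
  type: 'Repetition';
  expression: AstNode;
  quantifier: QuantifierNode;
};
type AstNode = CharNode | RepetitionNode;

interface Handlers {
  onChar(node: CharNode): string;
  // `visit` lets the handler process its sub-expression generically.
  onRepetition(node: RepetitionNode, visit: (n: AstNode) => string): string;
}

// Dispatch on the node type; repetition is handled in exactly one place.
function traverse(node: AstNode, handlers: Handlers): string {
  const visit = (n: AstNode): string => traverse(n, handlers);
  switch (node.type) {
    case 'Char':
      return handlers.onChar(node);
    case 'Repetition':
      return handlers.onRepetition(node, visit);
  }
}

// Example handlers that just describe the tree as a string.
const describingHandlers: Handlers = {
  // onChar knows nothing about quantifiers.
  onChar: (node) => `char(${node.value})`,
  onRepetition: (node, visit) =>
    `repeat[${node.quantifier.kind}](${visit(node.expression)})`,
};

// The AST for /a*/ in the Repetition format.
const ast: AstNode = {
  type: 'Repetition',
  expression: { type: 'Char', value: 'a' },
  quantifier: { type: 'Quantifier', kind: '*' },
};

const description = traverse(ast, describingHandlers);
console.log(description); // prints "repeat[*](char(a))"
```

With the first format, `onChar` (and every other handler) would instead
have to inspect its own `quantifier` before producing a result.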
Anyway, if you have any thoughts or feedback on the AST node format, please
feel free to contact me.