ES parsing tools (Re: Short Functions)
claus.reinke at talk21.com
Thu Jun 9 13:59:09 PDT 2011
> Hi Claus. Yes I would be very interested in your extensions. Thanks for
> the pointers below.
My code is now available on github (JS parser combinators and a
grammar edging closer to ES5, an AST with unparsing support; an
experiment beyond ES5 is included as an example, by translating
new syntax for paren-free function calls/definitions to old syntax):
Please consider this a semi-public preview; there is still much to do
and many of the comments are notes to self rather than user docs.
I took the liberty of adding most of a SpiderMonkey-style AST
(there are still some unwanted deviations, as well as a few items
needing clarification, so you won't be able to port your code just
yet). It turns out that this AST spec is biased towards evaluation,
so I needed to augment it to support proper unparsing.
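To illustrate the kind of information an evaluation-oriented AST leaves out, here is a minimal sketch of an unparser over a few SpiderMonkey-style node shapes (ExpressionStatement, BinaryExpression, Identifier, Literal). It is not the code from the repository, and the augmentations mentioned above are not spelled out in this message. The PRECEDENCE table and the unparse function are hypothetical names for this example only. The tree carries no record of source parentheses, so the sketch has to re-derive them from operator precedence, and whitespace and comments cannot be recovered at all.

```javascript
// Hypothetical precedence table for a handful of left-associative operators.
const PRECEDENCE = { '*': 2, '/': 2, '+': 1, '-': 1 };

// Turn a small SpiderMonkey-style AST back into source text.
// parentPrec and isRightOperand exist only to decide where parentheses
// must be re-inserted, since the AST itself does not record them.
function unparse(node, parentPrec = 0, isRightOperand = false) {
  switch (node.type) {
    case 'ExpressionStatement':
      return unparse(node.expression) + ';';
    case 'Identifier':
      return node.name;
    case 'Literal':
      return JSON.stringify(node.value);
    case 'BinaryExpression': {
      const prec = PRECEDENCE[node.operator];
      const text =
        unparse(node.left, prec, false) +
        ' ' + node.operator + ' ' +
        unparse(node.right, prec, true);
      // Parenthesize when this operator binds less tightly than its parent,
      // or equally tightly while sitting on the right of a left-associative one.
      const needsParens =
        prec < parentPrec || (isRightOperand && prec === parentPrec);
      return needsParens ? '(' + text + ')' : text;
    }
    default:
      throw new Error('unsupported node type: ' + node.type);
  }
}

// AST for the source text "(1 + 2) * x;" (the parentheses are not in the tree).
const ast = {
  type: 'ExpressionStatement',
  expression: {
    type: 'BinaryExpression',
    operator: '*',
    left: {
      type: 'BinaryExpression',
      operator: '+',
      left: { type: 'Literal', value: 1 },
      right: { type: 'Literal', value: 2 }
    },
    right: { type: 'Identifier', name: 'x' }
  }
};

console.log(unparse(ast));
```

Precedence-based reconstruction yields code with the same meaning, but not necessarily the same text as the original source, which is why a source-to-source tool would likely want extra fields on the nodes.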
As I said, I've got a long and growing TODO list, but if there are
any particular improvements/fixes that would help you or others
here, knowing about them would influence my priorities.
Thanks for your interest,
> From: Claus Reinke <claus.reinke at talk21.com>
> To: Kam Kasravi <kamkasravi at yahoo.com>
> Cc: es-discuss <es-discuss at mozilla.org>
> Sent: Wednesday, June 1, 2011 6:48 AM
> Subject: Re: ES parsing tools (Re: Short Functions)
> Hi Kam,
>> I've been experimenting with pegjs - which generates a parser based on
>> the ecma-262 5th edition grammar.
> yes, pegjs and jsparse were the alternatives I had been looking
> at. pegjs is probably a bit ahead when you just want to use the parse
> result; I chose to build on jsparse because it gives me more flexibility
> for grammar-level experimentation (no fixed barrier between grammar and
> grammar processor).
> pegjs comes with a fairly complete transliteration of the ES5 grammar,
> but you should give it a look-over for PEG vs LR issues (jsparse and
> ometa have similar issues): a PEG parser will commit to a choice
> alternative locally, even where the grammar was written with the
> implicit requirement of a 1 token lookahead for parsing.
> Typical example, in Statement:
> ExpressionStatement / LabelledStatement
> For input "id \n : 1", a PEG parser will commit to the first branch
> after successfully parsing an Identifier and inserting a semicolon,
> without ever checking whether the next token is ':' (this assumes
> the usual local and explicit implementation of ASI, instead of the
> implicit error correction with 1 token lookahead required by the
> ASI spec; which is global in that it depends on the whole grammar).
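The commitment problem described above can be reproduced with a few hand-written combinators. This is a minimal sketch, not jsparse or pegjs code: token, seq, choice, map, skipNL, localASI and makeStatement are hypothetical helpers, and the input is a pre-tokenized array in which 'NL' stands for a line terminator. Putting LabelledStatement first is shown only as one possible workaround for this particular case, not as the fix applied in any of the tools mentioned.

```javascript
// A parser is a function (toks, pos) -> { node, pos } on success, null on failure.
const token = (t) => (toks, pos) =>
  toks[pos] === t ? { node: t, pos: pos + 1 } : null;

// Run parsers one after another, collecting their results.
const seq = (...ps) => (toks, pos) => {
  const nodes = [];
  for (const p of ps) {
    const r = p(toks, pos);
    if (r === null) return null;
    nodes.push(r.node);
    pos = r.pos;
  }
  return { node: nodes, pos };
};

// PEG ordered choice: the first alternative that succeeds wins, locally.
const choice = (...ps) => (toks, pos) => {
  for (const p of ps) {
    const r = p(toks, pos);
    if (r !== null) return r;
  }
  return null;
};

const map = (p, f) => (toks, pos) => {
  const r = p(toks, pos);
  return r === null ? null : { node: f(r.node), pos: r.pos };
};

// Skip any number of line terminators.
const skipNL = (toks, pos) => {
  while (toks[pos] === 'NL') pos++;
  return { node: null, pos };
};

// "Local and explicit" ASI: accept ';', or pretend one was inserted when a
// newline or the end of input follows, without looking at the next token.
const localASI = (toks, pos) => {
  if (toks[pos] === ';') return { node: ';', pos: pos + 1 };
  if (toks[pos] === 'NL' || pos >= toks.length) return { node: '(inserted ;)', pos };
  return null;
};

// Build a tiny Statement rule with the two alternatives in either order.
function makeStatement(labelFirst) {
  const statement = (toks, pos) => alternatives(toks, pos);
  const expression = choice(token('id'), token('1'));
  const expressionStatement = map(
    seq(expression, localASI),
    ([expr]) => ({ type: 'ExpressionStatement', expr })
  );
  const labelledStatement = map(
    seq(token('id'), skipNL, token(':'), skipNL, statement),
    ([label, , , , body]) => ({ type: 'LabelledStatement', label, body })
  );
  const alternatives = labelFirst
    ? choice(labelledStatement, expressionStatement)
    : choice(expressionStatement, labelledStatement);
  return statement;
}

// Tokens for the input "id \n : 1".
const input = ['id', 'NL', ':', '1'];

const pegOrder = makeStatement(false)(input, 0);
const reordered = makeStatement(true)(input, 0);

console.log(pegOrder.node.type, 'consumed up to token', pegOrder.pos);
console.log(reordered.node.type, 'consumed up to token', reordered.pos);
```

With ExpressionStatement first, the parser commits after the identifier and the inserted semicolon, leaving ": 1" unparsed; with the alternatives swapped, the same input is consumed completely as a labelled statement.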
> Apart from fixing some such issues in jsparse, I've moved its grammar
> closer to ES5 (not complete, but can parse code like fulljslint.js or the
> ASI-reliant read-json.js from node's npm, mentioned in an earlier
> thread here), added parse error messages and grammar debugging help
> (grammar rule tracing and rule stack record).
>> I've been building a backend that walks the ast to regenerate the code
>> (less whitespace-related formatting).
>> The nice thing about pegjs is most of the ast nodes agree with the parser
>> api (https://developer.mozilla.org/en/SpiderMonkey/Parser_API)
>> My approach is to add various strawmen and have them translated to plain
>> I've been very happy with the ast it generates thus far.
> Thanks for the reminder. There actually is a strawman for parser
> API standardization:
> Sadly, that isn't in Harmony yet (oversight, or decision?) but if
> everyone here agrees on the general format, I'd add it as an
> option to my parser. It would be good to agree on how to
> preserve whitespace and comments (essential for source to
> source transformations).
>  http://pegjs.majda.cz/
>  http://github.com/doublec/jsparse
> (my own extensions to Chris' work are not online yet,
> but I could make a snapshot available if there is interest)
>  https://github.com/isaacs/npm/blob/master/lib/utils/read-json.js
> (a useful ASI-reliant parser test case)