ES parsing tools (Re: Short Functions)

Kam Kasravi kamkasravi at yahoo.com
Wed Jun 1 12:27:45 PDT 2011


Hi Claus. Yes, I would be very interested in your extensions. Thanks for the pointers below.


________________________________
From: Claus Reinke <claus.reinke at talk21.com>
To: Kam Kasravi <kamkasravi at yahoo.com>
Cc: es-discuss <es-discuss at mozilla.org>
Sent: Wednesday, June 1, 2011 6:48 AM
Subject: Re: ES parsing tools (Re: Short Functions)


Hi Kam,

> I've been experimenting with pegjs - which generates a parser based on the ecma-262 5th edition grammar.

yes, pegjs [1] and jsparse [2] were the alternatives I had been looking
at. pegjs is probably a bit ahead when you just want to use the parse
result; I chose to build on jsparse because it gives me more flexibility
for grammar-level experimentation (no fixed barrier between grammar and
grammar processor).

pegjs comes with a fairly complete transliteration of the ES5 grammar,
but you should give it a look-over for PEG vs LR issues (jsparse and
ometa have similar issues): a PEG parser will commit to a choice
alternative locally, even where the grammar was written with the
implicit requirement of a 1 token lookahead for parsing.

Typical example, in Statement: 
   ExpressionStatement / LabelledStatement

For input "id \n : 1", a PEG parser will commit to the first branch
after successfully parsing an Identifier and inserting a semicolon,
without ever checking whether the next token is ':'. (This assumes
the usual local and explicit implementation of ASI, instead of the
implicit error correction with 1 token lookahead required by the
ASI spec, which is global in that it depends on the whole grammar.)
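
A minimal sketch of the early commit described above, assuming small
hand-written PEG-style parser functions. All names here (skipWs, token,
choice, statementPeg, statementFixed) are hypothetical and belong to
neither pegjs nor jsparse; trying the labelled alternative first is only
one possible workaround, not necessarily what either tool does.

```javascript
// Illustrative PEG-style parsers (hypothetical names, not a library API).
// A parser maps (input, pos) to { pos, node } on success or null on failure.

// Skip whitespace and report whether a line terminator was crossed.
function skipWs(input, pos) {
  let newline = false;
  while (pos < input.length && /\s/.test(input[pos])) {
    if (input[pos] === "\n") newline = true;
    pos++;
  }
  return { pos, newline };
}

// Match a regular expression anchored at pos.
const token = (re) => (input, pos) => {
  const m = re.exec(input.slice(pos));
  return m ? { pos: pos + m[0].length, node: m[0] } : null;
};
const identifier = token(/^[A-Za-z_$][\w$]*/);
const primary = token(/^[\w$]+/); // identifier or number, enough for the demo

// Local, explicit ASI: accept ';', or insert one at end of input or after
// a newline, without looking at the token that follows.
function semicolon(input, pos) {
  const ws = skipWs(input, pos);
  if (input[ws.pos] === ";") return { pos: ws.pos + 1 };
  return ws.newline || ws.pos >= input.length ? { pos: ws.pos } : null;
}

function expressionStatement(input, pos) {
  const e = primary(input, pos);
  const semi = e && semicolon(input, e.pos);
  return semi
    ? { pos: semi.pos, node: { type: "ExpressionStatement", expression: e.node } }
    : null;
}

// Identifier ':' Statement; the statement parser is passed in for recursion.
const labelledStatement = (statement) => (input, pos) => {
  const id = identifier(input, pos);
  if (!id) return null;
  const colonAt = skipWs(input, id.pos).pos;
  if (input[colonAt] !== ":") return null;
  const body = statement(input, skipWs(input, colonAt + 1).pos);
  return body && {
    pos: body.pos,
    node: { type: "LabelledStatement", label: id.node, body: body.node },
  };
};

// PEG ordered choice: the first alternative that succeeds wins.
const choice = (...alts) => (input, pos) => {
  for (const alt of alts) {
    const r = alt(input, pos);
    if (r) return r;
  }
  return null;
};

// Same order as the example above: commits to ExpressionStatement.
const statementPeg = (input, pos) =>
  choice(expressionStatement, labelledStatement(statementPeg))(input, pos);

// Reordered: the labelled alternative gets a chance to see the ':'.
const statementFixed = (input, pos) =>
  choice(labelledStatement(statementFixed), expressionStatement)(input, pos);

const input = "id \n : 1";
const early = statementPeg(input, 0);
const fixed = statementFixed(input, 0);
console.log(early.node.type, early.pos); // ExpressionStatement 5
console.log(fixed.node.type, fixed.pos); // LabelledStatement 8
```

With the original order the parse stops at position 5, just before the
':', so an enclosing program rule would then fail on the remaining input.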

Apart from fixing some such issues in jsparse, I've moved its grammar
closer to ES5. It is not complete, but it can parse code like
fulljslint.js or the ASI-reliant read-json.js [3] from node's npm,
mentioned in an earlier thread here. I've also added parse error
messages and grammar debugging help (grammar rule tracing and a rule
stack record).
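
To illustrate what rule tracing and a rule stack record can look like
for parser functions of this kind, here is a minimal sketch. It is not
the jsparse extension mentioned above (which is not online); makeTracer,
rule and deepest are hypothetical names, and the parser shape
(input, pos) -> result or null is an assumption.

```javascript
// Hypothetical tracing helper: wraps named rules so that every attempt is
// logged and the rule stack at the furthest failure is remembered.
function makeTracer() {
  const stack = []; // rules currently being attempted
  const log = [];   // indented enter/exit trace lines
  let deepest = { pos: -1, stack: [] }; // rule stack at the furthest failure

  function rule(name, parser) {
    return (input, pos) => {
      stack.push(name);
      const indent = "  ".repeat(stack.length - 1);
      log.push(`${indent}> ${name} @${pos}`);
      const result = parser(input, pos);
      if (!result && pos > deepest.pos) {
        deepest = { pos, stack: stack.slice() };
      }
      log.push(`${indent}< ${name} ${result ? "ok" : "fail"}`);
      stack.pop();
      return result;
    };
  }

  return { rule, log, deepest: () => deepest };
}

// Usage with two toy rules.
const t = makeTracer();
const digit = t.rule("Digit", (input, pos) =>
  /[0-9]/.test(input[pos] || "") ? { pos: pos + 1 } : null);
const twoDigits = t.rule("TwoDigits", (input, pos) => {
  const first = digit(input, pos);
  return first && digit(input, first.pos);
});

twoDigits("4x", 0);
console.log(t.log.join("\n"));
console.log(t.deepest().pos, t.deepest().stack.join(" > "));
```

The recorded stack at the furthest failure position is one simple basis
for a parse error message ("while parsing TwoDigits > Digit at 1").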

> I've been building a backend that walks the ast to regenerate the code (less whitespace-related formatting).
> The nice thing about pegjs is most of the ast nodes agree with the parser api (https://developer.mozilla.org/en/SpiderMonkey/Parser_API)
> My approach is to add various strawmen and have them translated to plain javascript - transpiler approach.
> I've been very happy with the ast it generates thus far. 
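
A minimal sketch of such an AST-walking backend, assuming nodes shaped
like the SpiderMonkey Parser API (type, body, expression, name, value,
operator, left, right). Only a handful of node types are covered, the
generate function is a hypothetical name, and the pegjs AST may differ
in detail.

```javascript
// Regenerate source text from a Parser-API-style AST (partial sketch).
function generate(node) {
  switch (node.type) {
    case "Program":
      return node.body.map(generate).join("\n");
    case "ExpressionStatement":
      return generate(node.expression) + ";";
    case "LabeledStatement":
      return generate(node.label) + ": " + generate(node.body);
    case "BinaryExpression":
      // Always parenthesize, so precedence never has to be reconstructed.
      return "(" + generate(node.left) + " " + node.operator + " " +
        generate(node.right) + ")";
    case "Identifier":
      return node.name;
    case "Literal":
      // Covers numbers, strings, booleans and null, but not regexp literals.
      return JSON.stringify(node.value);
    default:
      throw new Error("unhandled node type: " + node.type);
  }
}

// Hand-built AST for: id: 1 + x;
const ast = {
  type: "Program",
  body: [{
    type: "LabeledStatement",
    label: { type: "Identifier", name: "id" },
    body: {
      type: "ExpressionStatement",
      expression: {
        type: "BinaryExpression",
        operator: "+",
        left: { type: "Literal", value: 1 },
        right: { type: "Identifier", name: "x" },
      },
    },
  }],
};

const out = generate(ast);
console.log(out); // id: (1 + x);
```

A transpiler in this style would add cases that rewrite strawman node
types into plain ES5 nodes before or during generation. Original
whitespace and comments are lost unless the AST records them.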

Thanks for the reminder. There actually is a strawman for parser
API standardization: 
   http://wiki.ecmascript.org/doku.php?id=strawman:ast

Sadly, that isn't in Harmony yet (oversight, or decision?) but if
everyone here agrees on the general format, I'd add it as an
option to my parser. It would be good to agree on how to
preserve whitespace and comments (essential for source to
source transformations).

Claus

[1] http://pegjs.majda.cz/
[2] http://github.com/doublec/jsparse
   (my own extensions to Chris' work are not online yet,
   but I could make a snapshot available if there is interest)
[3] https://github.com/isaacs/npm/blob/master/lib/utils/read-json.js
   (a useful ASI-reliant parser test case)