That hash symbol

Brendan Eich brendan at mozilla.com
Tue Mar 29 16:22:40 PDT 2011


On Mar 29, 2011, at 3:30 PM, Bob Nystrom wrote:

> C#, CoffeeScript, and other languages use -> to link a formal parameter list to a function body, which requires bottom-up parsing in general (with comma as operator, as JS, C++, and C have; plus Harmony's destructuring and default parameter value proposals).
> 
> I'm not a parsing expert, but isn't destructuring just as hard to parse top-down as => for functions would be? Given:
> 
>     { a: b, c: d } =
> 
> A top-down parser will go up to "=" thinking it's parsing an object literal. Then it hits the "=" and has to either backtrack, or just transform the object literal AST into a destructuring pattern.

That's exactly what we do in SpiderMonkey, and IIRC Rhino does the same.

My point was about parsing, not parsing + some retrospective procedure on the AST that rewrites it. The latter is not just "parsing", it's a separate pass and not formalized in the ECMA-262 specs currently. More below.
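
To make the "retrospective procedure" concrete, here is a minimal sketch of such a rewrite. The node shapes and the toPattern name are hypothetical, chosen for illustration only; this is not SpiderMonkey's or Rhino's actual code.

```javascript
// Hypothetical AST node shapes, for illustration only.
// toPattern rewrites an already-parsed expression into an assignment
// pattern, the step a top-down parser applies once it sees "=".
function toPattern(node) {
  switch (node.type) {
    case "Identifier":
      return node; // a plain name is already a valid target
    case "ObjectLiteral":
      return {
        type: "ObjectPattern",
        properties: node.properties.map((p) => ({
          key: p.key,
          target: toPattern(p.value),
        })),
      };
    case "ArrayLiteral":
      return { type: "ArrayPattern", elements: node.elements.map(toPattern) };
    default:
      // e.g. a number or a call cannot be assigned to
      throw new SyntaxError("invalid destructuring target: " + node.type);
  }
}

// The AST built for `{ a: b, c: d }` before the parser has seen "=".
const literal = {
  type: "ObjectLiteral",
  properties: [
    { key: "a", value: { type: "Identifier", name: "b" } },
    { key: "c", value: { type: "Identifier", name: "d" } },
  ],
};
const pattern = toPattern(literal);
```

The point of the sketch is that the validity check (the default branch) runs after the expression has been parsed, which is what makes it a second pass rather than part of the grammar.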


> Wouldn't => work the same way?
> 
>     (a, b) =>
> 
> It parses "(a, b)" thinking it's a grouped comma operator (not exactly a common expression FWIW), then it hits "=>", realizes it's a function parameter decl, and then either backtracks or just transforms the left-hand AST into a param decl.

It ups the ante beyond "pure parsing", but yes, in the same way as destructuring.
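
A sketch of the analogous rewrite for infix arrow, under the same caveats: the "Sequence" node type and the toFormals helper are assumptions made up for this example, not any engine's API.

```javascript
// Hypothetical node shapes. A parenthesized comma expression such as
// `(a, b)` is assumed to parse to a "Sequence" node; a single `(a)`
// parses to the inner expression itself.
function toFormals(expr) {
  const items = expr.type === "Sequence" ? expr.expressions : [expr];
  return items.map((item) => {
    if (item.type !== "Identifier") {
      // `(a + 1, b) =>` parsed fine as an expression but is not a formal list
      throw new SyntaxError("invalid formal parameter: " + item.type);
    }
    return item.name;
  });
}

// The AST built for `(a, b)` before the parser has seen "=>".
const grouped = {
  type: "Sequence",
  expressions: [
    { type: "Identifier", name: "a" },
    { type: "Identifier", name: "b" },
  ],
};
const formalNames = toFormals(grouped);
```

As with destructuring, anything that is a valid expression but not a valid formal has to be rejected only after the arrow is seen.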

One example of the cost of this ante: Harmony wants an early error for an assignment that would create a global variable, or for use of an identifier that is not declared via a lexical binding form. These checks would have to come in a later pass, or be deferred manually until a closing ) without an immediately following arrow has been parsed.

There is no absolute top-down-parsing-must-be-"easy" requirement, and indeed the formal grammar is LR(1), so we need to validate each edition that way -- via a bottom-up grammar and even an automated checker (modulo ASI, which is separable and treated separately).

Waldemar has shown how using only top-down parsing, however formalized, without bottom-up grammatical validation can lead one astray:

https://mail.mozilla.org/pipermail/es-discuss/2008-October/007883.html

Something like the reverse, bottom-up validation without top-down being "easy enough", could also be a problem, since none of the major engines (AFAIK) uses bottom-up parsing.


> I understand this list isn't "teach me the details of the JS grammar", but it isn't obvious to me why an infix function syntax is any harder than destructuring as far as parsing performance is concerned.

I agree, and I said so at https://gist.github.com/888867#comments and here on the list:

https://mail.mozilla.org/pipermail/es-discuss/2011-March/013462.html

which seems to be right before your post in the archive at https://mail.mozilla.org/pipermail/es-discuss/2011-March/thread.html.

So yes, we can certainly consider infix-arrow, but it's more work for top-down parser implementors than leading octothorp or equivalent prefix. Destructuring requires similar but less work. That may not be enough to justify infix-arrow.

As usual, the big question for the future of the standard language is: do TC39 members -- in particular the parser implementors at Apple, Google, Microsoft, Mozilla, and Opera -- all agree?

We have already approved destructuring for ES.next -- it's in the harmony:proposals part of the wiki. But infix-arrow is not the same in degree of work, even if it is the same in kind, measured roughly.


> Empirically, I'd expect it to be less of an issue because the comma operator is so rare and parameter declarations tend to be short. Is it because there are things that would be valid in a parameter declaration that are *not* valid expressions?

Possibly, although if we use | to separate the optional formal receiver parameter declaration from the positional parameters, then we're still ok: (t = u | a, b, c) is a wacky expression: ((t = (u | a)), b, c) -- comma with assignment of bitwise-or as first comma-linked operand.

Again the detailed cost analysis shows pain due to precedence shifting. Rewriting this AST in a top-down parser to have a shape more like ((t = u), a, b, c) or, to label nodes functionally and with quoting elided, formals(opt_this_formal(t, u), a, b, c), might engender some strong complaints from implementors!
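
A rough sketch of what that reshaping involves, with hypothetical node types ("Sequence", "Assign", "BitOr", "Formals") and a made-up reshape helper. An expression parser groups `u | a` under the assignment, so the rewrite has to pull the right operand of | back out and make it the first positional formal.

```javascript
// Hypothetical node shapes; no engine's real AST is implied.
const id = (name) => ({ type: "Identifier", name });

// What an expression parser builds for `(t = u | a, b, c)`:
// "|" binds tighter than "=", which binds tighter than ",".
const parsedExpr = {
  type: "Sequence",
  expressions: [
    {
      type: "Assign",
      left: id("t"),
      right: { type: "BitOr", left: id("u"), right: id("a") },
    },
    id("b"),
    id("c"),
  ],
};

// Reshape into formals(opt_this_formal(t, u), a, b, c).
function reshape(seq) {
  const [first, ...rest] = seq.expressions;
  if (first.type !== "Assign" || first.right.type !== "BitOr") {
    throw new SyntaxError("expected `receiver = default | formal`");
  }
  return {
    type: "Formals",
    // receiver name and its default value come from opposite sides of "="
    receiver: { name: first.left.name, defaultValue: first.right.left },
    // the right operand of "|" is hoisted out to lead the positional list
    positional: [first.right.right, ...rest].map((n) => n.name),
  };
}

const reshaped = reshape(parsedExpr);
```

Unlike the destructuring case, this is not a node-for-node relabeling: a subtree moves across two levels of the tree, which is the precedence shifting at issue.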

/be


> 
> - bob
>  
> 
> Requiring bottom-up parsing has bounced off of implementors in the past, and with JavaScriptCore switching from a Bison grammar to a top-down hand-coded parser, I expect it will again.
> 
> 
>> I don't find syntax like this clear from a coder's POV, and there is the re-tooling issue with highlighting editors and the ability to trivially transform between the styles for faster adoption and old code minification -- while these issues certainly shouldn't be deciding factors for TC39 it is nice that leading-char lparen...rparen makes most of them go away.
> 
> That's the idea. We need to keep this simple or it will probably fall apart, either due to ambiguities, or implementors balking at too much complexity in parsing with more power than top-down parsers have.
> 
> /be
> 
> _______________________________________________
> es-discuss mailing list
> es-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es-discuss
> 
