May 24-26 rough meeting notes

Waldemar Horwat waldemar at google.com
Tue May 31 15:59:58 PDT 2011


On 05/31/11 14:55, Brendan Eich wrote:
> On May 31, 2011, at 2:30 PM, Waldemar Horwat wrote:
>
>> I would not want to use anything like a PEG to standardize a grammar.  Here's why:
>>
>> PEG being unambiguous by construction simply means that it resolves all ambiguities by picking the earliest rule.  This turns all rules following the first one into negative rules:  X matches Z only if it DOESN'T match a Y or a Q or a B or ....  You could pick the same strategy to disambiguate an LR(1) grammar, and it would be equally bad.
>>
>> Negative rules are the bane of grammars and behind the majority of the problems with the C++ grammar, including the examples I listed earlier.  They make a grammar non-understandable because the order of the rules is subtly significant and makes it hard to reason about when an X matches a Z; a language extension might expand the definition of Y to make an X no longer match a Q, and you wouldn't know it just by looking at a grammar with negative rules.  In a positive-rule-only grammar you'd discover the problem right away because the grammar wouldn't compile.
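
To make the ordering hazard concrete, here is a rough sketch of PEG-style ordered choice. The helpers literal, either and ordered_choice, and the toy rules Y and Q, are made up for illustration and do not come from any real PEG tool. X is defined as Y / Q; widening Y in a hypothetical "language extension" silently changes which alternative claims the input "a", and nothing reports a conflict.

```python
def literal(s):
    """Parser that matches the exact string s at position pos."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def either(*parsers):
    """Helper used to widen a rule: succeeds with the first parser that matches."""
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse

def ordered_choice(alternatives):
    """PEG-style choice: the earliest matching alternative wins, so every
    later alternative implicitly means 'and none of the earlier ones matched'."""
    def parse(text, pos=0):
        for name, p in alternatives:
            end = p(text, pos)
            if end is not None:
                return name, end
        return None
    return parse

Q = literal("a")

# Version 1 of the toy language: X <- Y / Q with Y <- "a!"
x_v1 = ordered_choice([("Y", literal("a!")), ("Q", Q)])

# Version 2 widens Y to also accept "a"; Q is untouched but is now shadowed.
x_v2 = ordered_choice([("Y", either(literal("a!"), literal("a"))), ("Q", Q)])

print(x_v1("a"))  # ('Q', 1)
print(x_v2("a"))  # ('Y', 1)
```

A grammar tool that accepts only positive rules would be expected to reject version 2 as a conflict between Y and Q, which is the early warning described above.
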
>
> Thanks -- you've made this point before and I've agreed with it. It helps to restate and amplify it, I think, because my impression is that not many people "get it".
>
> PEG users may be happy with their JS parsers at any given point in the language's standard version-space, of course.
>
> It still could be that we use LL(1) or another positive-rule-only grammar, of course, but we can hash that out separately.
>
>
>> Negative rules also interact badly with both semicolon insertion and division-vs-regexp lexer disambiguation.  One might naively think that semicolon insertion would be an ideal match for negative rules:  You first try to parse
>>
>>   tokens-on-line1
>>   tokens-on-line2
>>
>> as a single statement and, only if that fails, you move on to parsing it as two statements with a virtual semicolon between them.  That, however, doesn't work.  Here's a simple counterexample:
>>
>>   a + b
>>   (c) = d
>>
>> Negative rules would insert a virtual semicolon here because
>>
>>   a + b(c) = d
>>
>> is not a valid parse.  However, the correct ECMAScript behavior is not to insert a semicolon.
>
> Heh; this doesn't pass the first rule of ASI fight-club: there's no insertion if there is no error.

I don't understand the premise of your comment on ASI.  Here there *is* an error in parsing without a virtual semicolon and no error in parsing with a virtual semicolon, so a PEG-like ASI would erroneously insert one.
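
A rough sketch of the difference, using a toy expression grammar that is only an assumption for illustration (identifiers, parentheses, single-argument calls, + and =), not the ECMAScript grammar. All function names are hypothetical. parse_program_backtracking models the PEG-like strategy (try one statement, and on failure retry with a break at the line end); parse_program_spec models the token-level rule, where a virtual semicolon is allowed only at an offending token that is preceded by a line break.

```python
import re

def tokenize(src):
    """Split src into (text, newline_before) pairs; knows identifiers and ( ) + = ; only."""
    tokens, newline = [], False
    for m in re.finditer(r"(\n)|([A-Za-z_]\w*|[()+=;])", src):
        if m.group(1):
            newline = True
        else:
            tokens.append((m.group(2), newline))
            newline = False
    return tokens

def peek(toks, i):
    return toks[i][0] if i < len(toks) else None

def parse_parenthesized(toks, i):
    """Parse '(' Assignment ')' starting at the '(' token; return the index after ')'."""
    i = parse_assignment(toks, i + 1)
    if peek(toks, i) != ")":
        raise SyntaxError("expected ')'")
    return i + 1

def parse_lhs(toks, i):
    """LeftHandSide := (identifier | '(' Assignment ')') ('(' Assignment ')')*"""
    t = peek(toks, i)
    if t == "(":
        i = parse_parenthesized(toks, i)
    elif t is not None and (t[0].isalpha() or t[0] == "_"):
        i += 1
    else:
        raise SyntaxError("expected an operand")
    while peek(toks, i) == "(":  # call with one argument
        i = parse_parenthesized(toks, i)
    return i

def parse_assignment(toks, i):
    """Assignment := LeftHandSide '=' Assignment | LeftHandSide ('+' LeftHandSide)*"""
    i = parse_lhs(toks, i)
    if peek(toks, i) == "=":
        return parse_assignment(toks, i + 1)
    while peek(toks, i) == "+":
        i = parse_lhs(toks, i + 1)
    return i  # index of the first token that cannot continue the statement

def parses_as_one_statement(src):
    toks = tokenize(src)
    try:
        return parse_assignment(toks, 0) == len(toks)
    except SyntaxError:
        return False

def parse_program_backtracking(line1, line2):
    """PEG-like ASI: prefer one statement, fall back to two. Returns the statement count."""
    if parses_as_one_statement(line1 + " " + line2):
        return 1
    if parses_as_one_statement(line1) and parses_as_one_statement(line2):
        return 2  # virtual semicolon inserted at the line break
    raise SyntaxError("no parse")

def parse_program_spec(src):
    """Token-level ASI: insert only at an offending token preceded by a newline."""
    toks = tokenize(src)
    i, statements = 0, 0
    while i < len(toks):
        i = parse_assignment(toks, i)
        statements += 1
        if i == len(toks):
            break
        text, newline_before = toks[i]
        if text == ";":
            i += 1
        elif not newline_before:
            raise SyntaxError(f"unexpected {text!r}")
        # otherwise: virtual semicolon before toks[i]; keep parsing from there
    return statements

print(parse_program_backtracking("a + b", "(c) = d"))  # 2: a semicolon was inserted
try:
    parse_program_spec("a + b\n(c) = d")
except SyntaxError as err:
    print("SyntaxError:", err)  # the offending '=' is not preceded by a line break
```

In this toy model the backtracking strategy accepts the two lines as two statements, while the token-level rule reports an error at the = because that token does not follow a line break.
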

     Waldemar


More information about the es-discuss mailing list