An example of the third basic rule of semicolon insertion

Kang-Hao (Kenny) Lu kanghaol at oupeng.com
Sun Jan 20 07:47:36 PST 2013


(13/01/20 15:07), heathmatlock wrote:
> Substitute "chance" for "check" in the previous email.

(I couldn't find your previous mail either in my es-discuss folder or
the online archive, though I noticed that this mailing list drops more
mails than a usual mailing list at W3C.)

> On Sun, Jan 20, 2013 at 1:05 AM, heathmatlock <heathmatlock at gmail.com> wrote:
>> I was looking through the section on ASI again, and I'm unsure which
>> is a valid example of the third basic rule which states:
>>
>> [snip]
>>
>> I looked at the examples section, and the only statement that might be
>> valid for this rule is the return\n a + b example, but that seems more
>> like a chance against a line terminator which is what the first rule
>> does.
>>
>> Would anyone care to offer an example or explain how the example
>> mentioned above should be handled by the third rule?

I had very similar trouble understanding this rule too and I somehow
developed my own (fragile) explanation:

  # When, as the program is parsed from left to right, a token is
  # encountered that is allowed by some production of the grammar,

So there are two points here:

  1. A LineTerminator (and WhiteSpace/Comment) isn't a token in this
     sentence.

Note that the Token production doesn't include LineTerminator (but by
the same reasoning DivPunctuator is not a token either, which doesn't
seem right). Therefore, "the token encountered" in your example is "a".

  2. A production does not include the meta annotation(s), which
     are the "[no LineTerminator here]" and "[lookahead ∉ X]"s.

This seems to imply the view that ECMAScript has an ambiguous formal
grammar, and that the meta annotations are what resolve the
ambiguities.

So, here, "a token is encountered that is allowed by some production of
the grammar" is satisfied by the token "a" and the production
"ReturnStatement: return Expression ;".

  # but the production is a restricted production and the token would be
  # the first token for a terminal or nonterminal immediately following
  # the annotation '[no LineTerminator here]' within the restricted
  # production (and therefore such a token is called a restricted
  # token), and the restricted token is separated from the previous
  # token by at least one LineTerminator, then a semicolon is
  # automatically inserted before the restricted token.
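
To make this concrete, here is a minimal sketch of the return\n a + b
example as I read it. The function names sumBroken and sumFixed are made
up for illustration. Under the interpretation above, "a" is the
restricted token, it is separated from "return" by a LineTerminator, and
so a semicolon is inserted before it.

```javascript
// Hypothetical names, for illustration only.
function sumBroken(a, b) {
  // A LineTerminator separates `return` from the token `a`, so a
  // semicolon is inserted before `a`: this parses as `return; a + b;`.
  return
  a + b;
}

function sumFixed(a, b) {
  // No LineTerminator between `return` and `a`, so nothing is inserted.
  return a + b;
}

console.log(sumBroken(1, 2)); // undefined
console.log(sumFixed(1, 2));  // 3
```

The expression statement after the inserted semicolon is still valid
code. It is simply never reached.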

One piece of evidence for this interpretation is the following prose in
5.1.2 of ES5:

  # Input elements other than white space and comments form the
  # terminal symbols for the syntactic grammar for ECMAScript and are
  # called ECMAScript /tokens/. These tokens are the reserved words,
  # identifiers, literals, and punctuators of the ECMAScript language.
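
As a rough illustration of this reading, here is a toy sketch in which a
LineTerminator is never emitted as a token but is remembered as a flag
on the next token. Everything here (toyTokens, applyRule3ToReturn, the
newlineBefore flag) is hypothetical and only handles identifiers, "+",
";", spaces, tabs and "\n". It is not how any real engine or the
specification's algorithm is written, and it only knows about the
"return" restricted production.

```javascript
// Toy lexer: whitespace and newlines are consumed but never become tokens.
function toyTokens(source) {
  const tokens = [];
  let newlineBefore = false;
  const pattern = /\n|[ \t]+|[A-Za-z_$][A-Za-z0-9_$]*|[+;]/g;
  for (const match of source.matchAll(pattern)) {
    const text = match[0];
    if (text === "\n") {
      newlineBefore = true; // remembered, but not emitted as a token
    } else if (text.trim() !== "") { // skip runs of spaces and tabs
      tokens.push({ text, newlineBefore });
      newlineBefore = false;
    }
  }
  return tokens;
}

// Toy version of the third rule, restricted to `return`: if the token
// after `return` begins an expression and a newline precedes it,
// insert a semicolon before that token.
function applyRule3ToReturn(tokens) {
  const out = [];
  tokens.forEach((token, i) => {
    const afterReturn = i > 0 && tokens[i - 1].text === "return";
    if (afterReturn && token.newlineBefore && token.text !== ";") {
      out.push(";");
    }
    out.push(token.text);
  });
  return out.join(" ");
}

console.log(applyRule3ToReturn(toyTokens("return\n a + b;")));
// return ; a + b ;
console.log(applyRule3ToReturn(toyTokens("return a + b;")));
// return a + b ;
```

In this sketch "the token encountered" after "return" is "a" in both
inputs. The only difference is the newlineBefore flag.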


In any case, I agree that this is quite confusing, and perhaps chapter 7
should clarify that DivPunctuator and RegularExpressionLiteral are
/tokens/. They are just not Tokens.

Alternatively, I agree with Heath that this rule should get folded into
rule 1. Perhaps we should adopt the common belief that
LineTerminator/WhiteSpace/Comment are /tokens/.


Cheers,
Kenny
-- 
Web Specialist, Oupeng Browser, Beijing
Try Oupeng: http://www.oupeng.com/

