An example of the third basic rule of semicolon insertion

heathmatlock heathmatlock at gmail.com
Sun Jan 20 20:44:53 PST 2013


V8 and SpiderMonkey both ignore the third rule:
https://gist.github.com/4583287


On Sun, Jan 20, 2013 at 9:47 AM, Kang-Hao (Kenny) Lu
<kanghaol at oupeng.com> wrote:
> (13/01/20 15:07), heathmatlock wrote:
>> Substitute "chance" for "check" in the previous email.
>
> (I couldn't find your previous mail in either my es-discuss folder or
> the online archive, though I have noticed that this mailing list drops
> more mail than a typical W3C mailing list.)
>
>> On Sun, Jan 20, 2013 at 1:05 AM, heathmatlock <heathmatlock at gmail.com> wrote:
>>> I was looking through the section on ASI again, and I'm unsure which
>>> is a valid example of the third basic rule which states:
>>>
>>> [snip]
>>>
>>> I looked at the examples section, and the only statement that might be
>>> valid for this rule is the return\n a + b example, but that seems more
>>> like a chance against a line terminator which is what the first rule
>>> does.
>>>
>>> Would anyone care to offer an example or explain how the example
>>> mentioned above should be handled by the third rule?
>
> I had very similar trouble understanding this rule too and I somehow
> developed my own (fragile) explanation:
>
>   # When, as the program is parsed from left to right, a token is
>   # encountered that is allowed by some production of the grammar,
>
> So there are two points here:
>
>   1. A LineTerminator (and WhiteSpace/Comment) isn't a token in this
>      sentence.
>
> Note that the Token production doesn't include LineTerminator (but by
> the same reasoning DivPunctuator is not a token either, which doesn't
> seem right). Therefore, "the token encountered" in your example is "a".
>
>   2. A production does not include its meta annotations, i.e. the
>      "[no LineTerminator here]" and "[lookahead ∉ X]" markers.
>
> This seems to imply the "concept" that ECMAScript has an ambiguous
> formal grammar, and that the meta annotations are what resolve the
> ambiguities.
>
> So, here, "a token is encountered that is allowed by some production of
> the grammar" is satisfied by the token "a" and the production
> "ReturnStatement: return Expression ;".
>
>   # but the production is a restricted production and the token would be
>   # the first token for a terminal or nonterminal immediately following
>   # the annotation '[no LineTerminator here]' within the restricted
>   # production (and therefore such a token is called a restricted
>   # token), and the restricted token is separated from the previous
>   # token by at least one LineTerminator, then a semicolon is
>   # automatically inserted before the restricted token.
>
> Evidence for this interpretation is the following prose in 5.1.2 of ES5:
>
>   # Input elements other than white space and comments form the
>   # terminal symbols for the syntactic grammar for ECMAScript and are
>   # called ECMAScript /tokens/. These tokens are the reserved words,
>   # identifiers, literals, and punctuators of the ECMAScript language.
>
>
> In any case, I agree that this is quite confusing, and perhaps chapter 7
> should clarify that DivPunctuator and RegularExpressionLiteral are
> /tokens/. They are just not Tokens.
>
> Alternatively, I agree with Heath that this rule should get folded into
> rule 1. Perhaps we should adopt the common belief that
> LineTerminator/WhiteSpace/Comment are /tokens/.
>
>
> Cheers,
> Kenny
> --
> Web Specialist, Oupeng Browser, Beijing
> Try Oupeng: http://www.oupeng.com/

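As a reference point for the discussion quoted above, here is a minimal
sketch of what the restricted productions imply under my reading of
ES5 7.9.1. The function names (restricted, unrestricted, postfix) are
made up for illustration and are not taken from the gist or the spec;
the expectations in the comments assume an engine that follows the
spec on this point.

```javascript
// ReturnStatement : return [no LineTerminator here] Expression ;
// `a` is the restricted token, separated from `return` by a line break,
// so a semicolon is expected to be inserted before it.
function restricted(a, b) {
  return
  a + b; // unreachable if the semicolon is inserted after `return`
}

// The line break here comes after `+`, which is not followed by a
// "[no LineTerminator here]" annotation, so nothing should be inserted.
function unrestricted(a, b) {
  return a +
    b;
}

// PostfixExpression : LeftHandSideExpression [no LineTerminator here] ++
// `++` is the restricted token, so this should parse as `x; ++y;`.
function postfix() {
  let x = 1;
  let y = 1;
  x
  ++
  y;
  return [x, y];
}

const restrictedResult = restricted(1, 2);     // expected: undefined
const unrestrictedResult = unrestricted(1, 2); // expected: 3
const postfixResult = postfix();               // expected: [1, 2]

console.log(restrictedResult, unrestrictedResult, postfixResult);
```

Whether the first case is credited to rule 1 or to rule 3 depends on
whether the offending token is taken to be the LineTerminator or "a",
which is the question raised in the quoted mail; the observable result
is the same either way.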


--
Heath Matlock
+1 256 274 4225

