Fwd: Re: Directive prologue members, escapes, and retroactive syntax errors

David Flanagan david at davidflanagan.com
Wed Oct 13 15:37:51 PDT 2010


Forwarding this to the whole list... I accidentally replied just to Jeff.

-------- Original Message --------
Subject: Re: Directive prologue members, escapes, and retroactive syntax errors
Date: Wed, 13 Oct 2010 12:49:51 -0700
From: David Flanagan <david at davidflanagan.com>
To: Jeff Walden <jwalden+es at MIT.EDU>

IIRC, there was originally no Directive Prologue in the spec.  There was
just a use directive, which could be followed by any number of
comma-separated (?) keywords, e.g. "use strict, foo, whatever";.  I
assume the restriction on escape sequences was originally intended to
apply to the entire string, not just the "use strict" part of it.  I bet
that the restriction accidentally stayed with the "use strict" part of
the spec rather than moving to the Directive Prologue part when the
syntax was changed.

So you might be able to argue that this is not an incompatible change to
the spec but instead an erratum to be fixed.
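
To make the corner case concrete, here is a minimal sketch of the three
situations under discussion. It assumes an engine that implements the
ES5 text as written; isSyntaxError is a hypothetical helper introduced
only for this illustration, not anything from the spec or the test suite.

```javascript
// Hypothetical helper: true when `body` fails to parse as a function body.
function isSyntaxError(body) {
  try {
    new Function(body);
    return false;
  } catch (e) {
    return e instanceof SyntaxError;
  }
}

// Octal escape in an earlier prologue member, then "use strict":
// the retroactive error from Jeff's test case.
const retroactive = isSyntaxError('"\\145"; "use strict";');

// An escape inside the directive means it is not a Use Strict Directive,
// so the body stays non-strict and the legacy octal literal is accepted.
const escapedDirective = isSyntaxError('"use\\x20strict"; return 010;');

// A real Use Strict Directive makes the same octal literal an error.
const realDirective = isSyntaxError('"use strict"; return 010;');

console.log({ retroactive, escapedDirective, realDirective });
```

Under ES5 as written I would expect true, false, true respectively; under
the proposed change the first case would presumably stop being an error.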

	David

On 10/13/2010 12:09 PM, Jeff Walden wrote:
> The MS ES5 test suite has a test which boils down to this:
>
> function error()
> {
> "\145"; // ohai, octal escape
> "use strict"; // causes a RETROACTIVE SYNTAX ERROR
> }
>
> I've written a patch for SpiderMonkey to implement this according to the
> ES5 spec. It's ugly in that I have to dig into the scanner to propagate
> outward whether we saw an octal escape, where we never had to return
> anything but a token before (type, position, one of a very few kinds of
> type-specific data). I suspect my current approach can probably be
> improved, but even still, it doesn't seem like it should be necessary to
> add code specifically to handle this one edge case (one I would expect
> to see in a test suite, and nowhere else, ever) solely to report an error.
>
> Don't get me wrong -- this is an excellent, devious, and diabolical
> test. But with a very minimal modification to the definition of a
> Directive Prologue, it would not be necessary to specially track whether
> an octal escape has been seen in the Directive Prologue prior to a "use
> strict" directive. The current definition is this:
>
>> A Directive Prologue is the longest sequence of ExpressionStatement
>> productions occurring as the initial SourceElement productions of a
>> Program or FunctionBody and where each ExpressionStatement in the
>> sequence consists entirely of a StringLiteral token followed a
>> semicolon. The semicolon may appear explicitly or may be inserted
>> by automatic semicolon insertion. A Directive Prologue may be an
>> empty sequence.
>>
>> A Use Strict Directive is an ExpressionStatement in a Directive
>> Prologue whose StringLiteral is either the exact character
>> sequences "use strict" or 'use strict'. A Use Strict Directive may
>> not contain an EscapeSequence or LineContinuation.
>
> Suppose we added the further restriction that StringLiterals making up
> the Directive Prologue not contain an EscapeSequence or a
> LineContinuation (as is already the case for "use strict" directives).
> Then we wouldn't have to do any token look-behind (problematic for a
> streaming parser that throws source away as it constructs a parse tree)
> or octal-escape-flagging to check for retroactive syntax errors when we
> encounter a non-leading "use strict" directive. The modified text, then,
> would be something like this:
>
>> A Directive Prologue is the longest sequence of ExpressionStatement
>> productions occurring as the initial SourceElement productions of a
>> Program or FunctionBody, where each ExpressionStatement in the
>> sequence consists entirely of a StringLiteral token followed by a
>> semicolon, and each such StringLiteral token does not contain an
>> EscapeSequence or LineContinuation. The semicolon may appear
>> explicitly or may be inserted by automatic semicolon insertion. A
>> Directive Prologue may be an empty sequence.
>>
>> A Use Strict Directive is an ExpressionStatement in a Directive
>> Prologue whose StringLiteral is either the exact character
>> sequences "use strict" or 'use strict'.
>
> (Incidentally, I just noticed the spec text says "StringLiteral token
> followed a semicolon", accidentally omitting "by" -- so an erratum is
> necessary here even if we didn't adopt the above text, or some other
> change intended to smooth this rough spot.)
>
> Yes, this would be an incompatible change to how strict mode works. But
> as noted in past threads, no browsers have shipped with strict mode
> support yet, so no engines have implemented and shipped this
> restriction, and this seems like a pointlessly sharp corner case.
>
> Thoughts from other parser hackers out there?
>
> Jeff
> _______________________________________________
> es5-discuss mailing list
> es5-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es5-discuss
>
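
The difference between the two definitions quoted above can be sketched
as a toy prologue scanner. This is only an illustration under loud
assumptions: the input is an array of raw string-literal source texts
(quotes included), the function names are hypothetical, the octal check
is simplified, and none of this reflects how SpiderMonkey's scanner is
actually structured.

```javascript
// Hypothetical helper: does the raw literal contain a legacy octal
// escape such as \145? Simplified: \0 alone is not counted as octal.
function hasOctalEscape(raw) {
  for (let i = 0; i < raw.length; i++) {
    if (raw[i] !== "\\") continue;
    const next = raw[i + 1];
    const afterNext = raw[i + 2];
    if (next >= "1" && next <= "7") return true;
    if (next === "0" && afterNext >= "0" && afterNext <= "7") return true;
    i++; // skip the escaped character so "\\1" is not misread
  }
  return false;
}

function isUseStrict(raw) {
  return raw === '"use strict"' || raw === "'use strict'";
}

// Current text: every leading string literal is a prologue member, so an
// octal flag has to be carried along in case "use strict" shows up later.
function scanPrologueCurrent(rawLiterals) {
  let strict = false;
  let sawOctal = false;
  for (const raw of rawLiterals) {
    if (isUseStrict(raw)) strict = true;
    else if (hasOctalEscape(raw)) sawOctal = true;
  }
  return { strict, error: strict && sawOctal };
}

// Proposed text (as I read it): a literal containing any escape or line
// continuation ends the prologue, so no flag is needed.
function scanPrologueProposed(rawLiterals) {
  let strict = false;
  for (const raw of rawLiterals) {
    if (raw.includes("\\")) break; // prologue ends here
    if (isUseStrict(raw)) strict = true;
  }
  return { strict, error: false };
}

const testCase = ['"\\145"', '"use strict"'];
const current = scanPrologueCurrent(testCase);
const proposed = scanPrologueProposed(testCase);
console.log(current, proposed);
```

For the test case from the MS suite, the first function reports a strict
body with an error, while the second treats "use strict" as an ordinary
expression statement in a non-strict body. Octal escapes that appear
after the prologue in strict code would still need the usual check; the
sketch only covers the prologue itself.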


