Machine readable specifications

Alex Russell slightlyoff at google.com
Fri Mar 22 05:26:51 PDT 2013


I expect that what you'll hear from implementers is that parsing isn't the
hard bit of a modern JS engine -- it's certainly not the thorniest part of
Traceur, and it doesn't do _most_ of the work a JIT-ing engine would.

If you would like to concretely improve the situation, you might ask Allen
and folks from engine teams what would simplify their jobs before proposing
a solution.

On Friday, March 22, 2013, gaz Heyes wrote:

> Hi all
>
> I'd like to discuss a radical change to how JS specifications (and others)
> are constructed. I suggest that int-based rules be used to define language
> behaviour. I know this works for tracking the last state, the next state
> and expected states; other behaviour could perhaps be defined this way as
> well. The first part of the specification should define a list of ints
> that correspond to states, starting from 0, with 0 being the start state
> "Nothing". The initial states should be specified as follows:
>
> initialStates = {0:"Nothing",1:"FunctionExpression", 2:"FunctionStatement",
> 3:"FunctionStatementIdentifier" ...
>
> Once every state is defined as an int, each state can be used in a lookup
> table to determine the allowed last states and/or expected states. The
> parser also uses a lookup table in error messages, to convert the
> corresponding int to human readable form. For example, to define that
> FunctionStatement can follow Nothing (but FunctionExpression can't) and
> that FunctionStatementIdentifier is expected after FunctionStatement:
>
> rules = {
>     // FunctionStatement state
>     2: {
>         lastStates: {
>             // FunctionStatement is allowed after last state "Nothing"
>             0: true
>         },
>         expectedStates: {
>             // FunctionStatement expects the next state to be
>             // FunctionStatementIdentifier
>             3: true
>         }
>     }
> };
>
> The "lastStates" and "expectedStates" keys would also be ints; I used
> text here for clarity, and true could be shortened to 1 for compression.
> To use these rules, the parser simply checks the state machine to see what
> the last state was and whether the next state is valid, e.g.:
> lastState = 0; // Last state was Nothing
> state = 2; // FunctionStatement state
> if (rules[state] && rules[state].lastStates[lastState]) {
>   // FunctionStatement is allowed to follow Nothing
> }
>
> lastState = 0; // Last state was Nothing
> state = 1; // FunctionExpression state
> // rules[1] is not defined above, so guard the lookup to avoid a TypeError
> if (!rules[state] || !rules[state].lastStates[lastState]) {
>    // FunctionExpression is not allowed to follow Nothing
>    error();
> }
>
> The same technique could be used for expected states, since you'd just
> need to look up the next state among the expected states assigned to the
> last state.
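
A minimal sketch of how the two checks described above might be combined, including the int-to-name lookup for error messages. The names `stateNames` and `checkTransition` are hypothetical, and the rule entry for state 3 is an assumption added for illustration; only the four states listed in the message are covered, and a state without a rule entry is assumed to be allowed after nothing.

```javascript
// Hypothetical name table: only the four states listed in the message.
const stateNames = {
  0: "Nothing",
  1: "FunctionExpression",
  2: "FunctionStatement",
  3: "FunctionStatementIdentifier",
};

// The rule for state 2 comes from the message; the rule for state 3 is an
// assumption added so that a complete transition can be checked.
const rules = {
  2: { lastStates: { 0: true }, expectedStates: { 3: true } },
  3: { lastStates: { 2: true }, expectedStates: {} },
};

// Returns null when `state` may follow `lastState`, otherwise an error string.
function checkTransition(lastState, state) {
  // Expected-state check: if the last state declared expected states,
  // the new state has to be one of them.
  const previous = rules[lastState];
  if (previous) {
    const expectedIds = Object.keys(previous.expectedStates);
    if (expectedIds.length > 0 && !previous.expectedStates[state]) {
      const expected = expectedIds.map((id) => stateNames[id]).join(" or ");
      return "Expected " + expected + " after " + stateNames[lastState] +
        " but found " + stateNames[state];
    }
  }
  // Last-state check. Assumption: a state with no rule entry is not
  // allowed to follow anything.
  const rule = rules[state];
  if (!rule || !rule.lastStates[lastState]) {
    return stateNames[state] + " is not allowed after " + stateNames[lastState];
  }
  return null;
}

console.log(checkTransition(0, 2)); // null
console.log(checkTransition(0, 1)); // FunctionExpression is not allowed after Nothing
console.log(checkTransition(2, 1)); // Expected FunctionStatementIdentifier after FunctionStatement but found FunctionExpression
```

Returning a message rather than calling `error()` directly is only a convenience for the sketch; a parser could equally throw at the point of failure.
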
> Once a machine readable specification has been written, we could generate
> a human readable form of it from the rules and check that it conforms to
> what was intended. Both specifications (human readable and machine
> readable) could be used by implementers; when a specification changes,
> parsers could update automatically based on the machine readable form.
> Emulation for older parsers could be added within the browser itself by
> using the new specification rules in a parser shim.
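
As a rough illustration of generating human readable text from the rules, the sketch below walks a rule table and emits one sentence per constraint. `describeRules`, `stateNames` and the sentence wording are hypothetical; the table only repeats the FunctionStatement rule from earlier in the message.

```javascript
// Hypothetical name table covering the states listed in the message.
const stateNames = {
  0: "Nothing",
  1: "FunctionExpression",
  2: "FunctionStatement",
  3: "FunctionStatementIdentifier",
};

// Only the FunctionStatement rule given in the message.
const rules = {
  2: { lastStates: { 0: true }, expectedStates: { 3: true } },
};

// Turns each rule into plain sentences so a reader can check the intent.
function describeRules(ruleTable, names) {
  const listNames = (states) =>
    Object.keys(states).map((id) => names[id]).join(", ");
  const lines = [];
  for (const id of Object.keys(ruleTable)) {
    const rule = ruleTable[id];
    if (Object.keys(rule.lastStates).length > 0) {
      lines.push(names[id] + " may follow: " + listNames(rule.lastStates) + ".");
    }
    if (Object.keys(rule.expectedStates).length > 0) {
      lines.push(names[id] + " must be followed by: " +
        listNames(rule.expectedStates) + ".");
    }
  }
  return lines;
}

console.log(describeRules(rules, stateNames).join("\n"));
// FunctionStatement may follow: Nothing.
// FunctionStatement must be followed by: FunctionStatementIdentifier.
```
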
>
> Thoughts?
>
> Cheers
>
> Gareth
>