Nested Quasis

Mark S. Miller erights at google.com
Thu Feb 2 11:03:18 PST 2012


On Thu, Feb 2, 2012 at 5:09 AM, Douglas Crockford <douglas at crockford.com> wrote:

> On 11:59 AM, Waldemar Horwat wrote:
>
>> On 02/01/2012 11:35 AM, Allen Wirfs-Brock wrote:
>> Here's one which I couldn't express in a lexer grammar: How to restart
>> the quasi after an included expression is over.
>>
>
> If quasis are not nested, then the lexical rule is really simple: Just
> match the `s, and within the literal, match the {}s.
>
> I would prefer to keep it simple, unless there is a compelling requirement
> to provide nesting. If we do the simple version now, we could allow the
> nested case in the future.
>

When we came up with this "simplification", I thought I could live with it.
Now, having tried to write some examples within these restrictions, I find
it unusable.

I think we're overestimating the parsing difficulty. I'll let Mike speak
for the real plan. But I'd like to explain what I do in E, so that we can
see that none of this need be complicated. It does involve an interaction
between the parsing and lexing levels, but one much less complex than you
may expect, and comparable to (IMO simpler than) the existing unclean
interaction that JS already has:

The lexing grammar has four new token types:

    QuasiOnly ::

        ` QuasiChar* `

    QuasiOpen ::

        ` QuasiChar* $

    QuasiMiddle ::

        QuasiChar* $

    QuasiEnd ::

        QuasiChar* `

Parsing grammar:

    quasiExpr :

        Identifier? quasiExprLiteral

    quasiExprLiteral :

        QuasiOnly

        QuasiOpen quasiHole (QuasiMiddle quasiHole)* QuasiEnd

    quasiHole :

        Identifier

        curlyBalancedTokenSequence

    curlyBalancedTokenSequence :

        { expr }

The key thing is that the curlyBalancedTokenSequence starts a normal
lexical expression context and counts curlies. When it sees a "}" *token*
that matches its opening "{", the curlyBalancedTokenSequence is done, and
we resume lexing QuasiChar* until we've lexed a QuasiMiddle or QuasiEnd.
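
As a rough illustration of the curly counting, here is a minimal Python
sketch of such a lexer. It is not the E implementation. The names
lex_quasi, lex_hole, and TOKEN are hypothetical, the "normal" token grammar
is a toy stand-in for the real one, QuasiChar escapes are ignored, and the
"$" that introduces a hole is consumed rather than kept in the token text.

```python
import re

# Toy token grammar for the normal expression context (an assumption made
# for illustration; a real lexer would use the language's full token rules).
TOKEN = re.compile(r"""
    \s+                                 # whitespace, skipped
  | (?P<String>"(?:\\.|[^"\\])*")       # string literal, may contain curlies
  | (?P<Identifier>[A-Za-z_]\w*)
  | (?P<Number>\d+)
  | (?P<Punct>[{}()\[\]+\-*/.,;<>=!?:])
""", re.VERBOSE)
IDENT = re.compile(r"[A-Za-z_]\w*")


def lex_quasi(src, pos=0):
    """Lex one quasi literal starting at src[pos] == '`'.

    Returns (tokens, next_pos), where tokens is a flat list of (kind, text).
    """
    assert src[pos] == "`"
    tokens, pos, seen_hole = [], pos + 1, False
    while True:
        start = pos
        # Scan QuasiChar* up to the next hole or the closing backquote.
        while pos < len(src) and src[pos] not in "`$":
            pos += 1
        if pos >= len(src):
            raise SyntaxError("unterminated quasi literal")
        text = src[start:pos]
        if src[pos] == "`":
            tokens.append(("QuasiEnd" if seen_hole else "QuasiOnly", text))
            return tokens, pos + 1
        tokens.append(("QuasiMiddle" if seen_hole else "QuasiOpen", text))
        seen_hole = True
        pos = lex_hole(src, pos + 1, tokens)


def lex_hole(src, pos, tokens):
    """Lex the hole after '$': an Identifier or a curly-balanced token run."""
    m = IDENT.match(src, pos)
    if m:
        tokens.append(("Identifier", m.group()))
        return m.end()
    if pos >= len(src) or src[pos] != "{":
        raise SyntaxError("expected identifier or '{' after '$'")
    depth = 0
    while pos < len(src):
        if src[pos] == "`":             # a nested quasi inside the hole
            inner, pos = lex_quasi(src, pos)
            tokens.extend(inner)
            continue
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        pos = m.end()
        if m.lastgroup is None:         # whitespace matched, emit nothing
            continue
        tokens.append((m.lastgroup, m.group()))
        if m.lastgroup == "Punct" and m.group() == "{":
            depth += 1
        elif m.lastgroup == "Punct" and m.group() == "}":
            depth -= 1
            if depth == 0:
                return pos              # the matching "}" token ends the hole
    raise SyntaxError("unterminated hole")
```

Because only "}" *tokens* are counted, a "}" inside a string literal in
the hole does not close it, and a backquote inside the hole simply starts
a nested quasi instead of ending the outer one.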

Of course, if you don't need to keep your parser and lexer so strongly
separated, you can just use the above grammar directly as a one-level
grammar, where you use the full expression parser after the "{". This is
what I did the first time in E. Either way works. The reason I changed to
the looser coupling is so that I could fully lex a program that didn't
parse, so I could give more informative error messages.
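
To make the looser coupling concrete, here is a minimal Python sketch of
the parsing side working over a token list that has already been fully
lexed. It is an illustration under assumptions, not the E parser: the
function names, the (kind, text) token tuples, and the error messages are
all hypothetical, and a hole's contents are returned as raw tokens rather
than handed to a real expression parser.

```python
def parse_quasi_literal(tokens, i=0):
    """quasiExprLiteral :
           QuasiOnly
           QuasiOpen quasiHole (QuasiMiddle quasiHole)* QuasiEnd

    Returns ((literal_parts, holes), next_index).
    """
    kind, text = tokens[i]
    if kind == "QuasiOnly":
        return ([text], []), i + 1
    if kind != "QuasiOpen":
        raise SyntaxError(f"token {i}: expected a quasi literal, got {kind}")
    parts, holes = [text], []
    i += 1
    while True:
        hole, i = parse_hole(tokens, i)
        holes.append(hole)
        if i >= len(tokens):
            raise SyntaxError("quasi literal not closed: ran out of tokens")
        kind, text = tokens[i]
        if kind not in ("QuasiMiddle", "QuasiEnd"):
            raise SyntaxError(
                f"token {i}: expected QuasiMiddle or QuasiEnd, got {kind}")
        parts.append(text)
        i += 1
        if kind == "QuasiEnd":
            return (parts, holes), i


def parse_hole(tokens, i):
    """quasiHole : Identifier | curlyBalancedTokenSequence"""
    if i >= len(tokens):
        raise SyntaxError("expected a hole, but ran out of tokens")
    if tokens[i][0] == "Identifier":
        return [tokens[i]], i + 1
    if tokens[i] != ("Punct", "{"):
        raise SyntaxError(f"token {i}: expected Identifier or '{{'")
    depth, start = 0, i
    while i < len(tokens):
        if tokens[i] == ("Punct", "{"):
            depth += 1
        elif tokens[i] == ("Punct", "}"):
            depth -= 1
            if depth == 0:
                # Tokens strictly between the matching curlies form the hole.
                return tokens[start + 1:i], i + 1
        i += 1
    raise SyntaxError(f"token {start}: unbalanced '{{' in hole")
```

Since the whole input is tokenized before this runs, a parse failure can
point at a specific token index, and the remaining tokens are still
available for building a more informative message.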

-- 
    Cheers,
    --MarkM