[rust-dev] LL(1) problems

Patrick Walton pwalton at mozilla.com
Wed Apr 24 19:06:42 PDT 2013


I've refactored the Rust grammar to get it as close to LL(1) as I could. 
I made some minor language changes along the way which should not break 
code, but there are a few things that are still preventing me from 
making it LL(1):

1. Explicit self and patterns-as-arguments require two tokens of 
lookahead to find the `self` (or `mut` or lifetime).

     fn foo(&self, ...) vs. fn foo(&a: &int) { ... }

This is because arguments are patterns, and `&a` is a valid pattern.

2. There is a production called "maybe-named argument", which is part of 
the trait declaration form. (It also shows up in function types.) This 
allows you to write:

     trait Foo {
         fn bar(&int, &int);
     }

Instead of:

     trait Foo {
         fn bar(x: &int, y: &int);
     }

However, because arguments can be patterns, lookahead is required to 
distinguish between that and this:

     trait Foo {
         fn bar(&x: &int, &y: int);
     }

I believe this can actually be *unbounded* lookahead. Consider:

     trait Foo {
         fn bar(&&&&&&&&&&x: &&&&&&&&&&int);
     }

Versus:

     trait Foo {
         fn bar(&&&&&&&&&&int);
     }

This has a relatively straightforward fix though: just restrict the 
argument name to be an identifier. There is little need for the pattern 
form here. This reduces the complexity to LL(2); I haven't investigated 
whether it can be reduced further.

3. `unsafe` blocks and `unsafe` function declarations are both allowed 
inside blocks, and therefore two tokens of lookahead are required.

     fn foo() {
         unsafe fn bar() { ... }
     }

Versus:

     fn foo() {
         unsafe { ... }
     }

The parser needs to look ahead two tokens to find the `fn` or `{` keyword.

These were the only warnings that the Python yapps2 module emitted when 
compiling the Rust grammar, after suitable refactoring. My general 
feeling is that, given how valuable explicit self and argument patterns 
are, LL(2) is okay, especially if we turn out to be LALR(1). Others may 
feel differently, however; thoughts?

Patrick


More information about the Rust-dev mailing list