LR(1) grammar/parser and lookahead-restrictions
waldemar at google.com
Tue Feb 7 23:02:15 UTC 2017
On 02/07/2017 06:39, Michael Dyck wrote:
> On 17-02-06 07:32 PM, Waldemar Horwat wrote:
>> On 02/04/2017 07:20, Michael Dyck wrote:
>>> On 17-02-03 05:32 PM, Waldemar Horwat wrote:
>>>> On 02/03/2017 08:17, Michael Dyck wrote:
>>>>> On 17-02-02 06:23 PM, Waldemar Horwat wrote:
>>>>>> Lookahead restrictions fit very well into an LR(1) engine
>>>>> Again: Great, but how? E.g., do you pre-process the grammar, modify the
>>>>> construction of the automaton, and/or modify the operation of the parser?
>>>> For each state × token combination, the automaton describes what happens
>>>> when you're in state S and see a token T. The lookahead restrictions remove
>>>> possible transitions; without them there would be ambiguities where a given
>>>> state × token combination would want to do two incompatible things.
>>> Okay, so do you generate the automaton (ignoring lookahead restrictions)
>>> and then remove transitions (using lookahead restrictions)? Or do you
>>> integrate the lookahead-restrictions into the generation of the automaton?
>> It's integrated. You can't generate a valid automaton without the lookahead
> So, when you're making LR items for a production with a lookahead-restriction (l-r), do you:
> -- treat the l-r as a (nullable) symbol, so that you get an item with the dot before the l-r, and one with the dot after, or
> -- treat the l-r as occurring between two adjacent symbols, so that you get an item where the dot is "at" the l-r, or
> -- something else?
It's something else — it's directly tied into the generation of the automaton states. Each automaton state contains a collection of possible places in the expansions of grammar rules that the state can represent. Following a terminal symbol T from a state A leads to a state B. A lookahead restriction prevents state B's collection from including expansions that would have been prohibited by the lookahead restriction on T. If that generates an inconsistency (for example, if there are two ways to get to an identical state, one with a lookahead restriction and one without), the grammar validation fails.
More information about the es-discuss