
Introduction to Bottom-Up Parsing

Lecture Notes by Profs. Alex Aiken and George Necula (UCB)

Outline

  The strategy: shift-reduce parsing
  A key concept: handles
  Ambiguity and precedence declarations

CS780(Prasad)

L12BUP


Predictive Parsing Summary

First and Follow sets are used to construct predictive tables:
  For non-terminal A and input t, use a production A → α where t ∈ First(α).
  For non-terminal A and input t, if ε ∈ First(A) and t ∈ Follow(A), use a production A → α where ε ∈ First(α).

Bottom-Up Parsing

Bottom-up parsing is more general than top-down parsing:
  It does not need left-factored grammars.
  Left recursion is fine.
  It is just as efficient.
  It builds on ideas in top-down parsing.

Bottom-up parsing is the preferred method in practice.
  Automatic parser generators: YACC, Bison, ...

An Introductory Example

Revert to the natural grammar for our example:

  E → T + E | T
  T → int * T | int | (E)

Consider the string:

  int * int + int

The Idea

Bottom-up parsing reduces a string to the start symbol by inverting productions:

  int * int + int    T → int
  int * T + int      T → int * T
  T + int            T → int
  T + T              E → T
  T + E              E → T + E
  E

Observation

Read the sequence of productions in reverse (from bottom to top). This is a rightmost derivation!

  int * int + int    T → int
  int * T + int      T → int * T
  T + int            T → int
  T + T              E → T
  T + E              E → T + E
  E

Important Fact #1

Important Fact #1 about bottom-up parsing: A bottom-up parser traces a rightmost derivation in reverse (LR parser).
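The observation can be checked mechanically on the running example. The following is a minimal Python sketch, not part of the original notes: it takes the five reductions in the order the parse applies them, reverses that order, and expands the rightmost non-terminal at each step. The names `REDUCTIONS`, `NONTERMINALS`, and `rightmost_derivation` are illustrative assumptions.

```python
# Reductions in the order the bottom-up parse applies them (from the notes).
REDUCTIONS = [
    ("T", ["int"]),
    ("T", ["int", "*", "T"]),
    ("T", ["int"]),
    ("E", ["T"]),
    ("E", ["T", "+", "E"]),
]
NONTERMINALS = {"E", "T"}


def rightmost_derivation(start, productions):
    """Expand the rightmost non-terminal with each production in turn."""
    form = [start]
    forms = [" ".join(form)]
    for lhs, rhs in productions:
        # Index of the rightmost non-terminal in the current sentential form.
        idx = max(i for i, sym in enumerate(form) if sym in NONTERMINALS)
        if form[idx] != lhs:
            raise ValueError(f"rightmost non-terminal is {form[idx]}, not {lhs}")
        form = form[:idx] + rhs + form[idx + 1:]
        forms.append(" ".join(form))
    return forms


# Reading the reductions bottom to top yields a rightmost derivation.
derivation = rightmost_derivation("E", list(reversed(REDUCTIONS)))
for sentential_form in derivation:
    print(sentential_form)
```

If any production in the reversed list did not rewrite the rightmost non-terminal, the function would raise `ValueError`; on this example it runs through and ends at the input string.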


A Bottom-up Parse

  int * int + int
  int * T + int
  T + int
  T + T
  T + E
  E

[Figure: parse tree for int * int + int]

A Bottom-up Parse in Detail (1)-(6)

Each slide adds one sentential form and extends the partial parse tree:

  (1) int * int + int
  (2) int * T + int
  (3) T + int
  (4) T + T
  (5) T + E
  (6) E

[Figure: partial parse tree for int * int + int, extended at each step]

A Trivial Bottom-Up Parsing Algorithm

  Let I = input string
  repeat
    pick a non-empty substring β of I where X → β is a production
    if no such β, backtrack
    replace one β by X in I
  until I = "S" (the start symbol) or all possibilities are exhausted
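A minimal Python sketch of the trivial algorithm, assuming the example grammar E → T + E | T, T → int * T | int | (E). The names `PRODUCTIONS` and `trivial_parse` are illustrative, and the `seen` set (which avoids re-exploring forms already known to be dead ends) is an addition that the pseudocode does not mention.

```python
# Example grammar from the notes, as (lhs, rhs) pairs over token tuples.
PRODUCTIONS = [
    ("E", ("T", "+", "E")),
    ("E", ("T",)),
    ("T", ("int", "*", "T")),
    ("T", ("int",)),
    ("T", ("(", "E", ")")),
]


def trivial_parse(tokens, start="E"):
    """Return the list of sentential forms from the input to `start`, or None."""
    seen = set()  # forms already explored; revisiting them cannot help

    def search(form):
        if form == (start,):
            return [form]
        if form in seen:
            return None
        seen.add(form)
        for lhs, rhs in PRODUCTIONS:
            n = len(rhs)
            # Try every position where the right-hand side occurs.
            for i in range(len(form) - n + 1):
                if form[i:i + n] == rhs:
                    rest = search(form[:i] + (lhs,) + form[i + n:])
                    if rest is not None:
                        return [form] + rest
        return None  # no substring worked: backtrack

    return search(tuple(tokens))


steps = trivial_parse("int * int + int".split())
for form in steps:
    print(" ".join(form))
```

The search tries substrings in an arbitrary order, so the sequence it finds need not match the one shown in the slides; it only guarantees that each step inverts one production.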


Questions

  Does this algorithm terminate?
  How fast is the algorithm?
  Does the algorithm deal with all cases?
  How do we choose the substring to reduce at each step?

Where Do Reductions Happen

Important Fact #1 has an interesting consequence:
  Let αβω be a step of a bottom-up parse.
  Assume the next reduction is by X → β.
  Then ω is a string of terminals.

Why? Because αXω → αβω is a step in a rightmost derivation.

Notation

Idea: Split the string into two substrings.
  The right substring is as yet unexamined by the parser (hence it is a string of terminals).
  The left substring has terminals and non-terminals.

The dividing point is marked by a |.
  The | is not part of the string.

Initially, all input is unexamined:
  |x1 x2 . . . xn

Shift-Reduce Parsing

Bottom-up parsing uses only two kinds of actions:

Shift: Move | one place to the right. This shifts a terminal to the left string.
  ABC|xyz ⇒ ABCx|yz

Reduce: Apply an inverse production at the right end of the left string. If A → xy is a production, then
  Cbxy|ijk ⇒ CbA|ijk
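The two actions can be written as small functions on a (left, right) pair, where the | sits between the two lists. This is a sketch under the assumption that symbols are single list elements; the names `shift` and `reduce_by` are illustrative and do not come from the notes.

```python
def shift(left, right):
    """Move | one place to the right: the next terminal joins the left string."""
    if not right:
        raise ValueError("nothing left to shift")
    return left + right[:1], right[1:]


def reduce_by(left, right, lhs, rhs):
    """Replace rhs at the right end of the left string by lhs."""
    cut = len(left) - len(rhs)
    if cut < 0 or left[cut:] != rhs:
        raise ValueError("rhs is not at the right end of the left string")
    return left[:cut] + [lhs], right


# ABC|xyz  =>  ABCx|yz
after_shift = shift(list("ABC"), list("xyz"))

# Cbxy|ijk  =>  CbA|ijk, using the production A -> xy
after_reduce = reduce_by(list("Cbxy"), list("ijk"), "A", list("xy"))

print("".join(after_shift[0]) + "|" + "".join(after_shift[1]))
print("".join(after_reduce[0]) + "|" + "".join(after_reduce[1]))
```

Neither function ever touches the right string except to remove its first symbol, which mirrors the point that the unexamined input stays a string of terminals.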

The Example with Reductions Only

  int * int | + int    reduce T → int
  int * T | + int      reduce T → int * T
  T + int |            reduce T → int
  T + T |              reduce E → T
  T + E |              reduce E → T + E
  E |

The Example with Shift-Reduce Parsing

  |int * int + int     shift
  int | * int + int    shift
  int * | int + int    shift
  int * int | + int    reduce T → int
  int * T | + int      reduce T → int * T
  T | + int            shift
  T + | int            shift
  T + int |            reduce T → int
  T + T |              reduce E → T
  T + E |              reduce E → T + E
  E |

A Shift-Reduce Parse in Detail (1)-(11)

Each slide adds one configuration and extends the partial parse tree:

  (1)  |int * int + int
  (2)  int | * int + int
  (3)  int * | int + int
  (4)  int * int | + int
  (5)  int * T | + int
  (6)  T | + int
  (7)  T + | int
  (8)  T + int |
  (9)  T + T |
  (10) T + E |
  (11) E |

[Figure: partial parse tree for int * int + int, extended at each reduction]

The Stack

The left string can be implemented by a stack.
  The top of the stack is the |.

Shift pushes a terminal on the stack.
Reduce pops 0 or more symbols off the stack (production rhs) and pushes a non-terminal on the stack (production lhs).
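As an illustration of the stack view, the sketch below replays the action sequence of the running example on a Python list used as the stack. The action encoding (the string "shift", or an (lhs, rhs) pair for a reduce) and the names `ACTIONS` and `replay` are assumptions made for this example; the sketch does not decide which action to take, it only executes a given sequence.

```python
# Action sequence for int * int + int, copied from the example trace.
ACTIONS = [
    "shift",
    "shift",
    "shift",
    ("T", ["int"]),
    ("T", ["int", "*", "T"]),
    "shift",
    "shift",
    ("T", ["int"]),
    ("E", ["T"]),
    ("E", ["T", "+", "E"]),
]


def replay(tokens, actions):
    """Execute shift/reduce actions on a stack and record each configuration."""
    stack, pos, trace = [], 0, []
    for action in actions:
        if action == "shift":
            stack.append(tokens[pos])  # push the next terminal
            pos += 1
        else:
            lhs, rhs = action
            cut = len(stack) - len(rhs)
            if cut < 0 or stack[cut:] != rhs:
                raise ValueError(f"{rhs} is not on top of the stack")
            del stack[cut:]            # pop the production's right-hand side
            stack.append(lhs)          # push the production's left-hand side
        trace.append((" ".join(stack) + " | " + " ".join(tokens[pos:])).strip())
    return trace


trace = replay("int * int + int".split(), ACTIONS)
for configuration in trace:
    print(configuration)
```

Every reduce inspects only the top of the stack, so a list with `append` and slice deletion is enough; nothing below the matched right-hand side is ever read.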


Shift-Reduce Parser

[Figure: shift-reduce parser; labels: Stack, Current Symbol, Parser Engine, Parser Action]

Key Issue

How do we decide when to shift or reduce?
  Consider the step int | * int + int.
  We could reduce by T → int, giving T | * int + int.
  A fatal mistake: there is no way to reduce to the start symbol E.

  E → T + E | T
  T → int * T | int | (E)
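The claim that reducing int too early is fatal can be checked by brute force. The sketch below, which assumes the example grammar E → T + E | T, T → int * T | int | (E), searches all ways of inverting productions and reports whether a form can still reach the start symbol. The names `PRODUCTIONS`, `START`, and `reducible` are illustrative, and exhaustive search is used here only for checking; it is not how a real parser decides.

```python
from functools import lru_cache

START = "E"
PRODUCTIONS = (
    ("E", ("T", "+", "E")),
    ("E", ("T",)),
    ("T", ("int", "*", "T")),
    ("T", ("int",)),
    ("T", ("(", "E", ")")),
)


@lru_cache(maxsize=None)
def reducible(form):
    """True if some sequence of reductions turns `form` into the start symbol."""
    if form == (START,):
        return True
    for lhs, rhs in PRODUCTIONS:
        n = len(rhs)
        for i in range(len(form) - n + 1):
            if form[i:i + n] == rhs and reducible(form[:i] + (lhs,) + form[i + n:]):
                return True
    return False


# The original input can be reduced to E ...
print(reducible(("int", "*", "int", "+", "int")))
# ... but after the premature reduction T -> int it no longer can.
print(reducible(("T", "*", "int", "+", "int")))
```

The second form is stuck because the only production containing * is T → int * T, and the symbol to the left of * is no longer int.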

Handles

Intuition: Want to reduce only if the result can still be reduced to the start symbol.

Assume a rightmost derivation:
  S ⇒* αXω ⇒ αβω
Then β, at the position following α, is a handle of αβω.

Handles (Cont.)

A handle is a string that can be reduced, and that also allows further reductions back to the start symbol.
We only want to reduce at handles.
Note: We have said what a handle is, not how to find handles.

Important Fact #2

Important Fact #2 about bottom-up parsing: In shift-reduce parsing, handles appear only at the top of the stack, never inside.

Why?

Informal induction on the number of reduce moves:
  True initially, when the stack is empty.
  Immediately after reducing a handle:
    The rightmost non-terminal is on top of the stack.
    The next handle must be to the right of the rightmost non-terminal, because this is a rightmost derivation.
    A sequence of shift moves reaches the next handle.

Summary of Handles

In shift-reduce parsing, handles always appear at the top of the stack.
Handles are never to the left of the rightmost non-terminal.
  Therefore, shift-reduce moves are sufficient; the | need never move left.
Bottom-up parsing algorithms are based on recognizing handles.

Conflicts

Generic shift-reduce strategy:
  If there is a handle on top of the stack, reduce.
  Otherwise, shift.

But what if there is a choice?
  If it is legal to shift or reduce, there is a shift-reduce conflict.
  If it is legal to reduce by two different productions, there is a reduce-reduce conflict.

Source of Conflicts

Ambiguous grammars always cause conflicts. But beware, so do many non-ambiguous grammars.

Consider our favorite ambiguous grammar:

  E → E + E
    | E * E
    | (E)
    | int

One Shift-Reduce Parse

  |int * int + int    shift
  ...                 ...
  E * E | + int       reduce E → E * E
  E | + int           shift
  E + | int           shift
  E + int |           reduce E → int
  E + E |             reduce E → E + E
  E |

Another Shift-Reduce Parse

  |int * int + int    shift
  ...                 ...
  E * E | + int       shift
  E * E + | int       shift
  E * E + int |       reduce E → int
  E * E + E |         reduce E → E + E
  E * E |             reduce E → E * E
  E |

Example Notes

In the second step, E * E | + int, we can either shift or reduce by E → E * E.
  The choice determines the associativity and precedence of + and *.
As noted previously, the grammar can be rewritten to enforce precedence.
Precedence declarations are an alternative.
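The choice at E * E | + int can be made visible with a small Python sketch that lists the moves a naive parser could try at a configuration. It assumes the ambiguous grammar E → E + E | E * E | (E) | int, and it uses a crude test: shift whenever input remains, reduce whenever a right-hand side matches the top of the stack. Real LR parsers use a more precise notion of a legal move; the names `AMBIGUOUS` and `candidate_moves` are illustrative.

```python
AMBIGUOUS = [
    ("E", ["E", "+", "E"]),
    ("E", ["E", "*", "E"]),
    ("E", ["(", "E", ")"]),
    ("E", ["int"]),
]


def candidate_moves(stack, rest, productions=AMBIGUOUS):
    """List the moves that are possible at the configuration stack | rest."""
    moves = []
    if rest:
        moves.append("shift")
    for lhs, rhs in productions:
        cut = len(stack) - len(rhs)
        # A reduce is a candidate when rhs sits on top of the stack.
        if cut >= 0 and stack[cut:] == rhs:
            moves.append(f"reduce {lhs} -> {' '.join(rhs)}")
    return moves


# Configuration E * E | + int from the example.
moves = candidate_moves(["E", "*", "E"], ["+", "int"])
print(moves)
if "shift" in moves and len(moves) > 1:
    print("shift-reduce conflict")
```

Two or more reduce entries in the returned list would correspond to a reduce-reduce conflict under the same crude test.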


Precedence Declarations Revisited

Precedence declarations cause shift-reduce parsers to resolve conflicts in certain ways.
Declaring that * has greater precedence than + causes the parser to reduce at E * E | + int.
More precisely, the precedence declaration is used to resolve the conflict between reducing a * and shifting a +.
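A minimal sketch of how a precedence table can settle the conflict, assuming the ambiguous grammar restricted to E → E + E | E * E | int, well-formed input, and left-associative operators. The table `PRECEDENCE` and the function `parse` are illustrative names and do not reproduce YACC or Bison syntax. To keep the sketch short, an int leaf stands for an already reduced E, so the reduction E → int is implicit.

```python
# Higher number binds tighter; both operators are assumed left-associative.
PRECEDENCE = {"+": 1, "*": 2}


def parse(tokens):
    """Shift-reduce parse that resolves conflicts with PRECEDENCE.

    Returns a nested tuple (op, left, right); an int leaf is the string "int".
    """
    stack, rest = [], list(tokens)
    while True:
        # E op E on top of the stack: a reduce is possible.
        can_reduce = len(stack) >= 3 and stack[-2] in PRECEDENCE
        lookahead = rest[0] if rest else None
        if can_reduce and (
            lookahead is None
            or PRECEDENCE[stack[-2]] >= PRECEDENCE.get(lookahead, 0)
        ):
            # Operator on the stack wins (or ties): reduce E -> E op E.
            right, op, left = stack.pop(), stack.pop(), stack.pop()
            stack.append((op, left, right))
        elif rest:
            # Lookahead operator binds tighter, or nothing to reduce: shift.
            stack.append(rest.pop(0))
        else:
            break
    if len(stack) != 1:
        raise ValueError("input is not a single expression")
    return stack[0]


tree_a = parse("int * int + int".split())
tree_b = parse("int + int * int".split())
print(tree_a)
print(tree_b)
```

At E * E | + int the stack operator * outranks the lookahead +, so the sketch reduces; at E + E | * int the lookahead outranks the stack operator, so it shifts. Treating equal precedence as "reduce" is what makes the operators left-associative here.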