Bottom-Up Parsing CS308 Compiler Theory
Bottom-Up Parsing CS308 Compiler Theory 1
Bottom-Up Parsing A bottom-up parser creates the parse tree of the given input starting from leaves towards the root. A bottom-up parser tries to find the right-most derivation of the given input in the reverse order. S→..→o(the right--most derivation of) (the bottom-up parser finds the right-most derivation in the reverse order) Bottom-up parsing is also known as shift-reduce parsing because its two main actions are shift and reduce. -At each shift action,the current symbol in the input string is pushed to a stack. At each reduction step,the symbols at the top of the stack(this symbol sequence is the right side of a production)will replaced by the non-terminal at the left side of that production. There are also two more actions:accept and error. CS308 Compiler Theory 2
Bottom-Up Parsing • A bottom-up parser creates the parse tree of the given input starting f l dh rom leaves towar ds t he root. • A bottom-up parser tries to find the right-most derivation of the given i t i th d inpu t in the reverse or der. S ⇒ ... ⇒ ω (the right-most derivation of ω ) ← (the bottom-upp g arser finds the ri ght-most derivation in the reverse order ) • Bottom-up parsing is also known as shift-reduce parsing because its two main actions are shift and reduce. – At each shift action, the current symbol in the input string is pushed to a stack. – At each reduction step, the symbols at the top of the stack (this symbol sequence is the right side of a production) roductio n ) will replaced eplaced by the non-terminal at the left side of that production. roductio n. – There are also two more actions: accept and error. CS308 Compiler Theory 2
Shift-Reduce Parsing A shift-reduce parser tries to reduce the given input string into the starting symbol. a string>the starting symbol reduced to 。 At each reduction step,a substring of the input matching to the right side of a production rule is replaced by the non-terminal at the left side of that production rule. If the substring is chosen correctly,the right most derivation of that string is created in the reverse order. Rightmost Derivation: Shift-Reduce Parser finds: CS308 Compiler Theory 3
Shift-Reduce Parsing • A shift-reduce parser tries to reduce the given input string into the starting symbol. a string Î the starting symbol reduced to • At each reduction step, a substring of the input matching to the right side of a production rule is replaced by the non production rule is replaced by the non -terminal at the left side of that production rule. terminal at the left side of that production rule. • If the substring is chosen correctly, the right most derivation of that string is created in the reverse order. Rightmost Derivation: S ⇒ ω *rm Shift-Reduce Parser finds: ω ⇐ ... ⇐ S rm rm CS308 Compiler Theory 3
Shift-Reduce Parsing--Example S→aABb input string:aaabb A->aA a aaAbb B→bBIb aAbb reduction aABb S SaaBb aAbb aaAbb aaabb Right Sentential Forms How do we know which substring to be replaced at each reduction step? CS308 Compiler Theory
Shift-Reduce Parsing -- Example S → aABb input string: aa abb A → aA | a aaAbb B → bB | b aA b b ⇓ reduction aABb S S ⇒ aABb ⇒ aA bb ⇒ aaAbb ⇒ aa abb rm rm rm rm Right Sentential Forms • How do we know which substring to be replaced at each reduction step? CS308 Compiler Theory 4
Handle Informally,a handle of a string is a substring that matches the right side of a production rule. But not every substring matches the right side of a production rule is handle A handle of a right sentential form y (=aBo)is a production rule a-→βand a position ofy where the string B may be found and replaced by A to produce the previous right-sentential form in a rightmost derivation ofy. S台0A0三0β0 rm rm If the grammar is unambiguous,then every right-sentential form of the grammar has exactly one handle. We will see that o is a string of terminals. CS308 Compiler Theory 5
Handle • Informally, a handle of a string is a substring that matches the right side of a prod ction r le of a production rule. – But not every substring matches the right side of a production rule is handle • A h dl an e of i h i lf f a right sentential form γ (≡ αβω) is a production rule A → β and a position of γ where the string where the string β may be found and replaced by A to produce may be found and replaced by A to produce the previous right-sentential form in a rightmost derivation of γ. S ⇒ αAω ⇒ αβω If th i bi th i ht t ti l f f th rm rm * • If the grammar is unambiguous, then every right-sentential form of the grammar has exactly one handle. • We will see that ω is a string of terminals CS308 Compiler Theory 5 We will see that ω is a string of terminals
Handle Pruning A right-most derivation in reverse can be obtained by handle-pruning. S=Y0品Y1Y2…量Yn-1品Yn一0in input string 。 Start from yn find a handle An>Bn in Yn> and replace Bn in by An to get Yn-1. 。Then find a handle An-.l→pn-I in Yn-l and replace Bn-i in by An-i to get Yn-2 Repeat this,until we reach S. CS308 Compiler Theory 6
Handle Pruning • A right-most derivation in reverse can be obtained by handle-pruning. S=γ0 ⇒ γ1 ⇒ γ2 ⇒ ... ⇒ γn-1 ⇒ γ rm rm rm rm rm n=ω input string • Start from γn, find a handle An→βn in γn, and replace βn in by An to get γn-1. • Then find a handle An-1→βn-1 in γn-1, and replace βn-1 in by An-1 to get γn-2. • Repeat this, until we reach S. CS308 Compiler Theory 6
A Shift-Reduce Parser E->E+TT Right-Most Derivation of id+id*id T→T*F|F E→E+T→E+T*F→E+T*id→E+F*id F→(E)|id →E+id*id→T+id*id→F+id*id→id+id*id Right-Most Sentential Form Reducing Production id+id*id F→id F+id*id T→F T+id*id E→T E+id*id F→id E+F*id T→F E+T*id F→id E+T*E T→T*F E+T E>E+T E Handles are red and underlined in the right-sentential forms. CS308 Compiler Theory 7
A Shift-Reduce Parser E → E+T | T Right-Most Derivation of id+id*id T → T*F | F T*F | F E ⇒ E+T ⇒ E+T*F ⇒ E+T*id ⇒ E+F*id F → (E) | id ⇒ E+id*id ⇒ T+id*id ⇒ F+id*id ⇒ id+id*id Right-Most Sentential Form Reducing Production id+id*id F → id F+id id * T → F T+id*id E → T E+id*id F → id E+ F*id T → F E+T*id F → id E+ T F* F T → T F* F E+T E → E+T E d d d li d i h i h i lf CS308 Compiler Theory 7 Handles are re d an d un derline d in t he rig ht-sententia l forms
A Stack Implementation of A Shift-Reduce Parser There are four possible actions of a shift-parser action: 1.Shift:The next input symbol is shifted onto the top of the stack. 2.Reduce:Replace the handle on the top of the stack by the non- terminal. 3.Accept:Successful completion of parsing. 4.Error:Parser discovers a syntax error,and calls an error recovery routine. Initial stack just contains only the end-marker $ The end of the input string is marked by the end-marker $ CS308 Compiler Theory
A Stack Implementation of A Shift-Reduce Parser • There are four possible actions of a shift-parser action: 1. Shift : The next input symbol is shifted onto the top of the stack. 2. Reduce: Replace the handle on the top of the stack by the nonterminal. 3. Accept: Successful completion of parsing. 4. Error: Parser discovers a syntax error, and calls an error recovery rout ine. • Initial stack just contains only the end-marker $. • The end of the in put strin g is marked b y the en d-marker $. CS308 Compiler Theory 8 pg y
A Stack Implementation of A Shift-Reduce Parser Stack Input Action $ id+id*id$shift Sid +id*id$ reduce by F→id Parse Tree SF +id*id$ reduce by T→F ST +id*id$ reduce by E→T E8 SE +id*id$ shift $E+ id*id$ shift E 3 T7 SE+id *id$ reduce by F→id SE+F *id$ reduce by T→F SE+T *id$ shift $E+T* id$ shift 1 id SE+T*id $ reduce by F→id SE+T*F $ reduce by T→T*F id SE+T $ reduce by E→E+T SE $ accept CS308 Compiler Theory
A Stack Implementation of A Shift-Reduce Parser Stack Input Action $ id+id*id$ shift $id +id*id$ reduce by F → id Parse Tree $ F +id*id$ reduce by T → F $ T +id*id$ reduce by E → T E 8 $E +id*id$ shift $E+ id id$ * shift E 3 + T 7 $E+id *id$ reduce by F → id $E+ F *id$ reduce by T → F T 2 T 5 * F 6 $E+T *id$ shift $E+T* id$ shift F 1 F 4 id $E+T*id $ reduce b y F → id $E+T*F $ reduce by T → T*F id id $E+T $ reduce by E → E+T $E $ t CS308 Compiler Theory 9 accep t
Conflicts During Shift-Reduce Parsing There are context-free grammars for which shift-reduce parsers cannot be used. Stack contents and the next input symbol may not decide action: shift/reduce conflict:Whether make a shift operation or a reduction. -reduce/reduce conflict:The parser cannot decide which of several reductions to make. If a shift-reduce parser cannot be used for a grammar,that grammar is called as non-LR(k grammar. left to right right-most k lookhead scanning derivation An ambiguous grammar can never be a LR grammar. CS308 Compiler Theory 10
Conflicts During Shift-Reduce Parsing • There are context-free grammars for which shift-reduce parsers cannot b d e use d. • Stack contents and the next input symbol may not decide action: – shift/reduce conflict: Whether make a shift operation or a reduction. – reduce/reduce conflict: The parser cannot decide which of several red ti t k ductions to ma ke. • If a shift-reduce parser cannot be used for a grammar, that grammar is called as non called as non -LR(k) grammar LR(k) grammar. left to ri ght ri ght-most k lookhea d scanning derivation • An ambiguous grammar can never be a LR grammar CS308 Compiler Theory 10 An ambiguous grammar can never be a LR grammar