• Have any questions?
• info.zbook.org@gmail.com

Operator Precedence Parsing - MONTEFIORE

9d ago
14 Views
4.44 MB
26 Pages
Last View : 1d ago
Share:
Transcription

Operator precedence parsingBottom-up parsing methods that follow the idea of shift-reduceparsersSeveral flavors: operator, simple, and weak precedence.In this course, only weak precedenceMain di erences with respect to LR parsers:IIISyntax analysisThere is no explicit state associated to the parser (and thus no statepushed on the stack)The decision of whether to shift or reduce is taken based solely on thesymbol on the top of the stack and the next input symbol (and storedin a shift-reduce table)In case of reduction, the handle is the longest sequence at the top ofstack matching the RHS of a rule194

Structure of the weak precedence parserinputa1aian stackXm1XmWeak precedence parsingoutputX2X1Shift-reduce tableterminals,nonterminals and terminals and (Ã modifier)Syntax analysisShift/Reduce/Error195

Weak precedence parsing algorithmCreate a stack with the special symbol a getnexttoken()while (True)if (Stack S and a )break // Parsing is overXm top(Stack)if (SRT [Xm , a] shift)Push a onto the stacka getnexttoken()elseif (SRT [Xm , a] reduce)Search for the longest RHS that matches the top of the stackif no match foundcall error-recovery routineLet denote this rule by Y ! Xm r 1 . . . XmPop r elements o the stackPush Y onto the stackOutput Y ! Xm r 1 . . . Xmelse call error-recovery routineSyntax analysis196

Example for the expression grammarExample:Shift/reduce tableEETTFF!E T!T!T F!F! (E )! idSyntax analysisETF ()id SR SRR()SRRSSSRRRRidSSSRRS RRRRRS197

Example of parsingStack id F T E E E id E F E T E T E T id E T F E T ESyntax analysisid id id id id ididInput id id id id id id id id id id eShiftShiftReduceReduceReduceAcceptby F ! idby T ! Fby E ! Tby F ! idby T ! Fby F ! idby T ! T Fby E ! E T198

Precedence relation: principleWe define the (weak precedence) relations l and m betweensymbols of the grammar (terminals or nonterminals)IIX l Y if XY appears in the RHS of a rule or if X precedes areducible word whose leftmost symbol is YX m Y if X is the rightmost symbol of a reducible word and Y thesymbol immediately following that wordShift when Xm l a, reduce when Xm m aReducing changes the precedence relation only at the top of thestack (there is thus no need to shift backward)Syntax analysis199

Precedence relation: formal definitionLet G (V , , R, S) be a context-free grammar and a newsymbol acting as left and right end-marker for the input word.Define V 0 V [ { }The weak precedence relations l and m are defined respectively onV 0 V and V V 0 as follows: 1. X l Y if A ! XB2. X l Y if A ! XY 3. l X if S ) X 4. X m a if A ! B 5. X m if S ) Xis in R, and B ) Y ,is in R is in R, and B ) X and )afor some , , , and BSyntax analysis200

Construction of the SR table: shiftShift relation, l:Initialize S to the empty set.add l S to Sfor each production X ! L1 L2 . . . Lkfor i 1 to k 1add Li l Li 1 to S3 repeatfor each pair X l Y in Sfor each production Y ! L1 L2 . . . LkAdd X l L1 to Suntil S did not change in this iteration.12 We only need to consider the pairs X l Y with Y a nonterminal that were added inS at the previous iterationSyntax analysis201

Example of the expression grammar: shiftStep 1Step 2EETTFF!E T!T!T F!F! (E )! idStep 3.1Step 3.2Step 3.3Syntax analysisSl E l lTT l lF(lEE l) lF l id l((lT l id l((lF(l((lid202

Construction of the SR table: reduceReduce relation, m:Initialize R to the empty set.add S m to Rfor each production X ! L1 L2 . . . Lkfor each pair X l Y in Sadd Lk m Y in R3 repeatfor each pair X m Y in Rfor each production X ! L1 L2 . . . LkAdd Lk m Y to Runtil R did not change in this iteration.12 We only need to consider the pairs X m Y with X a nonterminal that were added inR at the previous iteration.Syntax analysis203

Example of the expression grammar: reduceStep 1Step 2EETTFF!E T!T!T F!F! (E )! idStep 3.1Step 3.2Step 3.3Syntax analysisE m T m F m T m)T m F m id m )m F m)F m id m )m )m)id m )m 204

Weak precedence grammarsWeak precedence grammars are those that can be analysed by aweak precedence parser.A grammar G (V , , R, S) is called a weak precedence grammarif it satisfies the following conditions:1. There exist no pair of productions with the same right hand side2. There are no empty right hand sides (A ! )3. There is at most one weak precedence relation between any twosymbols4. Whenever there are two syntactic rules of the form A ! X andB ! , we don’t have X l BConditions 1 and 2 are easy to checkConditions 3 and 4 can be checked by constructing the SR table.Syntax analysis205

Example of the expression grammarShift/reduce tableEETTFF!E T!T!T F!F! (E )! idETF ()id SR SRR()SRRSSSRRRRidSSSRRS RRRRRSConditions 1-3 are satisfied (there is no conflict in the SR table)Condition 4:IISyntax analysisE ! E T and E ! T but we don’t have l E (see slide 202)T ! T F and T ! F but we don’t have l T (see slide 202)206

Removing rulesRemoving rules of the form A ! is not difficultFor each rule with A in the RHS, add a set of new rules consistingof the di erent combinations of A replaced or not with .Example:S! AbA BB ! b cA ! is transformed intoS! AbA Ab bA b BB ! b cSyntax analysis207

Summary of weak precedence parsingConstruction of a weak precedence parserEliminate ambiguity (or not, see later)Eliminate productions with and ensure that there are no twoproductions with identical RHSConstruct the shift/reduce tableCheck that there are no conflict during the constructionCheck condition 4 of slide 205Syntax analysis208

Using ambiguous grammars with bottom-up parsersAll grammars used in the construction of Shift/Reduce parsingtables must be un-ambiguousWe can still create a parsing table for an ambiguous grammar butthere will be conflictsWe can often resolve these conflicts in favor of one of the choices todisambiguate the grammarWhy use an ambiguous grammar?IIBecause the ambiguous grammar is much more natural and thecorresponding unambiguous one can be very complexUsing an ambiguous grammar may eliminate unnecessary reductionsExample:E ! E E E E (E ) idSyntax analysis)E ! E T TT ! T F FF ! (E ) id209

Set of LR(0) items of the ambiguous expression grammarE ! E E E E (E ) idFollow (E ) { , , , )}) states 7 and 8 haveshift/reduce conflicts for and .(Dragonbook)Syntax analysis210

DisambiguationExample:Parsing of id id id will give the configuration(0E 1 4E 7, id )We can choose:IIACTION[7, ] shift ) precedence to ACTION[7, ] reduce E ! E E ) precedence to Parsing of id id id will give the configuration(0E 1 4E 7, id )We can choose:IIACTION[7, ] shift ) is right-associativeACTION[7, ] reduce E ! E E ) is left-associative(same analysis for I8 )Syntax analysis211

Error detection and recoveryIn table-driven parsers, there is an error as soon as the tablecontains no entry (or an error entry) for the current stack (state)and input symbolsThe least one can do: report a syntax error and give informationabout the position in the input file and the tokens that wereexpected at that positionIn practice, it is however desirable to continue parsing to reportmore errorsThere are several ways to recover from an error:IIIISyntax analysisPanic modePhrase-level recoveryIntroduce specific productions for errorsGlobal error repair212

Panic-mode recoveryIn case of syntax error within a “phrase”, skip until the nextsynchronizing token is found (e.g., semicolon, right parenthesis) andthen resume parsingIn LR parsing:IIISyntax analysisScan down the stack until a state s with a goto on a particularnonterminal A is foundDiscard zero or more input symbols until a symbol a is found that canfollow AStack the state GOTO(s, A) and resume normal parsing213

Phrase-level recoveryExamine each error entry in the parsing table and decide on anappropriate recovery procedure based on the most likely programmererror.Examples in LR parsing: E ! E E E E (E ) idIISyntax analysisid id: is unexpected after a : report a “missing operand” error, push anarbitrary number on the stack and go to the appropriate next stateid id) id:Report a “unbalanced right parenthesis” error and remove the rightparenthesis from the input214

Other error recovery approachesIntroduce specific productions for detecting errors:Add rules in the grammar to detect common errorsExamples for a C compiler:I ! if E I (parenthesis are missing around the expression)I ! if (E ) then I (then is not needed in C)Global error repair:Try to find globally the smallest set of insertions and deletions thatwould turn the program into a syntactically correct stringVery costly and not always e ectiveSyntax analysis215

Building the syntax treeParsing algorithms presented so far only check that the program issyntactically correctIn practice, the parser needs also to build the parse tree (also calledconcrete syntax tree)Its construction is easily embedded into the parsing algorithmTop-down parsing:IISyntax analysisRecursive descent: let each parsing function return the sub-trees forthe parts of the input they parseTable-driven: each nonterminal on the stack points to its node in thepartially built syntax tree. When the nonterminal is replaced by oneof its RHS, nodes for the symbols on the RHS are added as childrento the nonterminal node216

in which tokens are groupedea oftenrepresentedinnamea parsetoken suchas id,1 . Theid is short for identifier. The value 1 isBuilding the syntax treeymbol table produced by the compiler. This table is used to passhe token . In reality it is probably mapped to a pair, whose secondhat there are many different identifiers so we need the second component,mbol .I Each stack element points to a subtree of the syntax treen id,2 en .right.I When performing a reduce, a new syntax tree is built withg and is discussed furtherin subsequent chapters. It is mapped toe something. On the onehand there is onlyatone 3 so rootwe couldjust usenonterminalandthethepopped-o stack elementsammarcontainingrulesas bethecan be a difference between how suchthis shouldprinted (e.g., in an errorhases) and how it should be stored (fixed vs. float vs double). Perhaps thele where an entry for "this kind of 3" is stored. Another possibility is toBottom-up parsing: ; .theas childrenNote:IIn practice, the concrete syntax tree is not built but rather anIn C, most blanks are non-significant.rlly removed during scanning.simplifiedabstract syntax treeIDepending on the complexity of the compiler, the syntax tree mightrs, and the various symbols and punctuation without using ssion (expr). Notedecomposition in the figure on the right.ng)parsing is somewhat arbitrary, but invariably if a recursive definition is involved,g.ch tokens are groupedrepresented in a parsed the syntax tree with operators as interior nodes andrator. The syntax tree on the right corresponds to the parseepresentsansuchassignmentexpression not an assignment statement. In C ancontaining rulesasrailingsemicolon.Thatis,in C (unlike in Algol) the semicolon is a statementSyntax analysis217

Conclusion: top-down versus bottom-up parsingTop-downIIIEasier to implement (recursively), enough for most standardprogramming languagesNeed to modify the grammar sometimes strongly, less general thanbottom-up parsersUsed in most hand-written compilersBottom-up:IIISyntax analysisMore general, less strict rules on the grammar, SLR(1) powerfulenough for most standard programming languagesMore difficult to implement, less easy to maintain (add new rules,etc.)Used in most parser generators like Yacc or Bison (but JavaCC istop-down)218

For your projectThe choice of a parsing technique is left open for the project but weask you to implement the parser by yourself (Yacc, bison or otherparser generators are forbidden)Weak precedence parsing was the recommended method in previousimplementations of this courseMotivate your choice in your report and explain any transformationyou had to apply to your grammar to make it fit the parser’sconstraintsTo avoid mistakes, you should build the parsing tables by programSyntax analysis219

then resume parsing In LR parsing: I Scan down the stack until a state s with a goto on a particular nonterminal A is found I Discard zero or more input symbols until a symbol a is found that can follow A I Stack the state GOTO(s,A)andresumenormalparsing Syntax analysis 213