Intermediate Code Generation CS308 Compiler Theory 1
Intermediate Code Generation CS308 Compiler Theory 1
Intermediate Code Generation Intermediate codes are machine independent codes,but they are close to machine instructions. ● The given program in a source language is converted to an equivalent program in an intermediate language by the intermediate code generator. Intermediate language can be many different languages,and the designer of the compiler decides this intermediate language. syntax trees can be used as an intermediate language 一 postfix notation can be used as an intermediate language. three-address code (Quadraples)can be used as an intermediate language we will use quadraples to discuss intermediate code generation quadraples are close to machine instructions,but they are not actual machine instructions. some programming languages have well defined intermediate languages java-java virtual machine prolog-warren abstract machine In fact,there are byte-code emulators to execute instructions in these intermediate languages CS308 Compiler Theory 2
Intermediate Code Generation • Intermediate codes are machine independent codes, but they are close to machine instr ctions to machine instructions. • The given program in a source language is converted to an equivalent program in an intermediate language by the intermediate equivalent program in an intermediate language by the intermediate code generator. • Intermediate language can be many different languages, and the designer of the compiler decides this intermediate language. – syntax trees can be used as an intermediate language. – p gg ostfix notation can be used as an intermediate lan gua ge. – three-address code (Quadraples) can be used as an intermediate language • we will use quadraples to discuss intermediate code generation • qp , y uadra ples are close to machine instructions, but the y are not actual machine instructions. – some programming languages have well defined intermediate languages. • java – java virtual machine • prolog – warren abstract machine CS308 Compiler Theory 2 • In fact, there are byte-code emulators to execute instructions in these intermediate languages
Three-Address Code (Quadraples) 。A quadraple is: x :y op z where x,y and z are names,constants or compiler-generated temporaries;op is any operator. But we may also the following notation for quadraples(much better notation because it looks like a machine code instruction) op yizIx apply operator op to y and z,and store the result in x. ·We use the term“three-address code”because each statement usually contains three addresses(two for operands,one for the result). CS308 Compiler Theory 3
Three-Address Code (Quadraples) • A quadraple is: x := y op z where x, y and z are names, constants or compiler-generated temporaries; op is any operator. • But we may also the following notation for quadraples (much better notation because it looks like a machine code instruction) op y,z,x apply operator op to y and z, and store the result in x. • We use the term “three-address code” because each statement usually t i th dd (t f d f th lt) CS308 Compiler Theory 3 con t a ins three addresses (two for operan ds, one for the result)
Three-Address Statements Binary Operator: op y,z,result or result :=y op z where op is a binary arithmetic or logical operator.This binary operator is applied to y and z,and the result of the operation is stored in result. Ex: add a,b,c gt a,b,c addr a,b,c addi a,b,c Unary Operator: op y,,result or result :op y where op is a unary arithmetic or logical operator.This unary operator is applied to y, and the result of the operation is stored in result. Ex: uminus ar c not arc inttoreal a,,c CS308 Compiler Theory 4
Three-Address Statements Binary Operator: op y,z,result or result := y op z w here op i bi ith ti l i l t Thi bi t i li d t is a binary arithmetic or logica l opera tor. This binary opera tor is appli e d to y and z, and the result of the operation is stored in result. Ex: add a,b,c gt a,b,c addr a,b,c addi a b c addi a,b,c Unary Operator: op y,,result or result := op y where op is a unary arithmetic or logical operator. This unary operator is applied to y, and the result of the operation is stored in result. Ex: uminus a,,c uminus a,,c not a,,c inttoreal a,,c CS308 Compiler Theory 4
Three-Address Statements (cont.) Move Operator: mov y,,result or result :=y where the content of y is copied into result. Ex: mov arc movi arC movr ar c Unconditional Jumps:jmp ,,L or gotoL We will jump to the three-address code with the label L,and the execution continues from that statement. Ex: jmp ,L1 /∥jump to L1 jmp ,,7 /jump to the statement 7 CS308 Compiler Theory 5
Three-Address Statements (cont.) Move Operator: mov y,,result or result := y where the content of y is copied into result. Ex: mov a,,c movi a c movi a,, c movr a,,c Unconditional Jumps: jmp ,,L or goto L We will jump to the three-address code with the label L, and the execution continues from that statement. Ex: jmp ,,L1 // jump to L1 jmp 7 ,, 7 // jump to the statement 7 // jump to the statement 7 CS308 Compiler Theory 5
Three-Address Statements (cont.) Conditional Jumps:jmprelop y,z,L or if y relop z goto L We will jump to the three-address code with the label L if the result ofy relop z is true,and the execution continues from that statement.If the result is false,the execution continues from the statement following this conditional jump statement. Ex: jmpgt y,Z,L1 //jump to Ll if y>z jmpgte y,Z,L1 /jump to Ll ify>=z jmpe y,z,L1 /jump to Ll ify=-z jmpne y,Z,L1 /jump to Ll if y!=z Our relational operator can also be a unary operator. jmpnz Y,,L1 /jump to Ll ify is not zero jmpz y,,L1 /jump to Ll ify is zero jmpt y,,L1 /jump to Ll ify is true jmpf y,,L1 /jump to Ll ify is false CS308 Compiler Theory 6
Three-Address Statements (cont.) Conditional Jumps: jmprelop y,z,L or if y relop z goto L We will jump to the three-address code with the label L if the result of y relop z is true, and the execution continues from that statement. If the result is false, the execution continues from the statement followin g jp this conditional jum p statement. Ex: jmpgt y,z,L1 // jump to L1 if y>z jmpgte y,z,L1 // jump to L1 if y>=z jmpe y,z,L1 // jump to L1 if y==z jmpne y,z,L1 // jump to L1 if y!=z Our relational operator can also be a unary operator. jmpnz y,,L1 // jump to L1 if y is not zero j L1 mpz y,,L1 // j t L1 if i // jump to L1 if y is zero jmpt y,,L1 // jump to L1 if y is true jmpf y,,L1 // jump to L1 if y is false CS308 Compiler Theory 6 jmpf y,,L1 // jump to L1 if y is false
Three-Address Statements (cont.) Procedure Parameters: param x,,or param x Procedure Calls: call p,n,or call p,n where x is an actual parameter,we invoke the procedure p with n parameters. Ex: param x param ¥21I →p(x1,·.·,xm) param XnlI call pin, f(x+1,y)→ add x,1,t1 param t1,, param y, call f,2, CS308 Compiler Theory 7
Three-Address Statements (cont.) Procedure Parameters: param x,, or param x Procedure Calls: call p,n, or call p,n where x is an actual parameter, we invoke the procedure p with n parameters. Ex: param x 1,, param x 2,, Î p(x 1,...,x n) param x n,, call p,n, f(x+1,y) Î add x,1,t1 param t1,, param y,, call f,2, CS308 Compiler Theory 7 call f,2
Three-Address Statements (cont.) Indexed Assignments: move y[i],x or x :=y[i] move x,,y[i] or y[i]:x Address and Pointer Assignments: moveaddr y,x or X:= &y movecont y,,x or x :*y CS308 Compiler Theory 8
Three-Address Statements (cont.) Indexed Assignments: move y[i],,x or x := y[i] move x,,y[i] or y[i] := x Address and Pointer Assignments: moveaddr y,,x or x := &y movecont y,,x or x := *y CS308 Compiler Theory 8
Syntax-Directed Translation into Three-Address Code S→id:=E S.code=E.code ll gen('mov'E.placeid.place) E→E1+E2 E.place newtemp(); E.code=Ei.code‖E2.code‖gen(add'E1.place‘,'E,place‘,'E.place) E→E1*E2 E.place newtemp(); E.code=Ecodel E2.codell gen(mult'Eplace,E2-place,E.place) E→-E1 E.place newtemp(); E.code=E1.code ll gen('uminus'E1.place,'E.place) E→(E1) E.place=E1.place; E.code=E.code E→id E.place=id.place; E.code null CS308 Compiler Theory
Syntax-Directed Translation into Three-Address Code S → id := E S.code = E.code || gen(‘mov’ E.place ‘,,’ id.place) E → E E E l () 1 + E 2 E.p lace = newtemp(); E.code = E1.code || E 2.code || gen(‘add’ E1.place ‘,’ E 2.place ‘,’ E.place) E → E1 * E 2 E place = newtemp(); 1 E 2 E.place newtemp(); E.code = E1.code || E 2.code || gen(‘mult’ E1.place ‘,’ E 2.place ‘,’ E.place) E → - E1 E.place = newtemp(); E.code = E1.code || gen(‘uminus’ E1.place ‘,,’ E.place) E → ( E1 ) E.place = E1.place; Ed E .co de = E1.co d e E → id E.place = id.place; E code . = null CS308 Compiler Theory 9
Syntax-Directed Translation (cont.) S-→while E do S, S.begin newlabelO); S.after newlabelO; S.code=gen(S.begin“:")‖E.code ll gen(‘jmpf E.place‘,'S.after)‖S1.code‖ gen(jmp',,'S.begin)ll gen(S.after :" S->if E then S else S2 S.else=newlabel(); S.after newlabel(); S.code=E.code‖ gen(jmpf E.place'S.else)ll S1.codell gen(jmp'‘,'S.after)‖ gen(S.else")Il S2.codell gen(S.after :" CS308 Compiler Theory 10
Syntax-Directed Translation (cont.) S → while E do S1 S.begin = newlabel(); S f l b l() S.a fter = new l a b el(); S.code = gen(S.begin “:”) || E.code || gen(‘jmpf’ E place ‘ ’ S after) || S1 gen( jmpf E.place ,, S.after) || S code || 1.code || gen(‘jmp’ ‘,,’ S.begin) || gen(S.after ‘:”) S → if E then S1 else S 2 S.else = newlabel(); S.after = newlabel(); S d E d || S.co de = E.co de || gen(‘jmpf’ E.place ‘,,’ S.else) || S1.code || gen( jmp ‘ ’ ‘,,’ S after) || S.after) || gen(S.else ‘:”) || S 2.code || gen(S.after ‘:”) CS308 Compiler Theory 10