CS308 Compiler Theory CS308 Compiler Theory
CS308 Compiler Theor y CS308 Compiler Theory 1
Syntax-Directed Translation Grammar symbols are associated with attributes to associate information with the programming language constructs that they represent. Values of these attributes are evaluated by the semantic rules associated with the production rules. Evaluation of these semantic rules: -may generate intermediate codes may put information into the symbol table may perform type checking may issue error messages may perform some other activities -in fact,they may perform almost any activities. An attribute may hold almost any thing. a string,a number,a memory location,a complex record. CS308 Compiler Theory 2
Syntax-Directed Translation • Grammar symbols are associated with attributes to associate information ith the programming lang age constr cts that the information with the programming lang uage constructs that they represent. • Values of these attributes are evaluated by the Values of these attributes are evaluated by the semantic rules semantic rules associated with the production rules. • Evaluation of these semantic rules: – may generate intermediate codes – may put information into the symbol table – ma y p yp g erform type checkin g – may issue error messages – may perform some other activities – in fact, they may perform almost any activities. ivities. • An attribute may hold almost any thing. – a string, a number, a memory location, a complex record. CS308 Compiler Theory 2
Syntax-Directed Definitions and Translation Schemes When we associate semantic rules with productions,we use two notations: Syntax-Directed Definitions Translation Schemes 。 Syntax-Directed Definitions: give high-level specifications for translations hide many implementation details such as order of evaluation of semantic actions. We associate a production rule with a set of semantic actions,and we do not say when they will be evaluated. Translation Schemes: indicate the order of evaluation of semantic actions associated with a production rule. In other words,translation schemes give a little bit information about implementation details. CS308 Compiler Theory 3
Syntax-Directed Definitions and Translation Schemes • When we associate semantic rules with productions, we use two notat ions: – Syntax-Directed Definitions – Translation Schemes • Syntax-Directed Definitions: – give high-level specifications for translations – hide many implementation details such as order of evaluation of semantic actions. – We associate a p yy roduction rule with a set of semantic actions, and we do not sa y when the y will be evaluated. • Translation Schemes: – i di h d f l i f i i i d i h d i l indicate t he or der o f evaluation o f semantic actions associate d wit h a pro duction rule. – In other words, translation schemes give a little bit information about implementation details. CS308 Compiler Theory 3
Syntax-Directed Definitions A syntax-directed definition is a generalization of a context-free grammar in which: -Each grammar symbol is associated with a set of attributes. This set of attributes for a grammar symbol is partitioned into two subsets called synthesized and inherited attributes of that grammar symbol. Each production rule is associated with a set of semantic rules. Semantic rules set up dependencies between attributes which can be represented by a dependency graph. This dependency graph determines the evaluation order of these semantic rules. Evaluation of a semantic rule defines the value of an attribute.But a semantic rule may also have some side effects such as printing a value. CS308 Compiler Theory
Syntax-Directed Definitions • A syntax-directed definition is a generalization of a context-free grammar i hi h n whi c h: – Each grammar symbol is associated with a set of attributes. – This set of attributes for a grammar symbol is partitioned into two subsets called This set of attributes for a grammar symbol is partitioned into two subsets called synthesized and inherited attributes of that grammar symbol. – Each production rule is associated with a set of semantic rules. • Semantic rules set up dependencies between attributes which can be represented by a dependency graph. • This dependency graph determines the evaluation order of these semantic rules. • Evaluation of a semantic rule defines the value of an attribute. But a semantic rule may also have some side effects such as printing a value. CS308 Compiler Theory 4
Syntax-Directed Definition --Example Production Semantic Rules L→E return print(E.val) E→E1+T E.val E.val T.val E→T E.val T.val T→T1*F T.val T val F.val T→F T.val F.val F→(E) F.val E.val F→digit F.val digit.lexval Symbols E,T,and F are associated with a synthesized attribute val ● The token digit has a synthesized attribute lexval(it is assumed that it is evaluated by the lexical analyzer). CS308 Compiler Theory 5
Syntax-Directed Definition -- Example Production Semantic Rules L → E return print(E.val) E → E1 + T E.val = E1.val + T.val E → T E.val = T.val T → T1 * F T.val = T1.val * F.val T → F T lF l .va l = F.va l F → ( E ) F.val = E.val F → digit F val = F.val = digit.lexval • Symbols E T and F are associated with a synthesized attribute Symbols E, T, and F are associated with a synthesized attribute val. • The token digit has a synthesized attribute lexval (it is assumed that it is evaluated by the lexical analyzer). CS308 Compiler Theory 5
Translation Schemes In a syntax-directed definition,we do not say anything about the evaluation times of the semantic rules (when the semantic rules associated with a production should be evaluated?). .A translation scheme is a context-free grammar in which: -attributes are associated with the grammar symbols and semantic actions enclosed between braces are inserted within the right sides of productions. ·Ex: A→{}X{…}Y{…} Semantic Actions CS308 Compiler Theory 6
Translation Schemes • In a syntax-directed definition, we do not say anything about the eval i i fh i l ( h h i l luat ion t imes o f t he semant ic ru les ( w hen t he semant ic ru les associated with a production should be evaluated?). • A translation scheme is a context-free grammar in which: – att ib t i t d ith th b l d tt rib u tes are assoc i a t e d with the grammar sym b o ls an d – semantic actions enclosed between braces {} are inserted within the right sides of productions the right sides of productions. • Ex: A → { }X{ }Y{ } ... } X { ... } Y { ... } S ti A ti CS308 Compiler Theory 6 Semantic A ctions
A Translation Scheme Example A simple translation scheme that converts infix expressions to the corresponding postfix expressions. E→TR R→+T{print((+")}R1 R→E T-id print(id.name)} a+b+c>ab+c+ infix expression postfix expression CS308 Compiler Theory 7
A Translation Scheme Example • A simple translation scheme that converts infix expressions to the correspondi fi i ding postfix express ions. E → T R R → + T { print(“+”) } R1 R → ε T → id { print(id.name) } a+b+c Î ab+c+ infix expression postfix expression CS308 Compiler Theory 7
Type Checking A compiler has to do semantic checks in addition to syntactic checks ● Semantic Checks Static-done during compilation -Dynamic-done during run-time Type checking is one of these static checking operations we may not do all type checking at compile-time -Some systems also use dynamic type checking too. A type system is a collection of rules for assigning type expressions to the parts of a program. A type checker implements a type system. A sound type system eliminates run-time type checking for type errors. A programming language is strongly-typed,if every program its compiler accepts will execute without type errors. In practice,some of type checking operations are done at run-time(so,most of the programming languages are not strongly-typed). Ex:int x[100];...x[i]>most of the compilers cannot guarantee that i will be between 0 and 99 CS308 Compiler Theory 8
Type Checking • A compiler has to do semantic checks in addition to syntactic checks. • Semantic Checks Semantic Checks – Static – done during compilation – Dynamic – done during run-time • T h ki ype checking is one of these static checking operations is one of these static checking operations. – we may not do all type checking at compile-time. – Some systems also use dynamic type checking too. • A t t ype system i ll ti f l f i i t i t th t f is a collection of rules for assigning type expressions to the parts of a program. • A type checker implements a type system. • A sound type system eliminates run-time type checking for type errors. • A programming language is strongly-typed, if every program its compiler accepts will execute without type errors. – In practice, some of type checking operations are done at run-time (so, most of the programming languages are not strongly-typed). – Ex: int x[100]; … x[i] Î most of the compilers cannot guarantee that i will be between 0 and 99 CS308 Compiler Theory 8
Intermediate Code Generation Intermediate codes are machine independent codes,but they are close to machine instructions. ● The given program in a source language is converted to an equivalent program in an intermediate language by the intermediate code generator. Intermediate language can be many different languages,and the designer of the compiler decides this intermediate language. syntax trees can be used as an intermediate language 一 postfix notation can be used as an intermediate language. three-address code (Quadraples)can be used as an intermediate language we will use quadraples to discuss intermediate code generation quadraples are close to machine instructions,but they are not actual machine instructions. some programming languages have well defined intermediate languages java-java virtual machine prolog-warren abstract machine In fact,there are byte-code emulators to execute instructions in these intermediate languages CS308 Compiler Theory
Intermediate Code Generation • Intermediate codes are machine independent codes, but they are close to machine instr ctions to machine instructions. • The given program in a source language is converted to an equivalent program in an intermediate language by the intermediate equivalent program in an intermediate language by the intermediate code generator. • Intermediate language can be many different languages, and the designer of the compiler decides this intermediate language. – syntax trees can be used as an intermediate language. – p gg ostfix notation can be used as an intermediate lan gua ge. – three-address code (Quadraples) can be used as an intermediate language • we will use quadraples to discuss intermediate code generation • qp , y uadra ples are close to machine instructions, but the y are not actual machine instructions. – some programming languages have well defined intermediate languages. • java – java virtual machine • prolog – warren abstract machine CS308 Compiler Theory 9 • In fact, there are byte-code emulators to execute instructions in these intermediate languages
Three-Address Code (Quadraples) 。A quadraple is: x :y op z where x,y and z are names,constants or compiler-generated temporaries;op is any operator. But we may also the following notation for quadraples(much better notation because it looks like a machine code instruction) op yizIx apply operator op to y and z,and store the result in x. ·We use the term“three-address code”because each statement usually contains three addresses(two for operands,one for the result). CS308 Compiler Theory 10
Three-Address Code (Quadraples) • A quadraple is: x := y op z where x, y and z are names, constants or compiler-generated temporaries; op is any operator. • But we may also the following notation for quadraples (much better notation because it looks like a machine code instruction) op y,z,x apply operator op to y and z, and store the result in x. • We use the term “three-address code” because each statement usually t i th dd (t f d f th lt) CS308 Compiler Theory 10 con t a ins three addresses (two for operan ds, one for the result)