Semantic analysis Parsing only verifies that the program consists of tokens arranged in a syntactically-valid combination, we now move on to semantic analysis, where we delve deeper to check whether they form a sensible set of instructions in the programming language For a program to be semantically correct, all variables functions, classes, etc are properly defined expressions and variables are used in ways that respect the type system, access control isnt violated, and so on Semantic analysis is the next-to-last phase of the front end and the compilers last chance to weed out incorrect programs We need to ensure the program is well-formed enough to continue on to the next phase where we generate code
Semantic analysis Parsing only verifies that the program consists of tokens arranged in a syntactically-valid combination, we now move on to semantic analysis, where we delve deeper to check whether they form a sensible set of instructions in the programming language. For a program to be semantically correct, all variables, functions, classes, etc. are properly defined, expressions and variables are used in ways that respect the type system, access control isn’t violated, and so on. Semantic analysis is the next-to-last phase of the front end and the compiler’s last chance to weed out incorrect programs. We need to ensure the program is well-formed enough to continue on to the next phase where we generate code
semantic analysis and symbol table a large part of semantic analysis consists of tracking variable/function/type declarations. As we enter each new identifier in our symbol table. we need to record the type information of the declaration Then, as we continue parsing the rest of the program, we make sure that the type of each identifier and expression is respected in terms of the operations being performed
semantic analysis and symbol table A large part of semantic analysis consists of tracking variable/function/type declarations .As we enter each new identifier in our symbol table, we need to record the type information of the declaration. Then, as we continue parsing the rest of the program, we make sure that the type of each identifier and expression is respected in terms of the operations being performed
Examples of the things we check in the semantic analysis phase The type of the right-side expression of an assignment statement should match the type of the left-side, and the left-side needs to be a properly declared and assignable identifier (i.e. not some sort of constant) The parameters of a function should match the arguments of a function call in both number and type The language may require that identifiers are unique disallowing a global variable and function of the same name The operands to multiplication operation will need to be of numeric type, perhaps even the exact same type depending on the strictness of the language
Examples of the things we check in the semantic analysis phase. . The type of the right-side expression of an assignment statement should match the type of the left-side, and the left-side needs to be a properly declared and assignable identifier (i.e. not some sort of constant). The parameters of a function should match the arguments of a function call in both number and type. The language may require that identifiers are unique, disallowing a global variable and function of the same name. The operands to multiplication operation will need to be of numeric type, perhaps even the exact same type depending on the strictness of the language
As we encounter identifiers in a program, we need to determine if the identifier is accessible at that point in the program This is called scope checking One additional issue in semantic analysis is dealing with scopes A scope is a section of program text enclosed by basic program delimiters, e.g,() in C, or begin-end in Pascal. Many languages allow nested scopes that are scopes defined within other scopes. The scope defined by the innermost such unit is called the current scope. The scope defined by the current scope and by any enclosing brogram units are known as open scopes. any pi other scope is a closed scope
As we encounter identifiers in a program, we need to determine if the identifier is accessible at that point in the program. This is called scope checking One additional issue in semantic analysis is dealing with scopes. . A scope is a section of program text enclosed by basic program delimiters, e.g., {} in C, or begin-end in Pascal. Many languages allow nested scopes that are scopes defined within other scopes. The scope defined by the innermost such unit is called the current scope. The scope defined by the current scope and by any enclosing program units are known as open scopes. Any other scope is a closed scope
Syntax-directed translation refers to a method of compiler implementation where the source language translation is completely driven by the parser. In other words, the parsing process and parse trees are used to direct semantic analysis and the translation of the source program This can be a separate phase of a compiler or we can augment our conventional grammar with information to control the semantic analysis and translation. Such grammars are called attribute grammars
Syntax-directed translation refers to a method of compiler implementation where the source language translation is completely driven by the parser. In other words, the parsing process and parse trees are used to direct semantic analysis and the translation of the source program. This can be a separate phase of a compiler or we can augment our conventional grammar with information to control the semantic analysis and translation. Such grammars are called attribute grammars
attribute grammars We augment a grammar by associating attributes with each grammar symbol that describes its properties An attribute has a name and an associated value-a string, a number, a type, a memory location, an assigned register, whatever information we need With each production in a grammar, we give semantic rules or actions. which describe how to compute the attribute values associated with each grammar symbol in a production. The attribute value for a parse node may depend on information from its children nodes below or its siblings and parent node above
attribute grammars. We augment a grammar by associating attributes with each grammar symbol that describes its properties. An attribute has a name and an associated value— a string, a number, a type, a memory location, an assigned register, whatever information we need. With each production in a grammar, we give semantic rules or actions, which describe how to compute the attribute values associated with each grammar symbol in a production. The attribute value for a parse node may depend on information from its children nodes below or its siblings and parent node above
There are two types of attributes we might encounter: synthesized or inherited Synthesized attributes are those attributes that are passed up a parse tree, i.e., the left-side attribute is computed from the right-side attributes. values for the attributes of terminals are usually supplied by the lexical analyzer and the synthesized ones are passed up from there Inherited attributes are those that are passed down a parse tree, i. e, the right-side attributes are derived from the left-side attributes(or other right-side attributes) These attributes are used for passing information about the context to nodes further down the tree
There are two types of attributes we might encounter: synthesized or inherited . Synthesized attributes are those attributes that are passed up a parse tree, i.e., the left-side attribute is computed from the right-side attributes. Values for the attributes of terminals are usually supplied by the lexical analyzer and the synthesized ones are passed up from there . Inherited attributes are those that are passed down a parse tree, i.e., the right-side attributes are derived from the left-side attributes (or other right-side attributes). These attributes are used for passing information about the context to nodes further down the tree
S- attribute grammars, Only have synthesized attributes L- attribute grammars, for every production A→XX2…Xn, each attribute is a synthesized attribute or is a nherited attribute of X (1≤≤n), and the nherited attribute of X (1sisn) depends on the Inherited attribute of a or depends on the attributes ofX, x 2 in every semantics rule
S- attribute grammars,Only have synthesized attributes L- attribute grammars, for every production A→X1X2…Xn,each attribute is a synthesized attribute or is a Inherited attribute of Xj (1≤j≤n),and the Inherited attribute of Xj (1≤j≤n) depends on the Inherited attribute of A or depends on the attributes of X1,X2,…, Xj-1 in every semantics rule
We can implement syntax-directed translation in either a top-down or a bottom-up parser Access to attributes in yacc The syntax $1, $2 is used to access the attribute of the nth token on the right side of the production The global variable yylval is set by the scanner and that value is saved with the token when placed on the parse stack. When a rule is reduced, a new state is placed on the stack, the default behavior is to just copy the attribute of $1 for that new state, this can be controlled by assigning to $S in the action for the rule
We can implement syntax-directed translation in either a top-down or a bottom-up parser. Access to attributes in yacc . The syntax $1, $2 is used to access the attribute of the nth token on the right side of the production. The global variable yylval is set by the scanner and that value is saved with the token when placed on the parse stack. When a rule is reduced, a new state is placed on the stack, the default behavior is to just copy the attribute of $1 for that new state, this can be controlled by assigning to $$ in the action for the rule
intermediate representation Most compilers translate the source first to some form of intermediate representation and convert from there into machine code. The intermediate representation is a machine- and languageindependent version of the original source code. Although converting the code twice introduces another step and thus incurs loss in compiler efficiency, use of an intermediate representation provides advantages in increased abstraction. cleaner separation between the front and back ends, adds possibilities for re-targeting/cross- compilation, and works well with many advanced optimization techniques
intermediate representation Most compilers translate the source first to some form of intermediate representation and convert from there into machine code. The intermediate representation is a machine- and languageindependent version of the original source code. Although converting the code twice introduces another step and thus incurs loss in compiler efficiency, use of an intermediate representation provides advantages in increased abstraction, cleaner separation between the front and back ends, adds possibilities for re-targeting/crosscompilation, and works well with many advanced optimization techniques