COMPILER CONSTRUCTION Principles and Practice Kenneth C. Louden
3. Context-Free Grammars and Parsing PART ONE
Contents
PART ONE
• 3.1 The Parsing Process
• 3.2 Context-Free Grammars
• 3.3 Parse Trees and Abstract
• 3.4 Ambiguity
PART TWO
• 3.5 Extended Notations: EBNF and Syntax Diagrams
• 3.6 Formal Properties of Context-Free Languages
• 3.7 Syntax of the TINY Language
Introduction
• Parsing is the task of syntax analysis: determining the syntax, or structure, of a program.
• The syntax is defined by the grammar rules of a context-free grammar.
• The rules of a context-free grammar are recursive.
• The basic data structure of syntax analysis is the parse tree or syntax tree.
• The syntactic structure of a language, and hence the tree that represents it, must also be recursive.
3.1 The Parsing Process
Function of a Parser
• Takes the sequence of tokens produced by the scanner as its input and produces the syntax tree as its output.
• Sequence of tokens → Parser → Syntax tree
Issues in Parsing
• The sequence of tokens is not an explicit input parameter.
– The parser calls a scanner procedure getToken to fetch the next token from the input as it is needed during the parsing process.
– The parsing step of the compiler therefore reduces to a single call to the parser: syntaxTree = parse()
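The on-demand token fetching described above can be sketched as follows. This is a minimal illustration, not the book's implementation: the grammar (number { + number }), the Scanner and Parser classes, and the tuple-based tree are all assumptions made for the example. The only idea taken from the slide is that the parser pulls tokens one at a time through a getToken-style call and that parsing reduces to a single parse() call.

```python
import re

# Assumed token shapes: an integer, or any single non-blank character.
_TOKEN = re.compile(r"\s*(?:(\d+)|(\S))")


class Scanner:
    """Hands out one token per get_token() call; nothing is scanned ahead."""

    def __init__(self, source):
        self.source = source
        self.pos = 0

    def get_token(self):
        match = _TOKEN.match(self.source, self.pos)
        if match is None:          # nothing but whitespace is left
            return "EOF"
        self.pos = match.end()
        return match.group(1) or match.group(2)


class Parser:
    """Parses the hypothetical grammar: exp -> number { + number }."""

    def __init__(self, scanner):
        self.scanner = scanner
        self.token = None

    def parse(self):
        self.token = self.scanner.get_token()   # fetch the first token
        tree = self.exp()
        if self.token != "EOF":
            raise SyntaxError(f"unexpected token {self.token!r}")
        return tree

    def exp(self):
        left = self.number()
        while self.token == "+":
            self.token = self.scanner.get_token()
            left = ("+", left, self.number())    # left-associative tree
        return left

    def number(self):
        if not self.token.isdigit():
            raise SyntaxError(f"expected a number, found {self.token!r}")
        value = int(self.token)
        self.token = self.scanner.get_token()    # fetch the next token when needed
        return value


syntax_tree = Parser(Scanner("1 + 2 + 3")).parse()
# syntax_tree is ("+", ("+", 1, 2), 3)
```

Note that parse() never receives a token list; the scanner is the only source of tokens, which mirrors the slide's point that the token sequence is not an explicit parameter.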
Issues in Parsing
• In a single-pass compiler, the parser incorporates all the other phases of the compiler.
• No explicit syntax tree needs to be constructed.
• The parser steps themselves represent the syntax tree implicitly, so a plain call suffices: parse()
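To illustrate the single-pass case, the sketch below emits code while it parses and never builds a tree. Everything specific here is an assumption for the example: the grammar (number { + number }), whitespace-separated tokens, the function name compile_single_pass, and the PUSH/ADD instructions of an imaginary stack machine.

```python
def compile_single_pass(source):
    """Translate 'number { + number }' to stack-machine code while parsing."""
    # Assumption: tokens are separated by whitespace.
    tokens = iter(source.split() + ["EOF"])
    code = []                      # hypothetical target instructions
    token = next(tokens)

    def number():
        nonlocal token
        if not token.isdigit():
            raise SyntaxError(f"expected a number, found {token!r}")
        code.append(f"PUSH {token}")   # code is emitted at the parsing step itself
        token = next(tokens)

    number()
    while token == "+":
        token = next(tokens)
        number()
        code.append("ADD")             # the order of emission encodes the tree shape
    if token != "EOF":
        raise SyntaxError(f"unexpected token {token!r}")
    return code


instructions = compile_single_pass("1 + 2 + 3")
# instructions is ["PUSH 1", "PUSH 2", "ADD", "PUSH 3", "ADD"]
```

The sequence of parsing steps plays the role of the tree: the emitted postfix order is exactly what a traversal of an explicit tree would have produced.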
Issues in Parsing
• In a multi-pass compiler, the later passes use the syntax tree as their input.
– The structure of the syntax tree depends heavily on the particular syntactic structure of the language.
– The tree is usually defined as a dynamic data structure.
– Each node is a record whose fields include the attributes needed for the remainder of the compilation process (i.e., not just those computed by the parser).
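A node record of the kind described above might look like the following sketch. The field names (kind, value, children, lineno, expr_type) and the toy check_types pass are hypothetical; they only illustrate that a node carries fields the parser fills in alongside fields reserved for later passes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TreeNode:
    # Fields filled in by the parser.
    kind: str                                   # e.g. "op" or "const"
    value: Optional[object] = None              # operator symbol or constant value
    children: List["TreeNode"] = field(default_factory=list)
    lineno: int = 0                             # source line, for error messages
    # Field left empty by the parser and filled in by a later pass.
    expr_type: Optional[str] = None


def check_types(node):
    """Toy later pass: annotate every node with a type, children first."""
    for child in node.children:
        check_types(child)
    if node.kind == "const":
        node.expr_type = "integer"
    elif all(child.expr_type == "integer" for child in node.children):
        node.expr_type = "integer"
    else:
        node.expr_type = "error"


# Tree for 1 + 2, as a parser might build it.
tree = TreeNode("op", "+", [TreeNode("const", 1), TreeNode("const", 2)])
check_types(tree)   # tree.expr_type is now "integer"
```

Because nodes are allocated as needed and linked through children, the tree grows dynamically with the program being parsed.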
Issues in Parsing
• The treatment of errors is more difficult for the parser than for the scanner.
• Error in the scanner
– Generate an error token and consume the offending character.
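The scanner's error strategy can be sketched in a few lines. The token kinds ("NUM", "OP", "ERROR"), the accepted character set, and the function name scan are assumptions for illustration; the behavior shown is the one stated above: on an unrecognized character, emit an error token, consume that character, and carry on.

```python
def scan(source):
    """Return a list of (kind, text) tokens; bad characters become ERROR tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        ch = source[pos]
        if ch.isspace():
            pos += 1                              # skip whitespace
        elif ch.isdigit():
            start = pos
            while pos < len(source) and source[pos].isdigit():
                pos += 1
            tokens.append(("NUM", source[start:pos]))
        elif ch in "+-*()":
            tokens.append(("OP", ch))
            pos += 1
        else:
            tokens.append(("ERROR", ch))          # generate an error token ...
            pos += 1                              # ... and consume the offending character
    return tokens


result = scan("1 $ 2")
# result is [("NUM", "1"), ("ERROR", "$"), ("NUM", "2")]
```

Scanning resumes immediately after the bad character, so one stray symbol costs exactly one error token and the rest of the input is still tokenized.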