Lexical Analysis (2)
Sukree Sinthupinyo
Department of Computer Engineering, Chulalongkorn University
14 July 2012
Outline
1 Recognition of Tokens
Transition Diagrams
Completion of the Running Example
Architecture of a Transition-Diagram-Based Lexical Analyzer
Recognizing a keyword
2 Lexical-Analyzer Generator Lex
Use of Lex
Structure of Lex
Lex Demonstration
Example
We will use this grammar as our first example
Example
Tokens, Patterns, and Attribute Values
Transition Diagrams
We convert regular expressions into transition diagrams.
Nodes represent states.
Edges represent transitions from one state to another; each edge is labeled with a symbol or a set of symbols.
If the current state has an outgoing edge labeled a and the next input symbol is a, we move to the state that edge leads to.
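The rule above can be sketched as a table lookup. This is a minimal illustration, not a diagram from the slides: the two-state table and the names TRANSITIONS, move, and run are assumptions made for the example.

```python
# Hypothetical diagram: state 0 --a--> state 1, state 1 --b--> state 0.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 0,
}

def move(state, symbol):
    """Return the next state, or None if no edge labeled `symbol` leaves `state`."""
    return TRANSITIONS.get((state, symbol))

def run(text, start=0):
    """Follow edges for as long as the next symbol has a matching edge."""
    state = start
    for ch in text:
        nxt = move(state, ch)
        if nxt is None:
            break  # no edge for this symbol: stop where we are
        state = nxt
    return state
```

Each key pairs the current state with an edge label, so one dictionary lookup performs one transition.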
Transition Diagrams (cont.)
Conventions about transition diagrams
An accepting, or final, state indicates that a lexeme has been found. It is drawn as a double circle.
If it is necessary to retract the forward pointer one position, then we shall place a * near that accepting state.
There is one start state, or initial state.
Transition Diagram for relop
Recognition of Reserved Words and Identifiers
There are two ways to recognize keywords such as if, then, and else:
Install the reserved words in the symbol table initially. Any lexeme not already in the table is a new identifier, with token id.
Create a separate transition diagram for each keyword.
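The first approach can be sketched with a dictionary as the symbol table. The names KEYWORDS, new_symbol_table, and install_id, and the token names IF, THEN, ELSE, and ID, are assumptions made for illustration.

```python
KEYWORDS = ("if", "then", "else")

def new_symbol_table():
    """Create a table with the reserved words installed up front."""
    return {kw: kw.upper() for kw in KEYWORDS}

def install_id(table, lexeme):
    """Return the token name for `lexeme`, adding it as an ID if it is new."""
    # setdefault leaves reserved words untouched and inserts unseen lexemes.
    return table.setdefault(lexeme, "ID")
```

Because the keywords are already in the table, a single identifier diagram is enough: the lookup afterwards decides whether the lexeme is a keyword or an id.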
Completion of the Running Example
A transition diagram for unsigned numbers.
A transition diagram for whitespace.
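As a rough sketch of what these two diagrams do, assuming the usual textbook pattern digits (. digits)? (E [+-]? digits)? for unsigned numbers and blank, tab, and newline for whitespace. The function names are hypothetical.

```python
DIGITS = "0123456789"

def scan_number(text, pos=0):
    """Return the end index of the longest unsigned number starting at pos,
    or None if no number starts there."""
    n = len(text)

    def digits(i):
        # Advance past a run of digits and return the index after it.
        while i < n and text[i] in DIGITS:
            i += 1
        return i

    end = digits(pos)
    if end == pos:
        return None  # no leading digit
    # Optional fraction: accept only if at least one digit follows the dot.
    if end < n and text[end] == ".":
        j = digits(end + 1)
        if j > end + 1:
            end = j
    # Optional exponent: E, optional sign, then at least one digit.
    if end < n and text[end] == "E":
        k = end + 1
        if k < n and text[k] in "+-":
            k += 1
        j = digits(k)
        if j > k:
            end = j
    return end

def skip_whitespace(text, pos=0):
    """Return the index of the first non-whitespace character at or after pos."""
    while pos < len(text) and text[pos] in " \t\n":
        pos += 1
    return pos
```

When a fraction or exponent turns out to be incomplete, the scanner falls back to the last complete number, which plays the role of retracting the forward pointer.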
Architecture of a Transition-Diagram-Based Lexical Analyzer
Example Code for relop
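A minimal sketch of such code, assuming the relop diagram uses state 0 as the start state, state 1 after reading <, and state 6 after reading >. The function name get_relop and the attribute names LT, LE, NE, EQ, GT, and GE are assumptions made for illustration.

```python
def get_relop(text, pos=0):
    """Return (attribute, next_pos) for a relop starting at text[pos], else None."""
    state = 0
    i = pos
    while True:
        c = text[i] if i < len(text) else ""  # "" stands for end of input
        if state == 0:
            if c == "<":
                state = 1
            elif c == "=":
                return ("EQ", i + 1)
            elif c == ">":
                state = 6
            else:
                return None  # the lexeme is not a relop
            i += 1
        elif state == 1:
            if c == "=":
                return ("LE", i + 1)
            if c == ">":
                return ("NE", i + 1)
            return ("LT", i)  # starred state: retract, c is not consumed
        elif state == 6:
            if c == "=":
                return ("GE", i + 1)
            return ("GT", i)  # starred state: retract, c is not consumed
```

The returned position excludes the lookahead character in the two starred cases, which is what retracting the forward pointer amounts to.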
Architecture of a Transition-Diagram-Based Lexical Analyzer (cont.)
Several transition diagrams can be used together in one of three ways:
Try them sequentially. The function fail() resets the forward pointer and starts the next transition diagram.
Run them in parallel. Feed the next input character to all of them and allow each one to make whatever transition it requires.
Combine all the transition diagrams into one.
For example, we combine states 0, 9, 12, and 22 into one start state.
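The sequential strategy can be sketched as follows. The three matchers are simplified stand-ins for real transition diagrams, and all names here (match_relop, match_id, match_number, DIAGRAMS, next_token) are assumptions made for illustration.

```python
def match_relop(text, pos):
    # Longer operators are listed first so that "<=" wins over "<".
    for op in ("<=", "<>", ">=", "<", "=", ">"):
        if text.startswith(op, pos):
            return ("relop", pos + len(op))
    return None

def match_id(text, pos):
    end = pos
    if end < len(text) and text[end].isalpha():
        while end < len(text) and text[end].isalnum():
            end += 1
        return ("id", end)
    return None

def match_number(text, pos):
    end = pos
    while end < len(text) and text[end] in "0123456789":
        end += 1
    return ("number", end) if end > pos else None

DIAGRAMS = [match_relop, match_id, match_number]

def next_token(text, lexeme_begin):
    """Try each diagram in order; return (token, lexeme, next_pos) or None."""
    for diagram in DIAGRAMS:
        forward = lexeme_begin  # what fail() does: reset forward, try the next diagram
        result = diagram(text, forward)
        if result is not None:
            token, forward = result
            return token, text[lexeme_begin:forward], forward
    return None  # every diagram failed: lexical error
```

Every diagram starts again from the beginning of the lexeme, so the order of DIAGRAMS decides which token wins when more than one diagram could match.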
KMP Algorithm
The Knuth, Morris, and Pratt (KMP) algorithm
Recognizes a single keyword $b_1 b_2 \cdots b_n$ in a text string.
For example, the diagram for the keyword ababaa is
We must find the failure function f(s), computed as follows.
Failure Function
ababaa
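A minimal sketch of the standard failure-function computation, applied to ababaa. The function name and the list representation are assumptions; states are numbered 1 to n as in the text, so keyword[s] is the character written b_{s+1}.

```python
def failure_function(keyword):
    """Return [f(1), ..., f(n)] for the keyword b_1 ... b_n."""
    n = len(keyword)
    f = [0] * (n + 1)  # f[0] is unused; f[1] = 0
    t = 0
    for s in range(1, n):
        # keyword[s] is b_{s+1} and keyword[t] is b_{t+1} (0-based indexing).
        while t > 0 and keyword[s] != keyword[t]:
            t = f[t]
        if keyword[s] == keyword[t]:
            t += 1
        f[s + 1] = t
    return f[1:]

print(failure_function("ababaa"))  # [0, 0, 1, 2, 3, 1]
```

For ababaa, f(5) = 3 because after matching ababa the longest proper prefix that is also a suffix is aba.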
Failure Function (cont.)
To determine whether a keyword $b_1 b_2 \cdots b_n$ is a substring of a string $a_1 a_2 \cdots a_m$, we use the following algorithm.
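The matching step can be sketched as below, assuming the standard KMP scan. The function name kmp_find is hypothetical, and the failure function is recomputed inside so that the block stands alone.

```python
def kmp_find(keyword, text):
    """Return True if keyword occurs as a substring of text."""
    n = len(keyword)
    if n == 0:
        return True
    # Failure function, stored 1-based: f[s] for states s = 1..n.
    f = [0] * (n + 1)
    t = 0
    for s in range(1, n):
        while t > 0 and keyword[s] != keyword[t]:
            t = f[t]
        if keyword[s] == keyword[t]:
            t += 1
        f[s + 1] = t
    # Scan the text once; s is the number of keyword characters matched so far.
    s = 0
    for a in text:
        while s > 0 and a != keyword[s]:
            s = f[s]  # fall back instead of rescanning the text
        if a == keyword[s]:
            s += 1
        if s == n:
            return True
    return False
```

The text is never re-read: on a mismatch only the state s moves backwards, following the failure function.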
Use of Lex
Process of Lex
Form of Lex
declarations
%%
translation rules
%%
auxiliary functions
Declarations
Translation Rules
Auxiliary Functions
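The three parts can be mimicked in Python to show what each one contributes. This is an analogy, not Lex syntax: the names DIGIT, LETTER, RULES, and tokenize and the token names are assumptions made for illustration. The sketch follows Lex's rule of preferring the longest match and, on a tie, the rule listed first.

```python
import re

# "Declarations": named patterns reused by the rules below.
DIGIT = r"[0-9]"
LETTER = r"[A-Za-z]"

# "Translation rules": each entry pairs a pattern with an action.
RULES = [
    (re.compile(r"[ \t\n]+"), lambda lexeme: None),  # whitespace: no token
    (re.compile(r"if|then|else"), lambda lexeme: (lexeme.upper(), lexeme)),
    (re.compile(rf"{LETTER}({LETTER}|{DIGIT})*"), lambda lexeme: ("ID", lexeme)),
    (re.compile(rf"{DIGIT}+"), lambda lexeme: ("NUMBER", lexeme)),
    (re.compile(r"<=|<>|>=|<|=|>"), lambda lexeme: ("RELOP", lexeme)),
]

# "Auxiliary functions": helper code used to drive the rules.
def tokenize(text):
    """Return the list of tokens produced by applying RULES to text."""
    pos, tokens = 0, []
    while pos < len(text):
        best_len, best_action = 0, None
        for pattern, action in RULES:
            m = pattern.match(text, pos)
            # Strict > keeps the earlier rule when two matches tie in length.
            if m and len(m.group()) > best_len:
                best_len, best_action = len(m.group()), action
        if best_action is None:
            raise ValueError(f"no rule matches at position {pos}")
        token = best_action(text[pos:pos + best_len])
        if token is not None:
            tokens.append(token)
        pos += best_len
    return tokens
```

For instance, tokenize("iffy") yields a single ID because the identifier rule matches more characters than the keyword rule, while tokenize("if") yields IF because the keyword rule is listed first.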