
1.7 An Example of Compilation

In the rest of this chapter I show, very briefly, the action of some of the phases of compilation taking as example input the program of figure 1.4. First of all the characters of the program are read in, then these characters are lexically analysed into separate items. Some of the items will represent source program identifiers and will include a pointer to the symbol table.³ Figure 1.5 shows part of the results of lexical analysis: in most languages subdivision of the input into items can be performed without regard to the context in which the name occurs.
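As a rough illustration of this step, here is a minimal Python sketch of a lexical analyser that splits a line of input into items and gives every name item a pointer to a single shared symbol table entry. The names `SymbolEntry`, `Item`, `TOKEN` and `lex`, the token set and the sample input are all assumptions made for the example; they are not taken from figures 1.4 or 1.5.

```python
import re
from dataclasses import dataclass, field


@dataclass
class SymbolEntry:
    name: str
    descriptor: dict = field(default_factory=dict)  # filled in by later phases


@dataclass
class Item:
    kind: str      # 'name', 'number' or 'symbol'
    value: object  # SymbolEntry for names, int for numbers, str for symbols


# Hypothetical token set: identifiers, unsigned integers and a few symbols.
TOKEN = re.compile(
    r"\s*(?:(?P<name>[A-Za-z]\w*)|(?P<number>\d+)|(?P<symbol>:=|[-+*/=();,]))"
)


def lex(text, table):
    """Split text into items; each name item points at its symbol table entry."""
    items, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        pos = m.end()
        if m.group("name"):
            # Every occurrence of a name shares one entry in the table.
            entry = table.setdefault(m.group("name"), SymbolEntry(m.group("name")))
            items.append(Item("name", entry))
        elif m.group("number"):
            items.append(Item("number", int(m.group("number"))))
        else:
            items.append(Item("symbol", m.group("symbol")))
    return items


table = {}
items = lex("a := a + 1", table)
```

Both occurrences of `a` end up pointing at the same `SymbolEntry` object, which is what lets later phases attach information to a name once and find it from every item that mentions the name.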

Next, the syntax analyser examines the sequence of items produced by the lexical analyser and discovers how the items relate together to form ‘phrase’ fragments, how the phrases inter-relate to form larger phrases and so on. The most general description of these relationships of the fragments is a tree, as shown in figure 1.6.

Each node of the tree includes a tag or type field which identifies the kind of source program fragment described by that node, together with a number of pointers to nodes which describe the subphrases which make it up. Figure 1.7 shows how part of the tree shown in figure 1.6 might actually be represented as a data structure. There are of course a number of different ways of representing the same tree, and that shown in figure 1.7 is merely an example. There are as many different ways of drawing the structure, and throughout most of this book I shall show trees in the manner of figure 1.6. The only significant differences between the two picturings are that, in figure 1.6, nodes aren’t shown as sequences of boxes and names aren’t shown as pointers to the symbol table. Figure 1.7 is perhaps more faithful to reality, so it should be borne in mind whenever you encounter a simplified representation like that in figure 1.6.
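One possible representation of such nodes, sketched in Python under stated assumptions: each node carries a tag and a list of fields, and a `name` leaf holds a pointer to a symbol table entry rather than the characters of the name. The classes `SymbolEntry` and `Node`, the function `show` and the small assignment tree are hypothetical and do not reproduce figure 1.7.

```python
from dataclasses import dataclass, field


@dataclass
class SymbolEntry:
    name: str


@dataclass
class Node:
    tag: str                                    # kind of fragment: 'assign', '+', 'name', ...
    fields: list = field(default_factory=list)  # subnodes, or a symbol entry / value in a leaf


def show(node, indent=0):
    """Return the tree as indented lines, one node per line."""
    pad = "  " * indent
    if node.tag == "name":
        return [f"{pad}name -> {node.fields[0].name}"]
    if node.tag == "number":
        return [f"{pad}number {node.fields[0]}"]
    lines = [pad + node.tag]
    for sub in node.fields:
        lines.extend(show(sub, indent + 1))
    return lines


a, b = SymbolEntry("a"), SymbolEntry("b")
# Tree for the hypothetical statement  a := b + 1
tree = Node("assign", [Node("name", [a]),
                       Node("+", [Node("name", [b]), Node("number", [1])])])
print("\n".join(show(tree)))
```

The `show` function draws the structure in the simplified manner, while the `Node` objects themselves correspond to the more literal boxes-and-pointers picture.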

After syntax analysis, the object description phase takes the tree and the symbol table entries produced by the lexical analyser. It analyses the declarative nodes in the tree, producing descriptive information in the symbol table as shown in figure 1.8. Standard procedures, such as ‘print’, may receive a default declaration before translation: figure 1.8 shows a possible entry in the symbol table.

Note that, since each node of the tree which refers to an identifier contains a pointer to the symbol table, neither the object description phase nor the translator need search the symbol table: each need merely follow the pointer to the relevant entry.
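The division of labour can be sketched as follows, assuming a very simple descriptor that records only a kind and a cell offset. `SymbolEntry`, `NameNode`, `describe` and `address_of` are invented names, and the allocation of consecutive cells is an assumption of the sketch rather than the layout of figure 1.8.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SymbolEntry:
    name: str
    kind: Optional[str] = None     # set by the object description phase
    address: Optional[int] = None  # run-time cell offset, set by the same phase


@dataclass
class NameNode:
    entry: SymbolEntry             # pointer into the symbol table


def describe(declared, first_offset=0):
    """Fill in a descriptor for each declared entry, allocating consecutive cells."""
    for offset, entry in enumerate(declared, start=first_offset):
        entry.kind = "integer variable"
        entry.address = offset


def address_of(node):
    """The translator reaches the descriptor through the node: no table search."""
    return node.entry.address


a, b = SymbolEntry("a"), SymbolEntry("b")
describe([a, b])
node = NameNode(b)
```

Because `node.entry` is the entry itself, `address_of(node)` is a single pointer dereference however large the symbol table has grown.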

After the object description phase has filled in the descriptors in the symbol table, a simple translation phase can take the tree of figure 1.7 together with the symbol table of figure 1.8 and produce an instruction sequence like that shown in figure 1.9.⁴

The addresses used in the instructions need finally to be relocated by the loader. Suppose that, when the program is loaded into store, its memory cell space starts at address 23. Then the first line of figure 1.9 would be converted into ‘LOAD 1, 24’ and the last line into ‘STORE 1, 23’.

Optimisation of the object code in this case could produce an enormous improvement in its execution efficiency. The total effect of the program is to print the number ‘4’. By looking at the way in which the values of variables are used throughout the program and by deferring translation of assignment statements until their result is required – in this case they are never required – an optimisation phase could reduce the program just to the single statement ‘print(4)’.

Optimisation is a mighty sledgehammer designed for bigger nuts than this example, of course, and it is always an issue whether the expense of optimisation is worth it: in the case of figure 1.4 it would certainly cost more to optimise the program than it would to run it!

³ The item which represents a number may also contain a pointer to a table which contains a representation of the number. For simplicity figures 1.5 and 1.6 show the value of the number as part of the lexical item itself.

⁴ See appendix B for a brief description of the assembly code instructions used in this and other examples.

Summary

The underlying organisation of compilers is simple and modular. This chapter discusses how the various phases cooperate so that later chapters can concentrate on the separate phases in isolation.

Input and lexical analysis is discussed in chapters 4 and 8; syntax analysis in chapters 3, 16, 17 and 18; object description in chapter 8; translation in chapters 5, 6, 7 and 9; optimisation in chapter 10; loading in chapter 4; run-time support in chapters 11, 12, 13 and 14; run-time debugging in chapter 20.

Chapter 2

Introduction to Translation

The most important task that a compiler performs is to translate a program from one language into another – from source language to object language.

Simple translation is a mechanism which takes a representation of a fragment of the source program and produces an equivalent fragment in the object language – a code fragment which, when executed by the object machine, will perform the operations specified by the original source fragment.

Since the object program produced by a simple translator consists of a sequence of relatively independent object code fragments, it will be less efficient than one produced by a mechanism which pays some attention to the context in which each fragment must operate. Optimisation is a mechanism which exists to cover up the mistakes of simple translation: it translates larger sections of program than the simple translator does, in an attempt to reduce the object code inefficiencies caused by poor interfacing of code fragments.
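To make ‘poor interfacing of code fragments’ concrete, here is a small Python sketch in which two assignments are translated independently and a later pass removes the redundant instruction at the join. The cleanup shown is a simple adjacent-instruction check, used here only as a stand-in for the idea; it is not the optimisation mechanism this book goes on to describe. The mnemonics and the function names `translate_assignment` and `remove_redundant_loads` are assumptions, and the check is only safe for straight-line code with no jump into the middle of the sequence.

```python
def translate_assignment(target, source):
    """Simple translation: each statement becomes an independent fragment."""
    return [f"LOAD 1, {source}", f"STORE 1, {target}"]


def remove_redundant_loads(code):
    """Drop a LOAD that re-fetches what the preceding STORE has just written."""
    result = []
    for instruction in code:
        if result and instruction.startswith("LOAD "):
            previous = result[-1]
            if (previous.startswith("STORE ")
                    and previous[len("STORE "):] == instruction[len("LOAD "):]):
                continue  # the register already holds this value
        result.append(instruction)
    return result


# b := a; c := b  translated one statement at a time
code = translate_assignment("b", "a") + translate_assignment("c", "b")
improved = remove_redundant_loads(code)
```

Each fragment is correct in isolation; the waste appears only where the two fragments meet, which is why it takes a mechanism that looks at more than one fragment to remove it.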

In order to be able to produce object code phrases the translator must have access to a symbol table which provides a mapping from source program names to the run-time objects which they denote. This table is built by the lexical analyser (see chapters 4 and 8) which correlates the various occurrences of each name throughout the program. The mapping to run-time objects is provided by the object description phase (see chapter 8) which processes declarative information from the source program to associate each identifier in the symbol table with a description of a run-time object.

This chapter introduces the notion of a ‘tree-walking’ translator, which I believe is a mechanism that is not only easy to construct but which can readily and reliably produce efficient object code fragments. Such a translator consists of a number of mutually recursive procedures, each of which is capable of translating one kind of source program fragment and each of which can generate a variety of different kinds of object code fragments depending on the detailed structure of the source fragment which is presented to it.
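A minimal sketch of the shape of such a translator, written in Python and applied to the statement ‘if hour*60+minute=1050 or tired then leave(workplace)’. Everything in it is an assumption made for illustration: the tuple encoding of the tree, the procedure names, the register numbering and above all the instruction mnemonics, which are invented and are not the instruction set of appendix B. It evaluates the condition as a value in a register, which is the simplest scheme to show and not necessarily the one developed in later chapters.

```python
from itertools import count

OPS = {"+": "ADD", "*": "MULT", "=": "EQ", "or": "OR"}  # illustrative mnemonics


def trans_expr(node, reg, code):
    """Emit code leaving the value of the expression in register reg."""
    tag = node[0]
    if tag == "number":
        code.append(f"LOADn {reg}, {node[1]}")
    elif tag == "name":
        code.append(f"LOAD {reg}, {node[1]}")
    else:
        trans_binary(node, reg, code)


def trans_binary(node, reg, code):
    """Choose a code fragment according to the shape of the right operand."""
    tag, left, right = node
    trans_expr(left, reg, code)
    if right[0] == "number":
        code.append(f"{OPS[tag]}n {reg}, {right[1]}")    # immediate operand
    elif right[0] == "name":
        code.append(f"{OPS[tag]} {reg}, {right[1]}")     # operand in store
    else:
        trans_expr(right, reg + 1, code)                 # operand in a register
        code.append(f"{OPS[tag]}r {reg}, {reg + 1}")


def trans_statement(node, code, labels):
    """Translate one statement, calling the other procedures for its parts."""
    tag = node[0]
    if tag == "if":
        _, condition, body = node
        label = f"L{next(labels)}"
        trans_expr(condition, 1, code)
        code.append(f"JUMPFALSE 1, {label}")
        trans_statement(body, code, labels)
        code.append(f"{label}:")
    elif tag == "call":
        _, procedure, arguments = node
        for argument in arguments:
            trans_expr(argument, 1, code)
            code.append("PUSH 1")
        code.append(f"CALL {procedure}")
    else:
        raise ValueError(f"unknown statement kind: {tag}")


def translate(tree):
    code = []
    trans_statement(tree, code, count(1))
    return code


tree = ("if",
        ("or",
         ("=", ("+", ("*", ("name", "hour"), ("number", 60)), ("name", "minute")),
          ("number", 1050)),
         ("name", "tired")),
        ("call", "leave", [("name", "workplace")]))
print("\n".join(translate(tree)))
```

The point to notice is in `trans_binary`: one procedure handles one kind of fragment, yet it produces three different instruction shapes depending on whether the right operand is a number, a name or a larger expression.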

Because the process of translation depends on the selection at each point of one of a number of possible code fragments, translators tend to be voluminous, but since the selection of a fragment is fairly simple and generating the instructions is very straightforward, they tend to run very rapidly. A simple translator will usually use less than 15% of the machine time used by all phases of the compiler together, but will make up more than 50% of the compiler’s own source code.

Statement: if hour*60+minute=1050 or tired then leave(workplace)

conditional statement
    [EXPRESSION] Boolean or
        [LEFT] relation =
            [LEFT] arithmetic +
                [LEFT] arithmetic *
                    [LEFT] name hour
                    [RIGHT] number 60
                [RIGHT] name minute
            [RIGHT] number 1050
        [RIGHT] name tired
    [STATEMENT] procedure call
        [PROCEDURE] name leave
        [ARGUMENTS] name workplace

Figure 2.1: Tree describing a simple statement
