THE METALANGUAGE: LANGUAGE PROCESSING
5.1 SYNTAX AND SEMANTICS
Perhaps the best description begins with an example. In BIBLIO, we would like to ask the question
Generalization of grammar?
to find the list of subjects which have been stated to immediately include the subject grammar in their extent. (By the examples of Chapter III, these would be syntax directed interpretation and compiling.) Clearly, we would find a language implementation based strictly on simple patterns rather strained, so the rules of grammar will be sufficiently general to accept not only relatively simple queries like the above, but much more complex, composed queries. Below is a portion of the syntax of BIBLIO.
The distinguished symbol of BIBLIO, as of every language defined using LWL, is <sentence>. Thus, one rule will be
<sentence> ::= <query>?
which specifies that one valid kind of sentence of BIBLIO is a <query> followed by a question mark. Since the answers to queries will be lists of authors, lists of publications or lists of subjects, we need corresponding syntactic rules. For the above query, we require only
<query> ::= <q_subject>
where <q_subject> is a list of subjects. Then, to allow a simple query like*
Grammar?
we will have a rule
<q_subject> ::= <subject>
*Recall from Chapter III that a query merely naming a subject asks for a list of all subjects covered by it.
Finally, to define the "generalization" query, we need
<q_subject> ::= generalization of <q_subject>
With the above fragmentary grammar, and assuming a lexical rule corresponding to
<subject> ::= grammar
our example sentence is analyzed in the following form:
<sentence>
    <query>
        <q_subject>
            <q_subject>
                <subject>

Generalization of grammar ?
Notice that the generality we have introduced creates some extra levels of analysis in the grammar; however, it immediately allows more complex queries like
Generalization of generalization of grammar?
Generalization of generalization of generalization of grammar?
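It is the recursion in the <q_subject> rule that admits queries nested to any depth. A minimal Python sketch of a recognizer for this fragment follows. It is not how LWL or REL analyzes sentences: the function names, the tuple representation of the analysis, the one-word lexicon, and the folding of upper case to lower case are all assumptions made for illustration.

```python
# Hypothetical lexicon standing in for the lexical rule <subject> ::= grammar.
SUBJECTS = {"grammar"}
PREFIX = "generalization of "


def parse_q_subject(text):
    """Return the analysis of a <q_subject>, or None if text is not one."""
    if text.startswith(PREFIX):
        # <q_subject> ::= generalization of <q_subject>
        inner = parse_q_subject(text[len(PREFIX):])
        if inner is None:
            return None
        return ("q_subject", "generalization of", inner)
    if text in SUBJECTS:
        # <q_subject> ::= <subject>
        return ("q_subject", ("subject", text))
    return None


def parse_sentence(sentence):
    """Return the analysis of a <sentence>, or None if it does not parse."""
    text = sentence.strip().lower()   # case folding is an assumption
    if not text.endswith("?"):
        return None
    # <sentence> ::= <query>?   and   <query> ::= <q_subject>
    q_subject = parse_q_subject(text[:-1].rstrip())
    if q_subject is None:
        return None
    return ("sentence", ("query", q_subject), "?")


def nesting_depth(tree):
    """Count the <q_subject> levels in an analysis from parse_sentence."""
    node = tree[1][1]                 # the outermost <q_subject>
    depth = 1
    while node[1] == "generalization of":
        node = node[2]
        depth += 1
    return depth
```

Under these assumptions, `Generalization of grammar?` yields two <q_subject> levels, as in the analysis shown earlier, and each further `generalization of` adds one more level without any new rule.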
Although the grammar above determines the structure of our sentence, it says nothing about how its meaning (in this case, a reply to the user) is to be computed. We rectify this by associating with each rule the name of a function which will carry out the computation implied by the syntactic transformation. The resulting grammar fragment, as actually written in LWL, is below.
query_forming rule <sentence> ::= <query> '?' : (print)
subject_query rule <query> ::= <q_subject> : (format_subjects)
generalization rule <q_subject> ::= 'generalization of ' <q_subject> : (generalize)
primitive_subject rule <q_subject> ::= <subject> : (single_subject)
Each rule is named, so that it may later be referenced for modification or debugging. Text that is literally mentioned in the left or right hand side of a rule (i.e., terminal symbols) is expressed by quoted strings, since spaces in the object language are significant but spaces in LWL are not. The information following the ":" is the semantic specification; in this case, the names of the appropriate functions. These rules have the side effect of specifying the result types and constituent types of the functions they mention. In general rewrite rules, one semantic specification appears for each non-terminal phrase in the left hand side. An omitted semantic specification implies the identity semantic function.
With the above grammar, the meaning of the question
Generalization of grammar?
is
print( produce_subjects( generalize( single_subject(grammar) ) ) );
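Read from the inside out, this expression is an ordinary composition of functions. The Python sketch below mimics that composition. The function bodies, the `INCLUDED_IN` table, and the reply format are hypothetical stand-ins, not the BIBLIO implementation; the table merely encodes the Chapter III example quoted earlier. In particular, `single_subject` is simplified to return only the named subject, and `print_reply` stands in for `print` so that the Python built-in is not shadowed.

```python
# Hypothetical data: subject -> subjects stated to immediately include it.
INCLUDED_IN = {
    "grammar": ["syntax directed interpretation", "compiling"],
}


def single_subject(name):
    """A <subject> taken as a one-element list of subjects (a simplification)."""
    return [name]


def generalize(subjects):
    """Subjects that immediately include any of the given subjects."""
    result = []
    for subject in subjects:
        for parent in INCLUDED_IN.get(subject, []):
            if parent not in result:
                result.append(parent)
    return result


def produce_subjects(subjects):
    """Turn a list of subjects into reply text, one subject per line."""
    return "\n".join(subjects)


def print_reply(text):
    """Deliver the reply to the user and return it for inspection."""
    print(text)
    return text


reply = print_reply(produce_subjects(generalize(single_subject("grammar"))))
```

Each function consumes exactly the type its inner neighbour produces, which is the sense in which the rules fix the result and constituent types of the functions they mention.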
The phrase marker which represents this computation is produced by the language processor as a result of the syntactic analysis before any actual evaluation is attempted; thus, syntactic and semantic processing do not ordinarily proceed in parallel. This may save considerable semantic computation if spurious partial parses can be rejected for purely syntactic reasons before any semantic computation takes place. Also, this convention minimizes the problem of "undoing" which can haunt many syntax directed compilers; having to undo falsely hypothesized actions based on spurious parses is usually avoided by adopting very simple bounded context grammars. As we shall see below, the processing of semantics synchronously with syntax is possible, though not mandatory.
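The separation of analysis from evaluation can be sketched in Python as two passes: the first builds a phrase marker recording which rule applied at each level, and the second walks the marker, calling the semantic function attached to each rule. Everything here is illustrative rather than a description of the REL language processor: the table layout, the helpers `build_marker` and `evaluate`, the `calls` log, and the placeholder function bodies are assumptions. A rule whose entry is missing or `None` falls back to the identity, mirroring the convention for omitted semantic specifications.

```python
calls = []  # log of semantic functions invoked, for demonstration only


def logged(name, fn):
    """Wrap fn so that every invocation is recorded in calls."""
    def wrapper(arg):
        calls.append(name)
        return fn(arg)
    return wrapper


# Rule name -> semantic function. The bodies are placeholders, not BIBLIO's.
SEMANTICS = {
    "query_forming": logged("print", lambda text: text),
    "subject_query": logged("format_subjects", ", ".join),
    "generalization": logged(
        "generalize", lambda subjects: ["more general than " + s for s in subjects]
    ),
    "primitive_subject": logged("single_subject", lambda name: [name]),
}

PREFIX = "generalization of "


def build_marker(sentence):
    """Syntactic pass only: return a phrase marker or None. No semantics run."""
    text = sentence.strip().lower()
    if not text.endswith("?"):
        return None
    text = text[:-1].rstrip()
    depth = 0
    while text.startswith(PREFIX):
        text = text[len(PREFIX):]
        depth += 1
    if text != "grammar":            # hypothetical one-word lexicon
        return None
    marker = ("primitive_subject", text)
    for _ in range(depth):
        marker = ("generalization", marker)
    return ("query_forming", ("subject_query", marker))


def evaluate(marker):
    """Semantic pass: apply each rule's function to its evaluated constituent."""
    if isinstance(marker, str):
        return marker                # a terminal: its value is its text
    rule, constituent = marker
    fn = SEMANTICS.get(rule) or (lambda value: value)   # omitted -> identity
    return fn(evaluate(constituent))


rejected = build_marker("Generalization of ?")
calls_after_rejection = list(calls)  # nothing has been evaluated yet

marker = build_marker("Generalization of grammar?")
reply = evaluate(marker)
```

Because the ill-formed sentence is rejected in the first pass, no semantic function is ever called for it, so there is nothing to undo; only the complete marker for the valid sentence reaches `evaluate`.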
Let us return again to Chomsky's list, and take up the discussion more specifically. His first question, in our context, asks, "What are the strings of symbols to be considered?" The terminal symbols of every object language are the printable characters of the host computer (the EBCDIC printable characters), augmented by the symbols <string_begin>, <string_end>, <input_terminator> and <carriage_return>. <string_begin> and <string_end> are automatically appended around the input string, for the convenience of syntactic analysis. <input_terminator> is a REL-recognized symbol by which the user indicates the end of his input sentence, and <carriage_return> is the end of line.* The complete vocabulary of every language is encoded
*Carriage return is ordinarily a normal input character, and a special <input_terminator> is required to initiate processing of an input sentence. In TSO, this is (control)-S. Note that most EBCDIC devices do not have lower case characters or some of the special symbols used in the description of LWL. For