7.4.3 Generating the sentence

Having read all the input, we must next generate a random sentence. We know that our input will be a grammar, and that we want to produce a sentence. Our output will be a vector<string> that represents the output sentence.

That's the easy part. The more interesting problem is how the function should work. We know that initially we'll need to find a rule that corresponds to <sentence>. Moreover, we know that we are going to build our output in pieces, which we will assemble from various rules and parts of rules.

In principle, we could concatenate those pieces to form our result. However, because there is no built-in concatenation operation for vectors, we will start with an empty vector and call push_back repeatedly on it.
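As a minimal sketch of that approach, the helper below appends one vector of words to another by calling push_back once per element. The name append is hypothetical and is not part of the program developed here. The sketch also assumes that dest and src are different objects, because push_back may invalidate iterators into the vector that is being grown.

```cpp
#include <cassert>
#include <string>
#include <vector>

using std::string;
using std::vector;

// Hypothetical helper (not part of the chapter's program):
// appends every element of src to dest, one push_back at a time.
void append(vector<string>& dest, const vector<string>& src)
{
    // Assumption: dest and src are distinct objects. Growing a vector
    // while iterating over that same vector could invalidate i.
    assert(&dest != &src);

    for (vector<string>::const_iterator i = src.begin(); i != src.end(); ++i)
        dest.push_back(*i);
}
```

The member call dest.insert(dest.end(), src.begin(), src.end()) has the same effect in a single statement; the loop form is shown only because it mirrors the repeated push_back calls described above.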

These two constraints—starting with <sentence>, and calling push_back repeatedly on an initially empty vector—suggest that we are going to want to define our sentence generator in terms of an auxiliary function, which we will call as follows:

vector<string> gen_sentence(const Grammar& g) {
    vector<string> ret;
    gen_aux(g, "<sentence>", ret);
    return ret;
}

In effect, the call to gen_aux is a request to use the grammar g to generate a sentence according to the <sentence> rule, and to append that sentence to ret.

Our remaining task is to define gen_aux. Before we do so, we note that gen_aux will have to determine whether a word represents a category, which it will do by checking whether the word is bracketed. We shall, therefore, define a predicate to do so:

bool bracketed(const string& s) {
    return s.size() > 1 && s[0] == '<' && s[s.size() - 1] == '>';
}
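A few sample calls may make the predicate's behavior concrete. The sketch below repeats bracketed so that it compiles on its own; the function bracketed_examples and its sample words are illustrative assumptions, not part of the program.

```cpp
#include <cassert>
#include <string>

using std::string;

// Same predicate as in the text, repeated so this sketch stands alone.
bool bracketed(const string& s)
{
    return s.size() > 1 && s[0] == '<' && s[s.size() - 1] == '>';
}

// Hypothetical checks on sample words.
void bracketed_examples()
{
    assert(bracketed("<noun>"));    // a category name
    assert(!bracketed("cat"));      // an ordinary word
    assert(!bracketed("<noun"));    // missing the closing '>'
    assert(!bracketed("noun>"));    // missing the opening '<'
    assert(!bracketed("<"));        // rejected by the size test
    assert(!bracketed(""));         // the size test also keeps
                                    // s[s.size() - 1] from being evaluated
    assert(bracketed("<>"));        // an empty category name still counts
}
```

Because && evaluates its operands left to right and stops at the first false one, the size test runs before either character is examined.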

The job of gen_aux is to expand the input string that it is given as its second argument by looking up that string in the grammar that is its first parameter and placing its output into its third parameter. By "expand" we mean the process that we described in §7.4/129. If our string is bracketed, we then have to find a corresponding rule, which we'll expand in place of the bracketed category. If the input string is not bracketed, then the input itself is part of our output and can be pushed onto the output vector with no further processing:

void
gen_aux(const Grammar& g, const string& word, vector<string>& ret) {
    if (!bracketed(word)) {
        ret.push_back(word);
    } else {
        // locate the rule that corresponds to word
        Grammar::const_iterator it = g.find(word);
        if (it == g.end())
            throw logic_error("empty rule");

        // fetch the set of possible rules
        const Rule_collection& c = it->second;

        // from which we select one at random
        const Rule& r = c[nrand(c.size())];

        // recursively expand the selected rule
        for (Rule::const_iterator i = r.begin(); i != r.end(); ++i)
            gen_aux(g, *i, ret);
    }
}

Our first job is trivial: If the word is not bracketed, it represents itself, so we can append it to ret and we're done. Now comes the interesting part: finding in g the rule that corresponds to our word. You might think that we could simply refer to g[word], but doing so would give us the wrong result. Recall from §7.2/125 that when you try to index a map with a nonexistent key, it automatically creates an element with that key. That will never do in this case, because we don't want to litter our grammar with spurious rules. Moreover, g is a const map, so even if we wanted to create new entries, we couldn't do so. Indeed, [] isn't even defined on a const map.
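The side effect of indexing can be seen in a small sketch. It uses a map<string, int> rather than a Grammar to keep the example short; the function name indexing_demo and the sample keys are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

using std::map;
using std::string;

// Illustrative only: indexing a non-const map with a missing key
// adds an element for that key.
void indexing_demo()
{
    map<string, int> counts;
    counts["<noun>"] = 2;
    assert(counts.size() == 1);

    // Merely reading counts["<verb>"] creates a value-initialized element.
    int n = counts["<verb>"];
    assert(n == 0);
    assert(counts.size() == 2);

    // With a const map the same expression does not compile:
    //     const map<string, int>& view = counts;
    //     view["<adjective>"];    // error: [] is not defined on a const map
}
```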

Evidently, we must use a different facility: The find member of the map class looks for the element, if any, with the given key, and returns an iterator that refers to that element if it can find one. If no such element exists in g, find returns g.end(). The comparison between it and g.end(), therefore, serves to ensure that the rule exists. If it doesn't exist, that means the input was inconsistent (it used a bracketed word without a corresponding rule), so we throw an exception.
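The lookup step can be isolated in a sketch like the following. The typedefs are assumptions that match how this section uses Grammar, Rule_collection, and Rule; the helper rules_for and the function lookup_demo are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using std::logic_error;
using std::map;
using std::string;
using std::vector;

// Assumed typedefs, consistent with how the section uses these names.
typedef vector<string> Rule;
typedef vector<Rule> Rule_collection;
typedef map<string, Rule_collection> Grammar;

// Hypothetical helper: returns the rules for a category, or throws
// if the grammar has no such category. It never modifies g.
const Rule_collection& rules_for(const Grammar& g, const string& category)
{
    Grammar::const_iterator it = g.find(category);
    if (it == g.end())
        throw logic_error("empty rule");
    return it->second;
}

// Illustrative check of both outcomes.
void lookup_demo()
{
    Grammar g;
    Rule noun;
    noun.push_back("cat");
    g["<noun>"].push_back(noun);

    assert(rules_for(g, "<noun>").size() == 1);

    bool threw = false;
    try {
        rules_for(g, "<verb>");
    } catch (const logic_error&) {
        threw = true;
    }
    assert(threw);
    assert(g.size() == 1);    // the failed lookup added nothing to g
}
```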

At this point, it is an iterator that refers to an element of g, which is a map. Dereferencing this iterator yields a pair, the second member of which is the value of the map element. Therefore, it->second denotes the collection of rules that correspond to this category. For convenience, we define a reference named c as a synonym for this object.
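To make the pair explicit, the sketch below names the element type directly. The typedefs are assumptions consistent with this section, and alternatives is a hypothetical helper; it also assumes that it refers to an actual element, not to g.end().

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using std::map;
using std::pair;
using std::string;
using std::vector;

// Assumed typedefs, consistent with how the section uses these names.
typedef vector<string> Rule;
typedef vector<Rule> Rule_collection;
typedef map<string, Rule_collection> Grammar;

// Hypothetical helper: how many alternative rules does the element
// that it refers to hold? Assumes it is dereferenceable.
Rule_collection::size_type alternatives(Grammar::const_iterator it)
{
    // Dereferencing a map iterator yields a pair: the key is first
    // (and is const), the associated value is second.
    const pair<const string, Rule_collection>& element = *it;

    // it->second is shorthand for (*it).second: the same object.
    assert(&element.second == &it->second);

    return element.second.size();
}
```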

Our next job is to select a random element from this collection, which we do in the initialization of r. This code

const Rule& r = c[nrand(c.size())];

is unfamiliar, and is, therefore, worth a close look. First, recall that we defined c to be a Rule_collection, which is a kind of vector. We call a function named nrand, which we will define in §7.4.4/135, to select a random element of this vector. When we give nrand an argument n, it returns a random integer in the range [0, n). Finally, we define r as a synonym for that element.
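Until nrand is defined, a stand-in with the same contract is enough to experiment with the selection step. The sketch below is an assumption, not the definition from §7.4.4: nrand_standin and pick_rule are hypothetical names, the typedefs are inferred from how this section uses them, and the random numbers come from the standard rand function without any seeding.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

using std::domain_error;
using std::string;
using std::vector;

// Assumed typedefs, consistent with how the section uses these names.
typedef vector<string> Rule;
typedef vector<Rule> Rule_collection;

// Stand-in for nrand (an assumption, not the book's definition):
// returns an integer in the half-open range [0, n).
int nrand_standin(int n)
{
    if (n <= 0 || n > RAND_MAX)
        throw domain_error("argument to nrand_standin is out of range");

    // Split the values of rand() into n equal buckets and retry
    // whenever the value falls past the last complete bucket.
    const int bucket_size = RAND_MAX / n;
    int r;
    do {
        r = std::rand() / bucket_size;
    } while (r >= n);
    return r;
}

// Hypothetical helper: selects one rule at random from a
// nonempty collection, as the initialization of r does.
const Rule& pick_rule(const Rule_collection& c)
{
    assert(!c.empty());
    return c[nrand_standin(static_cast<int>(c.size()))];
}
```

Because the upper bound is excluded, every value that the stand-in returns is a valid index into c, including the case in which c has exactly one element.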

Our final task in gen_aux is to examine every element of r. If the element is bracketed, we have to expand it into a sequence of words; otherwise, we append it to ret. What may seem like magic on first reading is that this processing is exactly what we are doing in gen_aux, and therefore we can call gen_aux to do it!

Such a call is called recursive, and it is one of those techniques that looks like it can't possibly work—until you've tried it a few times. To convince yourself that this function works, begin by noting that the function obviously works if word is not bracketed.

Next, assume that word is bracketed, but its rule's right-hand side has no bracketed words of its own. It should still be easy to see that the program will work in this case, because when it makes each recursive call, the gen_aux that it calls will immediately see that the word is not bracketed. Therefore, it will append the word to ret and return.

The next step is to assume that word refers to a slightly more complicated rule: one that uses bracketed words in its right-hand side, but only words that refer to rules with no bracketed words of their own. When you encounter a recursive call to gen_aux, do not try to figure out what it does. Instead, remember that you have already convinced yourself that it works in this case, because you know that at worst, its argument is a category that does not lead to any further bracketed words. Eventually, you will see that the function works in all cases, because each recursive call simplifies the argument.
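One way to follow this argument by hand is to remove the randomness. The sketch below is a variant of gen_aux that always chooses the first alternative, so its result is predictable. The names gen_first and tiny_grammar, the sample grammar, and the typedefs are all assumptions made for this illustration.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using std::logic_error;
using std::map;
using std::string;
using std::vector;

// Assumed typedefs, consistent with how the section uses these names.
typedef vector<string> Rule;
typedef vector<Rule> Rule_collection;
typedef map<string, Rule_collection> Grammar;

bool bracketed(const string& s)
{
    return s.size() > 1 && s[0] == '<' && s[s.size() - 1] == '>';
}

// Hypothetical variant of gen_aux: same recursion, but it always
// takes the first alternative instead of a random one.
void gen_first(const Grammar& g, const string& word, vector<string>& ret)
{
    if (!bracketed(word)) {
        ret.push_back(word);          // base case: the word is itself
        return;
    }

    Grammar::const_iterator it = g.find(word);
    if (it == g.end())
        throw logic_error("empty rule");

    const Rule_collection& c = it->second;
    assert(!c.empty());               // assumption: at least one alternative
    const Rule& r = c[0];

    // recursive case: expand each word of the chosen rule
    for (Rule::const_iterator i = r.begin(); i != r.end(); ++i)
        gen_first(g, *i, ret);
}

// Hypothetical sample grammar with one alternative per category.
Grammar tiny_grammar()
{
    Rule sentence;
    sentence.push_back("the");
    sentence.push_back("<noun>");
    sentence.push_back("<verb>");

    Rule noun;
    noun.push_back("cat");

    Rule verb;
    verb.push_back("sits");

    Grammar g;
    g["<sentence>"].push_back(sentence);
    g["<noun>"].push_back(noun);
    g["<verb>"].push_back(verb);
    return g;
}
```

With this grammar, calling gen_first(tiny_grammar(), "<sentence>", out) on an empty vector out leaves it holding the three words the, cat, and sits. The first recursive call appends "the" directly, and the other two each expand a category whose rule contains no bracketed words.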

We do not know any sure way to explain recursion. Our experience is that people stare at recursive programs for a long time without understanding how they work. Then, one day, they suddenly get it—and they don't understand why they ever thought it was difficult. Evidently, the key to understanding recursion is to begin by understanding recursion. The rest is easy.

Having written gen_sentence, read_grammar, and the associated auxiliary functions, we'll want to use them:

int main() {
    // generate the sentence
    vector<string> sentence = gen_sentence(read_grammar(cin));

    // write the first word, if any
    vector<string>::const_iterator it = sentence.begin();
    if (!sentence.empty()) {
        cout << *it;
        ++it;
    }

    // write the rest of the words, each preceded by a space
    while (it != sentence.end()) {
        cout << " " << *it;
        ++it;
    }

    cout << endl;
    return 0;
}

We read the grammar, generate a sentence from it, and then write the sentence a word at a time. The only even minor complexity is that we put a space in front of the second and subsequent words of the sentence.
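The same output logic can be tried without any input by writing to an arbitrary stream. The following is a sketch under that assumption; write_words, join_words, and join_demo are hypothetical names that do not appear in the program.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

using std::ostream;
using std::ostringstream;
using std::string;
using std::vector;

// Hypothetical helper: the output loop from main, writing to any ostream.
void write_words(ostream& out, const vector<string>& words)
{
    // write the first word, if any
    vector<string>::const_iterator it = words.begin();
    if (it != words.end()) {
        out << *it;
        ++it;
    }

    // write the rest of the words, each preceded by a space
    while (it != words.end()) {
        out << " " << *it;
        ++it;
    }
}

// Hypothetical convenience wrapper: collects the output in a string.
string join_words(const vector<string>& words)
{
    ostringstream out;
    write_words(out, words);
    return out.str();
}

// Illustrative check: no leading or trailing space is produced.
void join_demo()
{
    vector<string> words;
    assert(join_words(words) == "");
    words.push_back("the");
    assert(join_words(words) == "the");
    words.push_back("cat");
    assert(join_words(words) == "the cat");
}
```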