5.2.4 The meaning of students.erase(students.begin() + i)

Now that we understand more about iterators, we can see the real point of

    students.erase(students.begin() + i);

in the program in §5.1.1/77. We've already seen that students.begin() is an iterator that refers to the initial element of students, and that students.begin() + i refers to the ith element of students. What is important to realize is that this latter expression gets its meaning from the definition of + on the types of students.begin() and i. In other words, the iterator and index types determine the meaning of + in this expression.

If students were a container that did not support random-access indexing, it is likely that students.begin() would be of a type that did not have + defined, in which case the expression students.begin() + i would not compile. In effect, such a container would be able to shut off random access to its elements, while still allowing sequential access through iterators.
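To make the distinction concrete, here is a small sketch that is not part of the original program. It uses plain int elements in place of Student_info, the name erase_at is our own, and it relies on std::next from <iterator> (C++11), which we have not discussed. With a vector, begin() + i compiles because vector iterators support random access. With a list, the same expression would be rejected, so the sketch walks forward one element at a time instead.

```cpp
#include <iterator>
#include <list>
#include <vector>

// Erase the element at position i from a vector. begin() + i is defined
// because vector iterators support random access.
std::vector<int> erase_at(std::vector<int> v, std::vector<int>::size_type i) {
    v.erase(v.begin() + i);
    return v;
}

// The same job for a list. Writing l.begin() + i here would not compile,
// because list iterators do not define +. std::next (an assumption of this
// sketch, C++11) advances the iterator sequentially, one element at a time.
std::list<int> erase_at(std::list<int> l, std::list<int>::size_type i) {
    l.erase(std::next(l.begin(),
                      static_cast<std::list<int>::difference_type>(i)));
    return l;
}
```

Both functions assume that i is a valid position in the container; neither checks it.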

5.3 Using iterators instead of indices

Using what we have learned about iterators, and one more new fact, we can reimplement the extract_fails function in a way that does not use indexing at all:

    // version 3: iterators but no indexing; still potentially slow
    vector<Student_info> extract_fails(vector<Student_info>& students) {
        vector<Student_info> fail;
        vector<Student_info>::iterator iter = students.begin();

        while (iter != students.end()) {
            if (fgrade(*iter)) {
                fail.push_back(*iter);
                iter = students.erase(iter);
            } else
                ++iter;
        }
        return fail;
    }

We start by defining fail as we did before. Next, we define the iterator, named iter, that we'll use—in place of an index—to look at the elements in students. Note that we give it type iterator instead of const_iterator:

vector<Student_info>::iterator iter = students.begin();

because we intend to use it to modify students, which we do in the call to erase. We initialize iter to denote the first element in students.

We continue with a while statement that will look at every element of students. Remember that iter is an iterator that denotes an element in the container, so *iter is the value of that element. To decide whether a student passed or failed, we pass that value to fgrade. Similarly, we changed the code that copies the failing records into fail by writing

    fail.push_back(*iter);    // dereference the iterator to get the element

instead of

    fail.push_back(students[i]);    // index into the vector to get the element

The erase has gotten simpler, because we now have an iterator to pass directly:

iter = students.erase(iter);

We no longer have to calculate an iterator by adding the index i to students.begin().

The new fact that we used here is easy to overlook, but crucially important: We now assign to iter the value that erase returns. Why?

A bit of thinking should convince us that removing the element that iter denoted must invalidate that iterator. After we have called students.erase(iter), we know that iter can no longer refer to the same element because that element is gone. In fact, calling erase on a vector invalidates all iterators that refer to elements after the one that was just erased. If you look back at the diagram in §5.1.1/78, it should be obvious that after we erase the element marked FAIL, that element is gone, and each of the elements after it has moved. If the elements have moved, any iterators referring to them must be meaningless as well.

Fortunately, erase returns an iterator that is positioned on the element that follows the one that we just erased. Therefore, executing

iter = students.erase(iter);

makes iter refer to the element after the erasure, which is exactly what we need.
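The following sketch isolates that behavior on a vector of int values. It is an illustration under our own assumptions: the function name value_after_erase and the use of -1 as a "no such element" marker are hypothetical, chosen only to keep the example short.

```cpp
#include <vector>

// Erase the first element equal to target, then report the value of the
// element that the iterator returned by erase refers to. Returns -1 (a
// hypothetical marker) if target is absent or if the erased element was
// the last one, in which case erase returns v.end().
int value_after_erase(std::vector<int>& v, int target) {
    std::vector<int>::iterator iter = v.begin();
    while (iter != v.end() && *iter != target)
        ++iter;
    if (iter == v.end())
        return -1;               // nothing to erase
    iter = v.erase(iter);        // the old iter is invalid; keep the new one
    return iter == v.end() ? -1 : *iter;
}
```

For a vector holding 1, 2, 3, 4, erasing 2 leaves iter on the element 3, which is the element that followed the erased one.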

If we're dealing with an element that did not represent a failing grade, then we still need to increment iter so that we'll be positioned on the next element for the next trip through the loop. We do so by incrementing iter in the else branch.

Incidentally, as in §5.1.1/78, we might be tempted to optimize the loop by saving the value of students.end() to avoid evaluating it each time through the while. In other words, we might be tempted to change

while (iter != students.end())

to

    // this code will fail because of misguided optimization
    vector<Student_info>::iterator iter = students.begin(),
                                   end_iter = students.end();
    while (iter != end_iter) {
        // . . .
    }

This loop will almost surely fail at run time. Why?

The reason is that if we ever execute students.erase, doing so will invalidate every iterator after the point erased, including end_iter! Therefore, it is essential that we call students.end each time through the loop, just as it was essential in §5.1.1/78 to call students.size each time through the loop.
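As a minimal sketch of the safe pattern, the function below removes negative values from a vector of int while calling v.end() on every trip through the loop. The function name remove_negatives and the choice of int elements are our own assumptions, made so that the example stands alone.

```cpp
#include <vector>

// Remove every negative value from v and return how many were removed.
// v.end() is evaluated on every trip through the loop, so the comparison
// always uses an iterator that is valid for the vector's current contents.
std::vector<int>::size_type remove_negatives(std::vector<int>& v) {
    std::vector<int>::size_type removed = 0;
    std::vector<int>::iterator iter = v.begin();
    while (iter != v.end()) {       // do not cache v.end() outside the loop
        if (*iter < 0) {
            iter = v.erase(iter);   // invalidates iterators at and after iter
            ++removed;
        } else {
            ++iter;
        }
    }
    return removed;
}
```

Had we saved v.end() in a variable before the loop, the first call to erase would have left that saved iterator invalid, and the loop condition would no longer be meaningful.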

5.4 Rethinking our data structure for better performance

For small inputs, our implementation works fine. However, as we said in §5.1.1/77, as our input grows, the performance degrades substantially. Why?

Let's think again about using erase to remove an element from a vector. The library optimizes the vector data structure for fast access to arbitrary elements. Moreover, we saw in §3.2.3/48 that vectors perform well when growing a vector one element at a time, as long as elements are added at the end of the vector.

Inserting or removing elements from the interior of a vector is another story. Doing so requires that all elements after the one inserted or removed be moved in order to preserve fast random access. Moving elements means that the run time of our new code might be as slow as quadratic in the number of elements in the vector. For small inputs, we might not notice, but each time the size of our input doubles, the execution time can quadruple. If we ask our program to deal with all the students in a school rather than just the students in a class, even a fast computer will take too long to execute the program.
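To see where the quadratic behavior comes from, consider the worst case in which every record fails, so that each call to erase removes the first remaining element. The sketch below counts element moves under a simplified cost model that we are assuming for illustration: erasing the first of k elements moves the other k - 1 elements, one move each. The function name is hypothetical.

```cpp
#include <cstddef>

// Number of element moves needed to erase all n elements of a vector, one
// at a time, always from the front. Assumed cost model: erasing the first
// of k elements moves the remaining k - 1 elements down by one position.
std::size_t moves_to_erase_all_from_front(std::size_t n) {
    std::size_t moves = 0;
    for (std::size_t k = n; k > 0; --k)
        moves += k - 1;
    return moves;   // equals n * (n - 1) / 2
}
```

Under this model, 1,000 elements need 499,500 moves and 2,000 elements need 1,999,000 moves, which is roughly four times as many for twice the input.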

If we want to do better, we need a data structure that lets us insert and delete elements efficiently anywhere in the container. Such a container is unlikely to support random access through indices. Even if it did so, integer indices would be less than useful, because inserting and deleting elements would have to change the indices of other elements. Now that we know how to use iterators, we have a way of dealing with such a data structure that does not provide index operations.

5.5 The list type

By rewriting the code to use iterators, we have removed our reliance on indices. We now need to reimplement our program using a data structure that will let us delete elements efficiently from within the container.

The need to insert or delete elements inside a data structure is pretty common. Not surprisingly, the library provides a type, named list and defined in the <list> header, that is optimized for this kind of access.

Just as vectors are optimized for fast random access, lists are optimized for fast insertion and deletion anywhere within the container. Because lists have to maintain a more complicated structure, they are slower than vectors if the container is accessed only sequentially. That is, if the container grows and shrinks only or primarily from the end, a vector will outperform a list. However, if a program deletes many elements from the middle of the container—as our program does—then lists will be faster for large inputs, becoming much faster as the inputs grow.

Like a vector, a list is a container that can hold objects of most any type. As we'll see, lists and vectors share many operations. As a result, we can often translate programs that operate on vectors into programs that operate on lists, and vice versa. Often, all that changes is our variables' types.

One key operation that vectors support, but lists do not, is indexing. As we just saw, we can write a version of extract_fails that uses vectors to extract records that correspond to failing students, but that uses iterators instead of indices. It turns out that we can transform that version of extract_fails to use lists instead of vectors merely by changing the appropriate types:

    // version 4: use list instead of vector
    list<Student_info> extract_fails(list<Student_info>& students) {
        list<Student_info> fail;
        list<Student_info>::iterator iter = students.begin();

        while (iter != students.end()) {
            if (fgrade(*iter)) {
                fail.push_back(*iter);
                iter = students.erase(iter);
            } else
                ++iter;
        }
        return fail;
    }

If we compare this code with the version from §5.3/82, we see that the only change is to replace vector by list in the first four lines. So, for example, the return type and the parameter to the function are now list<Student_info>, as is the local container fail, into which we put the failing grades. Similarly, the type of the iterator is the one defined by the list class. Hence, we define iter as the iterator type that is a member of list<Student_info>. The list type is a template, so we must say what kind of object the list holds by naming that type inside angle brackets, just as we do when we define a vector.

There are no changes in the program's logic. Of course, our caller will now have to provide us with a list, and will get a list in return. Moreover, the details of how the library implements the operations are quite different, because this version operates on lists and the other ones operate on vectors. When we execute ++iter, we are doing whatever it means to advance the iterator to the next element in the list. Similarly,

iter = students.erase(iter);

calls the list version of erase and assigns the list iterator returned from erase into iter. The implementations of the increment and erase operations will surely differ from their vector counterparts.
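For readers who want to try version 4 in isolation, the sketch below supplies stand-ins for the pieces it depends on. These stand-ins are assumptions made for illustration, not the definitions used elsewhere in the program: Student_info is reduced to a name and an already-computed final grade, and fgrade simply treats any grade below 60 as failing.

```cpp
#include <list>
#include <string>

// Hypothetical stand-in for Student_info: only a name and an
// already-computed final grade.
struct Student_info {
    std::string name;
    double final_grade;
};

// Assumed failing threshold of 60, for illustration only.
bool fgrade(const Student_info& s) {
    return s.final_grade < 60;
}

// Same logic as version 4: move failing records from students into fail.
std::list<Student_info> extract_fails(std::list<Student_info>& students) {
    std::list<Student_info> fail;
    std::list<Student_info>::iterator iter = students.begin();
    while (iter != students.end()) {
        if (fgrade(*iter)) {
            fail.push_back(*iter);
            iter = students.erase(iter);   // list version of erase
        } else {
            ++iter;                        // list version of increment
        }
    }
    return fail;
}
```

After the call, students holds only the passing records and the returned list holds the failing ones, each in its original relative order.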