Don’t try to return a reference when you must return an object

release, because its internal implementation may change. Things can even change when you switch to a different C++ implementation. As I write this, for example, some implementations of the standard library’s string type are seven times as big as others.

In general, the only types for which you can reasonably assume that pass-by-value is inexpensive are built-in types and STL iterator and function object types. For everything else, follow the advice of this Item and prefer pass-by-reference-to-const over pass-by-value.

Things to Remember

✦ Prefer pass-by-reference-to-const over pass-by-value. It’s typically more efficient and it avoids the slicing problem.

✦ The rule doesn’t apply to built-in types and STL iterator and function object types. For them, pass-by-value is usually appropriate.
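To make the slicing problem concrete, here is a minimal sketch. The classes Window and FancyWindow and the functions kindByValue and kindByRef are hypothetical names chosen only for illustration; the point is simply what happens to a derived object under each parameter-passing convention.

```cpp
#include <cassert>
#include <string>

// Hypothetical classes, for illustration only.
class Window {
public:
    virtual ~Window() {}
    virtual std::string kind() const { return "Window"; }
};

class FancyWindow : public Window {
public:
    virtual std::string kind() const { return "FancyWindow"; }
};

// Pass-by-value: the parameter is a new Window copy-constructed from the
// argument, so any derived part of the argument is sliced off.
std::string kindByValue(Window w) { return w.kind(); }

// Pass-by-reference-to-const: no copy is made, so the dynamic type survives.
std::string kindByRef(const Window& w) { return w.kind(); }
```

Handing a FancyWindow to kindByValue yields "Window", while kindByRef yields "FancyWindow", which is the behavior the first bullet above describes.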

Item 21: Don’t try to return a reference when you must return an object

to pay for such an object if you don’t have to. So the question is this: do you have to pay?

Well, you don’t have to if you can return a reference instead. But remember that a reference is just a name, a name for some existing object. Whenever you see the declaration for a reference, you should immediately ask yourself what it is another name for, because it must be another name for something. In the case of operator*, if the function is to return a reference, it must return a reference to some Rational object that already exists and that contains the product of the two objects that are to be multiplied together.
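If the "just a name" idea seems abstract, a tiny sketch may help. The function referenceIsAlias and its variable names are made up for this illustration; it only checks that a reference and the object it names are one and the same thing.

```cpp
#include <cassert>

// Returns true if the reference shares the address of the object it names
// and a write through the reference is visible in that object.
bool referenceIsAlias() {
    int value = 42;
    int& alias = value;      // alias is another name for value
    alias = 7;               // writes through to value
    return &alias == &value && value == 7;
}
```

There is no second int here. That is why a function returning a reference must have some existing object to name.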

There is certainly no reason to expect that such an object exists prior to the call to operator*. That is, if you have

Rational a(1, 2);                  // a = 1/2
Rational b(3, 5);                  // b = 3/5
Rational c = a * b;                // c should be 3/10

it seems unreasonable to expect that there already happens to exist a rational number with the value three-tenths. No, if operator* is to return a reference to such a number, it must create that number object itself.

A function can create a new object in only two ways: on the stack or on the heap. Creation on the stack is accomplished by defining a local variable. Using that strategy, you might try to write operator* this way:

const Rational& operator*(const Rational& lhs,   // warning! bad code!
                          const Rational& rhs)
{
  Rational result(lhs.n * rhs.n, lhs.d * rhs.d);
  return result;
}

You can reject this approach out of hand, because your goal was to avoid a constructor call, and result will have to be constructed just like any other object. A more serious problem is that this function returns a reference to result, but result is a local object, and local objects are destroyed when the function exits. This version of operator*, then, doesn’t return a reference to a Rational. It returns a reference to an ex-Rational; a former Rational; the empty, stinking, rotting carcass of what used to be a Rational but is no longer, because it has been destroyed. Any caller so much as glancing at this function’s return value would instantly enter the realm of undefined behavior. The fact is, any function returning a reference to a local object is broken. (The same is true for any function returning a pointer to a local object.)
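Actually dereferencing such a reference would be undefined behavior, so the following sketch does not do that. Instead it uses a hypothetical instrumented type, Tracked, that counts live instances, to show that a local object really is gone by the time the caller regains control. The names Tracked and liveWhileInsideCallee are assumptions of this example.

```cpp
#include <cassert>

// Hypothetical instrumented type: counts how many instances are alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    Tracked(const Tracked&) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Creates a local object and reports how many Tracked objects exist
// while the function body is still running.
int liveWhileInsideCallee() {
    Tracked local;
    return Tracked::live;    // the int is copied out, then local is destroyed
}
```

The call reports one live object, yet Tracked::live is back to zero once the call has returned. A reference to local would therefore have nothing left to refer to.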

Let us consider, then, the possibility of constructing an object on the heap and returning a reference to it. Heap-based objects come into being through the use of new, so you might write a heap-based operator* like this:

const Rational& operator*(const Rational& lhs,   // warning! more bad
                          const Rational& rhs)   // code!
{
  Rational *result = new Rational(lhs.n * rhs.n, lhs.d * rhs.d);
  return *result;
}

Well, you still have to pay for a constructor call, because the memory allocated by new is initialized by calling an appropriate constructor, but now you have a different problem: who will apply delete to the object conjured up by your use of new?

Even if callers are conscientious and well intentioned, there’s not much they can do to prevent leaks in reasonable usage scenarios like this:

Rational w, x, y, z;
w = x * y * z;                     // same as operator*(operator*(x, y), z)

Here, there are two calls to operator* in the same statement, hence two uses of new that need to be undone with uses of delete. Yet there is no reasonable way for clients of operator* to make those calls, because there’s no reasonable way for them to get at the pointers hidden behind the references being returned from the calls to operator*. This is a guaranteed resource leak.
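One way to see the two hidden allocations is to count them. The sketch below is an instrumented stand-in, not the code above: it assumes a bare-bones Rational with public n and d, and it routes every allocation through a hypothetical registry (products) that owns the objects, so that the demonstration itself does not leak. The names heapdemo, products, and heapObjectsCreatedByChain are inventions of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

namespace heapdemo {

struct Rational {
    int n, d;
    Rational(int numerator = 0, int denominator = 1)
        : n(numerator), d(denominator) {}
};

// Instrumentation only (an assumption of this sketch): a registry that owns
// every heap-allocated product, so the demo can count them without leaking.
std::vector<std::unique_ptr<Rational>>& products() {
    static std::vector<std::unique_ptr<Rational>> registry;
    return registry;
}

// Same shape as the heap-based operator*: allocate, then return a reference.
const Rational& operator*(const Rational& lhs, const Rational& rhs) {
    products().push_back(
        std::unique_ptr<Rational>(new Rational(lhs.n * rhs.n, lhs.d * rhs.d)));
    return *products().back();
}

// Evaluates w = x * y * z and reports how many heap objects that single
// statement created. The calling code never sees a pointer to any of them.
std::size_t heapObjectsCreatedByChain() {
    const std::size_t before = products().size();
    Rational x(1, 2), y(3, 5), z(7, 9), w;
    w = x * y * z;
    assert(w.n == 21 && w.d == 90);
    return products().size() - before;
}

}  // namespace heapdemo
```

The statement creates two heap objects. Without the registry, which real client code would not have, the intermediate product of x and y would be unreachable as soon as the statement finished.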

But perhaps you notice that both the on-the-stack and on-the-heap approaches suffer from having to call a constructor for each result returned from operator*. Perhaps you recall that our initial goal was to avoid such constructor invocations. Perhaps you think you know a way to avoid all but one constructor call. Perhaps the following implementation occurs to you, an implementation based on operator* returning a reference to a static Rational object, one defined inside the function:

const Rational& operator*(const Rational& lhs,   // warning! yet more
                          const Rational& rhs)   // bad code!
{
  static Rational result;          // static object to which a
                                   // reference will be returned
  result = ... ;                   // multiply lhs by rhs and put the
                                   // product inside result
  return result;
}

Like all designs employing the use of static objects, this one immediately raises our thread-safety hackles, but that’s its more obvious weakness. To see its deeper flaw, consider this perfectly reasonable client code:

bool operator==(const Rational& lhs,             // an operator==
                const Rational& rhs);            // for Rationals

Rational a, b, c, d;
...
if ((a * b) == (c * d)) {
  // do whatever’s appropriate when the products are equal
} else {
  // do whatever’s appropriate when they’re not
}

Guess what? The expression ((a*b) == (c*d)) will always evaluate to true, regardless of the values of a, b, c, and d!

This revelation is easiest to understand when the code is rewritten in its equivalent functional form:

if (operator==(operator*(a, b), operator*(c, d)))

Notice that when operator== is called, there will already be two active calls to operator*, each of which will return a reference to the static Rational object inside operator*. Thus, operator== will be asked to compare the value of the static Rational object inside operator* with the value of the static Rational object inside operator*. It would be surprising indeed if they did not compare equal. Always.
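If you’d rather see it than take my word for it, here is a small sketch of the static-object design. It assumes a bare-bones Rational with public n and d, fills in the elided multiplication with the same n/d arithmetic the earlier versions use, and supplies a member-wise operator== (the text only declares one). The namespace staticdemo and the two helper functions are hypothetical.

```cpp
#include <cassert>

namespace staticdemo {

struct Rational {
    int n, d;
    Rational(int numerator = 0, int denominator = 1)
        : n(numerator), d(denominator) {}
};

// The static-object design: every call writes to, and returns a reference
// to, the same function-local static.
const Rational& operator*(const Rational& lhs, const Rational& rhs) {
    static Rational result;
    result = Rational(lhs.n * rhs.n, lhs.d * rhs.d);  // assumed arithmetic
    return result;
}

// Member-wise comparison (an assumption; the text only declares operator==).
bool operator==(const Rational& lhs, const Rational& rhs) {
    return lhs.n == rhs.n && lhs.d == rhs.d;
}

// Both operands of == end up naming the one static object.
bool productsCompareEqual(const Rational& a, const Rational& b,
                          const Rational& c, const Rational& d) {
    return (a * b) == (c * d);
}

// The two references returned by operator* have the same address.
bool productsShareOneObject(const Rational& a, const Rational& b,
                            const Rational& c, const Rational& d) {
    return &(a * b) == &(c * d);
}

}  // namespace staticdemo
```

With a = 1/2, b = 3/5, c = 7/9, and d = 2/3, the true products are 3/10 and 14/27, yet productsCompareEqual still returns true, because both sides of the comparison are the same object.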

This should be enough to convince you that returning a reference from a function like operator* is a waste of time, but some of you are now thinking, “Well, if one static isn’t enough, maybe a static array will do the trick....”

I can’t bring myself to dignify this design with example code, but I can sketch why the notion should cause you to blush in shame. First, you must choose n, the size of the array. If n is too small, you may run out of places to store function return values, in which case you’ll have gained nothing over the single-static design we just discredited. But if n is too big, you’ll decrease the performance of your program, because every object in the array will be constructed the first time the function is called. That will cost you n constructors and n destructors†, even if the function in question is called only once. If “optimization” is the process of improving software performance, this kind of thing should be called “pessimization.” Finally, think about how you’d put the values you need into the array’s objects and what it would cost you to do it. The most direct way to move a value between objects is via assignment, but what is the cost of an assignment? For many types, it’s about the same as a call to a destructor (to destroy the old value) plus a call to a constructor (to copy over the new value). But your goal is to avoid the costs of construction and destruction! Face it: this approach just isn’t going to pan out. (No, using a vector instead of an array won’t improve matters much.)

† The destructors will be called once at program shutdown.

The right way to write a function that must return a new object is to have that function return a new object. For Rational’s operator*, that means either the following code or something essentially equivalent:

inline const Rational operator*(const Rational& lhs, const Rational& rhs)
{
  return Rational(lhs.n * rhs.n, lhs.d * rhs.d);
}

Sure, you may incur the cost of constructing and destructing operator*’s return value, but in the long run, that’s a small price to pay for correct behavior. Besides, the bill that so terrifies you may never arrive. Like all programming languages, C++ allows compiler implementers to apply optimizations to improve the performance of the generated code without changing its observable behavior, and it turns out that in some cases, construction and destruction of operator*’s return value can be safely eliminated. When compilers take advantage of that fact (and compilers often do), your program continues to behave the way it’s supposed to, just faster than you expected.

It all boils down to this: when deciding between returning a reference and returning an object, your job is to make the choice that offers correct behavior. Let your compiler vendors wrestle with figuring out how to make that choice as inexpensive as possible.
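For completeness, here is a sketch of the return-by-value design in a form you can compile. The class layout is an assumption: the text shows only that operator* reads n and d, so this version makes operator* a friend and adds two hypothetical accessors, numerator and denominator, plus a member-wise operator== for checking results.

```cpp
#include <cassert>

namespace valuedemo {

class Rational {
public:
    Rational(int numerator = 0, int denominator = 1)
        : n(numerator), d(denominator) {}

    int numerator() const { return n; }      // hypothetical accessors
    int denominator() const { return d; }

private:
    int n, d;
    friend const Rational operator*(const Rational& lhs, const Rational& rhs);
};

// Returns a new object by value: nothing dangles, leaks, or is shared.
inline const Rational operator*(const Rational& lhs, const Rational& rhs) {
    return Rational(lhs.n * rhs.n, lhs.d * rhs.d);
}

// Member-wise comparison (an assumption; fractions are not reduced here).
inline bool operator==(const Rational& lhs, const Rational& rhs) {
    return lhs.numerator() == rhs.numerator() &&
           lhs.denominator() == rhs.denominator();
}

}  // namespace valuedemo
```

With this version, 1/2 times 3/5 compares equal to 3/10, chained multiplication works, and the products 3/10 and 14/27 compare unequal, as they should. Under C++17 and later, initializing an object directly from the returned temporary, as in Rational c = a * b;, is required to skip the copy; earlier standards merely permit compilers to do so.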

Things to Remember

✦ Never return a pointer or reference to a local stack object, a reference to a heap-allocated object, or a pointer or reference to a local static object if there is a chance that more than one such object will be needed. (Item 4 provides an example of a design where returning a reference to a local static is reasonable, at least in single-threaded environments.)
