Stationary values of many-variable functions

The idea of the stationary points of a function of just one variable has already been discussed in subsection 2.1.8. We recall that the function f(x) has a stationary point at x = x_0 if its gradient df/dx is zero at that point. A function may have any number of stationary points, and their nature, i.e. whether they are maxima, minima or stationary points of inflection, is determined by the value of the second derivative at the point. A stationary point is

(i) a minimum if d²f/dx² > 0;

(ii) a maximum if d²f/dx² < 0;

(iii) a stationary point of inflection if d²f/dx² = 0 and changes sign through the point.
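The three criteria above can be sketched in a few lines of Python. This is only an illustration: the helper name `classify_1d`, the step `h` and the test functions x³ − 3x, x³ and x⁴ are assumptions introduced here, not part of the text. The caller supplies the second derivative as a function, and the sign change in case (iii) is probed by evaluating it a small distance either side of the point.

```python
def classify_1d(d2f, x0, h=1e-3):
    """Classify a stationary point x0 from the second derivative d2f (a callable)."""
    s = d2f(x0)
    if s > 0:
        return "minimum"
    if s < 0:
        return "maximum"
    # Second derivative vanishes: check whether it changes sign through x0.
    left, right = d2f(x0 - h), d2f(x0 + h)
    if left * right < 0:
        return "stationary point of inflection"
    return "undetermined by this test"


# Hypothetical examples:
# f(x) = x**3 - 3x has f'' = 6x and stationary points at x = +1 and x = -1.
at_plus_one = classify_1d(lambda x: 6 * x, 1.0)
at_minus_one = classify_1d(lambda x: 6 * x, -1.0)
# f(x) = x**3 also has f'' = 6x, with a stationary point at x = 0.
cubic_origin = classify_1d(lambda x: 6 * x, 0.0)
# f(x) = x**4 has f'' = 12x**2, which vanishes at 0 without changing sign.
quartic_origin = classify_1d(lambda x: 12 * x**2, 0.0)
```

The last case falls outside criteria (i) to (iii), so the sketch reports it as undetermined rather than guessing.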

We now consider the stationary points of functions of more than one variable; we will see that partial differential analysis is ideally suited to the determination of the position and nature of such points. It is helpful to consider first the case of a function of just two variables but, even in this case, the general situation is more complex than that for a function of one variable, as can be seen from figure 5.2.

This figure shows part of a three-dimensional model of a function f(x, y). At positions P and B there are a peak and a bowl respectively or, more mathematically, a local maximum and a local minimum. At position S the gradient in any direction is zero but the situation is complicated, since a section parallel to the plane x = 0 would show a maximum, but one parallel to the plane y = 0 would show a minimum. A point such as S is known as a saddle point. The orientation of the 'saddle' in the xy-plane is irrelevant; it is as shown in the figure solely for ease of discussion. For any saddle point the function increases in some directions away from the point but decreases in other directions.

5.8 STATIONARY VALUES OF MANY-VARIABLE FUNCTIONS

Figure 5.2 Stationary points of a function of two variables. A minimum occurs at B, a maximum at P and a saddle point at S.

For functions of two variables, such as the one shown, it should be clear that a necessary condition for a stationary point (maximum, minimum or saddle point) to occur is that

$$\frac{\partial f}{\partial x} = 0 \quad\text{and}\quad \frac{\partial f}{\partial y} = 0. \tag{5.21}$$

The vanishing of the partial derivatives in directions parallel to the axes is enough to ensure that the partial derivative in any arbitrary direction is also zero. The latter can be considered as the superposition of two contributions, one along each axis; since both contributions are zero, so is the partial derivative in the arbitrary direction. This may be made more precise by considering the total differential

$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy.$$

Using (5.21) we see that, although the infinitesimal changes dx and dy can be chosen independently, the resulting infinitesimal change df in the value of the function is always zero at a stationary point.
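The statement that the rate of change vanishes in every direction at a stationary point can be checked numerically. The following is a minimal sketch under stated assumptions: the function x² − y² (which has a saddle at the origin), the helper `directional_derivative` and the step `h` are all illustrative choices made here, not taken from the text.

```python
import math


def f(x, y):
    # Illustrative function (an assumption): both partial derivatives vanish at the origin.
    return x**2 - y**2


def directional_derivative(func, x0, y0, theta, h=1e-6):
    """Central-difference estimate of the rate of change of func along angle theta."""
    dx, dy = h * math.cos(theta), h * math.sin(theta)
    return (func(x0 + dx, y0 + dy) - func(x0 - dx, y0 - dy)) / (2 * h)


# Rate of change at the stationary point (0, 0) in twelve evenly spaced directions.
slopes = [directional_derivative(f, 0.0, 0.0, k * math.pi / 6) for k in range(12)]

# For contrast, the rate of change along the x-axis at (1, 0), which is not stationary.
slope_elsewhere = directional_derivative(f, 1.0, 0.0, 0.0)
```

Every entry of `slopes` is zero to within rounding, whereas `slope_elsewhere` is close to ∂f/∂x = 2, as expected away from a stationary point.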

We now turn our attention to determining the nature of a stationary point of a function of two variables, i.e. whether it is a maximum, a minimum or a saddle point. By analogy with the one-variable case we see that ∂²f/∂x² and ∂²f/∂y² must both be positive for a minimum and both be negative for a maximum. However, these are not sufficient conditions since they could also be obeyed at complicated saddle points. What is important for a minimum (or maximum) is that the second partial derivative must be positive (or negative) in all directions, not just in the x- and y-directions.

To establish just what constitutes sufficient conditions we first note that, since f is a function of two variables and ∂f/∂x = ∂f/∂y = 0, a Taylor expansion of the type (5.18) about the stationary point yields

$$f(x, y) - f(x_0, y_0) \approx \frac{1}{2!}\left[(\Delta x)^2 f_{xx} + 2\,\Delta x\,\Delta y\,f_{xy} + (\Delta y)^2 f_{yy}\right],$$

where ∆x = x − x_0 and ∆y = y − y_0 and where the partial derivatives have been written in more compact notation. Rearranging the contents of the bracket as the weighted sum of two squares, we find

$$f(x, y) - f(x_0, y_0) \approx \frac{1}{2}\left[f_{xx}\left(\Delta x + \frac{f_{xy}\,\Delta y}{f_{xx}}\right)^{2} + (\Delta y)^2\left(f_{yy} - \frac{f_{xy}^2}{f_{xx}}\right)\right]. \tag{5.22}$$

For a minimum, we require (5.22) to be positive for all ∆x and ∆y, and hence f_xx > 0 and f_yy − (f_xy²/f_xx) > 0. Given the first constraint, the second can be written f_xx f_yy > f_xy². Similarly for a maximum we require (5.22) to be negative, and hence f_xx < 0 and f_xx f_yy > f_xy². For minima and maxima, symmetry requires that f_yy obeys the same criteria as f_xx. When (5.22) is negative (or zero) for some values of ∆x and ∆y but positive (or zero) for others, we have a saddle point. In this case f_xx f_yy < f_xy². In summary, all stationary points have f_x = f_y = 0 and they may be classified further as

(i) minima if both f_xx and f_yy are positive and f_xy² < f_xx f_yy,

(ii) maxima if both f_xx and f_yy are negative and f_xy² < f_xx f_yy,

(iii) saddle points if f_xx and f_yy have opposite signs or f_xy² > f_xx f_yy.
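The classification just summarised can be written as a short Python sketch. The function name `classify_2d` and the sample functions in the comments are assumptions chosen for illustration. Note that when f_xx f_yy > f_xy², the two derivatives f_xx and f_yy necessarily share the same non-zero sign, so testing the sign of f_xx alone is enough in that branch.

```python
def classify_2d(fxx, fyy, fxy):
    """Classify a stationary point from its second partial derivatives."""
    disc = fxx * fyy - fxy**2
    if disc > 0:
        # fxx and fyy share a sign here, so checking fxx suffices.
        return "minimum" if fxx > 0 else "maximum"
    if disc < 0:
        return "saddle point"
    # fxy**2 == fxx*fyy: the second-order terms do not decide the matter.
    return "undetermined"


# Hypothetical examples, each with a stationary point at the origin:
bowl = classify_2d(2.0, 2.0, 0.0)       # x**2 + y**2
peak = classify_2d(-2.0, -2.0, 0.0)     # -(x**2 + y**2)
saddle = classify_2d(2.0, -2.0, 0.0)    # x**2 - y**2
cross = classify_2d(0.0, 0.0, 1.0)      # x*y
flat = classify_2d(0.0, 0.0, 0.0)       # e.g. x**4 + y**4
```

The `cross` case shows why criterion (iii) carries an "or": for xy both f_xx and f_yy vanish, yet f_xy² > f_xx f_yy and the point is a saddle.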

Note, however, that if f_xy² = f_xx f_yy then f(x, y) − f(x_0, y_0) can be written in one of the four forms

$$\pm\frac{1}{2}\left(\Delta x\,|f_{xx}|^{1/2} \pm \Delta y\,|f_{yy}|^{1/2}\right)^{2}.$$

For some choice of the ratio ∆y/∆x this expression has zero value, showing that, for a displacement from the stationary point in this particular direction, f(x_0 + ∆x, y_0 + ∆y) does not differ from f(x_0, y_0) to second order in ∆x and ∆y; in such situations further investigation is required. In particular, if f_xx, f_yy and f_xy are all zero then the Taylor expansion has to be taken to a higher order. As examples, such extended investigations would show that the function f(x, y) = x⁴ + y⁴ has a minimum at the origin but that g(x, y) = x⁴ + y³ has a saddle point there.
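A rough numerical probe of these two examples is sketched below. It is not a substitute for the higher-order Taylor analysis: it merely samples each function on a small circle around the origin and records the smallest and largest change in value. The helper `extremes_on_circle`, the radius and the number of samples are assumptions made for this illustration.

```python
import math


def extremes_on_circle(func, radius=1e-2, samples=360):
    """Return (min, max) of func(x, y) - func(0, 0) sampled on a small circle."""
    base = func(0.0, 0.0)
    values = []
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        values.append(func(x, y) - base)
    return min(values), max(values)


def f(x, y):
    return x**4 + y**4


def g(x, y):
    return x**4 + y**3


f_lo, f_hi = extremes_on_circle(f)
g_lo, g_hi = extremes_on_circle(g)

# f rises in every sampled direction, consistent with a minimum;
# g rises in some directions and falls in others, consistent with a saddle.
f_looks_like_minimum = f_lo > 0
g_looks_like_saddle = g_lo < 0 < g_hi
```

Sampling can only suggest the nature of the point; a function that changes sign only very close to the stationary point, or in a narrow range of directions, could be missed at a given radius and resolution.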


Show that the function f(x, y) = x³ exp(−x² − y²) has a maximum at the point (√(3/2), 0), a minimum at (−√(3/2), 0) and a stationary point at the origin whose nature cannot be determined by the above procedures.

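Setting the first two partial derivatives to zero to locate the stationary points, we find

$$\frac{\partial f}{\partial x} = (3x^2 - 2x^4)\exp(-x^2 - y^2) = 0, \tag{5.23}$$

$$\frac{\partial f}{\partial y} = -2yx^3\exp(-x^2 - y^2) = 0. \tag{5.24}$$

For (5.24) to be satisfied we require x = 0 or y = 0 and for (5.23) to be satisfied we require x = 0 or x = ±√(3/2). Hence the stationary points are at (0, 0), (√(3/2), 0) and (−√(3/2), 0). (Strictly, both equations are satisfied at every point on the line x = 0; the origin is the one considered here.)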

We now find the second partial derivatives:

$$f_{xx} = (4x^5 - 14x^3 + 6x)\exp(-x^2 - y^2),$$

$$f_{yy} = x^3(4y^2 - 2)\exp(-x^2 - y^2),$$

$$f_{xy} = 2x^2y(2x^2 - 3)\exp(-x^2 - y^2).$$

We then substitute the pairs of values of x and y for each stationary point and find that at (0, 0)

$$f_{xx} = 0, \quad f_{yy} = 0, \quad f_{xy} = 0$$

and at (±√(3/2), 0)

$$f_{xx} = \mp 6\sqrt{3/2}\,\exp(-3/2), \quad f_{yy} = \mp 3\sqrt{3/2}\,\exp(-3/2), \quad f_{xy} = 0.$$

Hence, applying criteria (i)–(iii) above, we find that (0, 0) is an undetermined stationary point, (√(3/2), 0) is a maximum and (−√(3/2), 0) is a minimum. The function is shown in figure 5.3.
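The arithmetic of this worked example can be re-checked with a short Python sketch. The second derivatives are coded exactly as given in the text; the helper names `second_derivatives` and `classify` are assumptions introduced here, and `classify` simply applies the second-derivative criteria for functions of two variables.

```python
import math


def second_derivatives(x, y):
    """Second partial derivatives of f(x, y) = x**3 * exp(-x**2 - y**2)."""
    e = math.exp(-x**2 - y**2)
    fxx = (4 * x**5 - 14 * x**3 + 6 * x) * e
    fyy = x**3 * (4 * y**2 - 2) * e
    fxy = 2 * x**2 * y * (2 * x**2 - 3) * e
    return fxx, fyy, fxy


def classify(fxx, fyy, fxy):
    """Second-derivative test for a stationary point of a two-variable function."""
    disc = fxx * fyy - fxy**2
    if disc > 0:
        return "minimum" if fxx > 0 else "maximum"
    if disc < 0:
        return "saddle point"
    return "undetermined"


x_s = math.sqrt(1.5)  # the value sqrt(3/2) appearing in the text
points = [(0.0, 0.0), (x_s, 0.0), (-x_s, 0.0)]
results = {pt: classify(*second_derivatives(*pt)) for pt in points}

# Compare the numerical f_xx and f_yy at (sqrt(3/2), 0) with the closed forms above.
fxx_num, fyy_num, fxy_num = second_derivatives(x_s, 0.0)
fxx_closed = -6 * x_s * math.exp(-1.5)
fyy_closed = -3 * x_s * math.exp(-1.5)
```

The dictionary `results` reproduces the conclusions of the example, and the numerical values agree with the closed forms to within rounding.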

Determining the nature of stationary points for functions of a general number of variables is considerably more difficult and requires a knowledge of the eigenvectors and eigenvalues of matrices. Although these are not discussed until chapter 8, we present the analysis here for completeness. The remainder of this section can therefore be omitted on a first reading.

For a function of n real variables, f(x_1, x_2, ..., x_n), we require that, at all stationary points,

$$\frac{\partial f}{\partial x_i} = 0 \quad\text{for all } x_i.$$

In order to determine the nature of a stationary point, we must expand the function as a Taylor series about the point. Recalling the Taylor expansion (5.20) for a function of n variables, we see that

$$\Delta f = f(\mathbf{x}) - f(\mathbf{x}_0) \approx \frac{1}{2}\sum_i\sum_j \frac{\partial^2 f}{\partial x_i\,\partial x_j}\,\Delta x_i\,\Delta x_j. \tag{5.25}$$

Figure 5.3 The function f(x, y) = x³ exp(−x² − y²), plotted against x and y with its maximum and minimum marked.

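If we define the matrix M to have elements given by

$$M_{ij} = \frac{\partial^2 f}{\partial x_i\,\partial x_j},$$

then we can rewrite (5.25) as

$$\Delta f = \tfrac{1}{2}\,\Delta\mathbf{x}^{\mathrm{T}} M\,\Delta\mathbf{x}, \tag{5.26}$$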

where ∆x is the column vector with the ∆x_i as its components and ∆x^T is its transpose. Since M is real and symmetric it has n real eigenvalues λ_r and n orthogonal eigenvectors e_r, which after suitable normalisation satisfy

$$M\mathbf{e}_r = \lambda_r\mathbf{e}_r, \qquad \mathbf{e}_r^{\mathrm{T}}\mathbf{e}_s = \delta_{rs},$$

where the Kronecker delta, written δ_rs, equals unity for r = s and equals zero otherwise. These eigenvectors form a basis set for the n-dimensional space and we can therefore expand ∆x in terms of them, obtaining

$$\Delta\mathbf{x} = \sum_r a_r\,\mathbf{e}_r,$$
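As a numerical illustration of (5.26) and of the eigenvector expansion, the sketch below treats the case n = 2 using only the Python standard library. Everything specific in it is an assumption made for illustration: the matrix entries, the displacement `dx`, and the helper `eigen_symmetric_2x2`, which uses the closed-form eigenvalues of a real symmetric 2 × 2 matrix. The coefficients are taken as a_r = e_r^T ∆x, which follows from the orthonormality relation above.

```python
import math


def eigen_symmetric_2x2(a, b, c):
    """Eigenvalues and unit eigenvectors of the real symmetric matrix [[a, b], [b, c]]."""
    if b == 0.0:
        # Already diagonal: the coordinate axes are the eigenvectors.
        return [(a, (1.0, 0.0)), (c, (0.0, 1.0))]
    mean = 0.5 * (a + c)
    spread = math.hypot(0.5 * (a - c), b)
    pairs = []
    for lam in (mean + spread, mean - spread):
        vx, vy = lam - c, b  # unnormalised eigenvector for eigenvalue lam
        norm = math.hypot(vx, vy)
        pairs.append((lam, (vx / norm, vy / norm)))
    return pairs


# Hypothetical matrix of second derivatives M = [[2, 1], [1, 3]] and displacement.
a, b, c = 2.0, 1.0, 3.0
dx = (0.3, -0.2)

pairs = eigen_symmetric_2x2(a, b, c)

# Expansion coefficients a_r = e_r . dx, then rebuild dx as the sum of a_r * e_r.
coeffs = [e[0] * dx[0] + e[1] * dx[1] for _, e in pairs]
rebuilt = (
    sum(ar * e[0] for ar, (_, e) in zip(coeffs, pairs)),
    sum(ar * e[1] for ar, (_, e) in zip(coeffs, pairs)),
)

# The quadratic form (5.26) evaluated directly for this displacement.
delta_f = 0.5 * (a * dx[0] ** 2 + 2 * b * dx[0] * dx[1] + c * dx[1] ** 2)
```

For larger n a numerical linear algebra library would normally be used instead of closed-form expressions; the 2 × 2 case is chosen here only so that the sketch stays self-contained.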
