The idea of the stationary points of a function of just one variable has already been discussed in subsection 2.1.8. We recall that the function $f(x)$ has a stationary point at $x = x_0$ if its gradient $df/dx$ is zero at that point. A function may have any number of stationary points, and their nature, i.e. whether they are maxima, minima or stationary points of inflection, is determined by the value of the second derivative at the point. A stationary point is
(i) a minimum if $d^2f/dx^2 > 0$;
(ii) a maximum if $d^2f/dx^2 < 0$;
(iii) a stationary point of inflection if $d^2f/dx^2 = 0$ and changes sign through the point.
We now consider the stationary points of functions of more than one variable; we will see that partial differential analysis is ideally suited to the determination of the position and nature of such points. It is helpful to consider first the case of a function of just two variables but, even in this case, the general situation is more complex than that for a function of one variable, as can be seen from figure 5.2.
This figure shows part of a three-dimensional model of a function $f(x, y)$. At positions $P$ and $B$ there are a peak and a bowl respectively or, more mathematically, a local maximum and a local minimum. At position $S$ the gradient in any direction is zero but the situation is complicated, since a section parallel to the plane $x = 0$ would show a maximum, but one parallel to the plane $y = 0$ would show a minimum. A point such as $S$ is known as a saddle point. The orientation of the ‘saddle’ in the $xy$-plane is irrelevant; it is as shown in the figure solely for ease of discussion. For any saddle point the function increases in some directions away from the point but decreases in other directions.
5.8 STATIONARY VALUES OF MANY-VARIABLE FUNCTIONS
Figure 5.2 Stationary points of a function of two variables. A minimum occurs at $B$, a maximum at $P$ and a saddle point at $S$.
For functions of two variables, such as the one shown, it should be clear that a necessary condition for a stationary point (maximum, minimum or saddle point) to occur is that
\[
\frac{\partial f}{\partial x} = 0 \quad\text{and}\quad \frac{\partial f}{\partial y} = 0. \tag{5.21}
\]
The vanishing of the partial derivatives in directions parallel to the axes is enough to ensure that the partial derivative in any arbitrary direction is also zero. The latter can be considered as the superposition of two contributions, one along each axis; since both contributions are zero, so is the partial derivative in the arbitrary direction. This may be made more precise by considering the total differential
\[
df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy.
\]
Using (5.21) we see that, although the infinitesimal changes $dx$ and $dy$ can be chosen independently, the resulting infinitesimal change $df$ in the value of the function is always zero at a stationary point.
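The statement that vanishing partial derivatives along the axes imply a vanishing rate of change in every direction can be checked numerically. The following is a minimal sketch only: the test function `f`, the step size `h` and the helper names are illustrative assumptions and do not come from the text.

```python
import math

def f(x, y):
    # Illustrative function (an assumption): a bowl with its minimum at (1, -2).
    return (x - 1.0) ** 2 + 3.0 * (y + 2.0) ** 2

def partials(func, x, y, h=1e-6):
    """Central-difference estimates of the partial derivatives of func at (x, y)."""
    fx = (func(x + h, y) - func(x - h, y)) / (2.0 * h)
    fy = (func(x, y + h) - func(x, y - h)) / (2.0 * h)
    return fx, fy

def directional_derivative(func, x, y, theta, h=1e-6):
    """Rate of change of func along the unit vector (cos(theta), sin(theta)).

    This is the total differential df = fx dx + fy dy with
    dx = cos(theta) ds and dy = sin(theta) ds, divided by ds.
    """
    fx, fy = partials(func, x, y, h)
    return fx * math.cos(theta) + fy * math.sin(theta)

# At the stationary point both partial derivatives vanish, so the slope
# is (numerically) zero in each of twelve equally spaced directions.
x0, y0 = 1.0, -2.0
slopes = [directional_derivative(f, x0, y0, k * math.pi / 6.0) for k in range(12)]

# Away from the stationary point the slope is generally non-zero.
slope_elsewhere = directional_derivative(f, 2.0, -2.0, 0.0)
```

Here `slopes` should contain only values that are zero to within rounding error, whereas `slope_elsewhere` is close to the exact value $\partial f/\partial x = 2$ at the point $(2, -2)$.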
We now turn our attention to determining the nature of a stationary point of a function of two variables, i.e. whether it is a maximum, a minimum or a saddle point. By analogy with the one-variable case we see that $\partial^2 f/\partial x^2$ and $\partial^2 f/\partial y^2$ must both be positive for a minimum and both be negative for a maximum.
However, these are not sufficient conditions since they could also be obeyed at complicated saddle points. What is important for a minimum (or maximum) is that the second partial derivative must be positive (or negative) in all directions, not just in the $x$- and $y$-directions.
To establish just what constitutes sufficient conditions we first note that, since f is a function of two variables and ∂f/∂x = ∂f/∂y = 0, a Taylor expansion of the type (5.18) about the stationary point yields
\[
f(x, y) - f(x_0, y_0) \approx \frac{1}{2!}\left[(\Delta x)^2 f_{xx} + 2\,\Delta x\,\Delta y\, f_{xy} + (\Delta y)^2 f_{yy}\right],
\]
where $\Delta x = x - x_0$ and $\Delta y = y - y_0$ and where the partial derivatives have been written in more compact notation. Rearranging the contents of the bracket as the weighted sum of two squares, we find
\[
f(x, y) - f(x_0, y_0) \approx \frac{1}{2}\left[ f_{xx}\left(\Delta x + \frac{f_{xy}\,\Delta y}{f_{xx}}\right)^2 + (\Delta y)^2\left(f_{yy} - \frac{f_{xy}^2}{f_{xx}}\right)\right]. \tag{5.22}
\]
For a minimum, we require (5.22) to be positive for all $\Delta x$ and $\Delta y$, and hence $f_{xx} > 0$ and $f_{yy} - (f_{xy}^2/f_{xx}) > 0$. Given the first constraint, the second can be written $f_{xx}f_{yy} > f_{xy}^2$. Similarly for a maximum we require (5.22) to be negative, and hence $f_{xx} < 0$ and $f_{xx}f_{yy} > f_{xy}^2$. For minima and maxima, symmetry requires that $f_{yy}$ obeys the same criteria as $f_{xx}$. When (5.22) is negative (or zero) for some values of $\Delta x$ and $\Delta y$ but positive (or zero) for others, we have a saddle point. In this case $f_{xx}f_{yy} < f_{xy}^2$. In summary, all stationary points have $f_x = f_y = 0$ and they may be classified further as
(i) minima if both $f_{xx}$ and $f_{yy}$ are positive and $f_{xy}^2 < f_{xx}f_{yy}$;
(ii) maxima if both $f_{xx}$ and $f_{yy}$ are negative and $f_{xy}^2 < f_{xx}f_{yy}$;
(iii) saddle points if $f_{xx}$ and $f_{yy}$ have opposite signs or $f_{xy}^2 > f_{xx}f_{yy}$.
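Criteria (i)–(iii) translate directly into a short test on the three second derivatives. The sketch below is one possible arrangement; the function name, the returned labels and the tolerance `tol` (used to decide when a floating-point quantity counts as zero) are illustrative assumptions.

```python
def classify_stationary_point(fxx, fyy, fxy, tol=1e-12):
    """Classify a stationary point of f(x, y) from its second partial derivatives.

    The arguments are the values of f_xx, f_yy and f_xy at a point where
    f_x = f_y = 0 is already known to hold.
    """
    disc = fxx * fyy - fxy ** 2
    if disc > tol:
        # disc > 0 forces fxx and fyy to be non-zero and of the same sign,
        # so testing fxx alone distinguishes cases (i) and (ii).
        return "minimum" if fxx > 0 else "maximum"
    if disc < -tol:
        # Case (iii). Opposite signs of fxx and fyy give fxx*fyy < 0 <= fxy**2,
        # so that alternative is already covered by disc < 0.
        return "saddle point"
    # fxy**2 == fxx*fyy to within tol: the second-order test is inconclusive.
    return "undetermined"


# Simple quadratic surfaces, each with a stationary point at the origin.
bowl = classify_stationary_point(2.0, 2.0, 0.0)      # f = x^2 + y^2
peak = classify_stationary_point(-2.0, -2.0, 0.0)    # f = -x^2 - y^2
saddle = classify_stationary_point(2.0, -2.0, 0.0)   # f = x^2 - y^2
twisted = classify_stationary_point(0.0, 0.0, 1.0)   # f = x*y
flat = classify_stationary_point(0.0, 0.0, 0.0)      # e.g. f = x^4 + y^4
```

The fourth example, $f = xy$, has $f_{xx} = f_{yy} = 0$ but is still a saddle point, which illustrates why the sign of $f_{xx}f_{yy} - f_{xy}^2$, and not the signs of $f_{xx}$ and $f_{yy}$ alone, decides the matter.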
Note, however, that if $f_{xy}^2 = f_{xx}f_{yy}$ then $f(x, y) - f(x_0, y_0)$ can be written in one of the four forms
\[
\pm\frac{1}{2}\left(\Delta x\,|f_{xx}|^{1/2} \pm \Delta y\,|f_{yy}|^{1/2}\right)^2.
\]
For some choice of the ratio $\Delta y/\Delta x$ this expression has zero value, showing that, for a displacement from the stationary point in this particular direction, $f(x_0 + \Delta x, y_0 + \Delta y)$ does not differ from $f(x_0, y_0)$ to second order in $\Delta x$ and $\Delta y$; in such situations further investigation is required. In particular, if $f_{xx}$, $f_{yy}$ and $f_{xy}$ are all zero then the Taylor expansion has to be taken to a higher order. As examples, such extended investigations would show that the function $f(x, y) = x^4 + y^4$ has a minimum at the origin but that $g(x, y) = x^4 + y^3$ has a saddle point there.
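When the second-order test fails, a crude numerical probe can at least suggest the nature of the point before a higher-order expansion is attempted. The sketch below samples a function on a small circle about the origin and records whether it rises, falls, or does both; the helper name, the radius and the number of samples are illustrative assumptions, and sampling of this kind is evidence only, not a proof.

```python
import math

def sign_pattern(func, radius=1e-2, samples=360):
    """Report whether func rises and/or falls on a small circle about the origin.

    Returns (has_positive, has_negative), the signs taken by
    func(x, y) - func(0, 0) at equally spaced points of the circle.
    """
    base = func(0.0, 0.0)
    diffs = []
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        diffs.append(func(x, y) - base)
    return any(d > 0 for d in diffs), any(d < 0 for d in diffs)

def f(x, y):
    return x ** 4 + y ** 4

def g(x, y):
    return x ** 4 + y ** 3

pattern_f = sign_pattern(f)  # rises in every sampled direction
pattern_g = sign_pattern(g)  # rises in some directions, falls in others
```

For $f$ the function increases in every sampled direction, consistent with a minimum, whereas $g$ increases along the $x$-axis and decreases along the negative $y$-axis, consistent with a saddle point.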
Show that the function $f(x, y) = x^3\exp(-x^2 - y^2)$ has a maximum at the point $(\sqrt{3/2}, 0)$, a minimum at $(-\sqrt{3/2}, 0)$ and a stationary point at the origin whose nature cannot be determined by the above procedures.
Setting the first two partial derivatives to zero to locate the stationary points, we find
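\[
\frac{\partial f}{\partial x} = (3x^2 - 2x^4)\exp(-x^2 - y^2) = 0, \tag{5.23}
\]
\[
\frac{\partial f}{\partial y} = -2yx^3\exp(-x^2 - y^2) = 0. \tag{5.24}
\]
For (5.24) to be satisfied we require $x = 0$ or $y = 0$ and for (5.23) to be satisfied we require $x = 0$ or $x = \pm\sqrt{3/2}$. Hence the stationary points are at $(0, 0)$, $(\sqrt{3/2}, 0)$ and $(-\sqrt{3/2}, 0)$. (Strictly, both equations are satisfied at every point of the line $x = 0$; the origin is the point on that line considered here.)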
We now find the second partial derivatives:
\[
f_{xx} = (4x^5 - 14x^3 + 6x)\exp(-x^2 - y^2),
\]
\[
f_{yy} = x^3(4y^2 - 2)\exp(-x^2 - y^2),
\]
\[
f_{xy} = 2x^2y(2x^2 - 3)\exp(-x^2 - y^2).
\]
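We then substitute the pairs of values of $x$ and $y$ for each stationary point and find that at $(0, 0)$
\[
f_{xx} = 0, \quad f_{yy} = 0, \quad f_{xy} = 0
\]
and at $(\pm\sqrt{3/2}, 0)$
\[
f_{xx} = \mp 6\sqrt{3/2}\,\exp(-3/2), \quad f_{yy} = \mp 3\sqrt{3/2}\,\exp(-3/2), \quad f_{xy} = 0.
\]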
Hence, applying criteria (i)–(iii) above, we find that $(0, 0)$ is an undetermined stationary point, $(\sqrt{3/2}, 0)$ is a maximum and $(-\sqrt{3/2}, 0)$ is a minimum. The function is shown in figure 5.3.
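The arithmetic of this worked example is easily checked by machine. The sketch below evaluates the second partial derivatives found above at the three stationary points and applies criteria (i)–(iii); the function names and the tolerance `tol` are illustrative assumptions.

```python
import math

def second_derivatives(x, y):
    """f_xx, f_yy and f_xy of f(x, y) = x^3 exp(-x^2 - y^2), as derived above."""
    e = math.exp(-x ** 2 - y ** 2)
    fxx = (4 * x ** 5 - 14 * x ** 3 + 6 * x) * e
    fyy = x ** 3 * (4 * y ** 2 - 2) * e
    fxy = 2 * x ** 2 * y * (2 * x ** 2 - 3) * e
    return fxx, fyy, fxy

def classify(fxx, fyy, fxy, tol=1e-12):
    """Apply criteria (i)-(iii); tol decides when a quantity counts as zero."""
    disc = fxx * fyy - fxy ** 2
    if disc > tol:
        return "minimum" if fxx > 0 else "maximum"
    if disc < -tol:
        return "saddle point"
    return "undetermined"

a = math.sqrt(1.5)  # the value sqrt(3/2) appearing in the text
points = {"origin": (0.0, 0.0), "plus": (a, 0.0), "minus": (-a, 0.0)}
results = {name: classify(*second_derivatives(x, y)) for name, (x, y) in points.items()}

# Value of f_xx at (+sqrt(3/2), 0), for comparison with -6 sqrt(3/2) exp(-3/2).
fxx_plus = second_derivatives(a, 0.0)[0]
```

The dictionary `results` should reproduce the conclusions reached above: the origin is undetermined, the point with positive $x$ is a maximum and the point with negative $x$ is a minimum.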
Determining the nature of stationary points for functions of a general number of variables is considerably more difficult and requires a knowledge of the eigenvectors and eigenvalues of matrices. Although these are not discussed until chapter 8, we present the analysis here for completeness. The remainder of this section can therefore be omitted on a first reading.
For a function of $n$ real variables, $f(x_1, x_2, \ldots, x_n)$, we require that, at all stationary points,
\[
\frac{\partial f}{\partial x_i} = 0 \quad \text{for all } x_i.
\]
In order to determine the nature of a stationary point, we must expand the function as a Taylor series about the point. Recalling the Taylor expansion (5.20) for a function of $n$ variables, we see that
\[
\Delta f = f(\mathbf{x}) - f(\mathbf{x}_0) \approx \frac{1}{2}\sum_i \sum_j \frac{\partial^2 f}{\partial x_i\,\partial x_j}\,\Delta x_i\,\Delta x_j. \tag{5.25}
\]
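The double sum in (5.25) can be evaluated directly once the second partial derivatives are known. The following sketch estimates them by finite differences and compares the result with the actual change in the function; the three-variable test function, the step size `h` and the helper names are illustrative assumptions. Because the chosen function is quadratic, (5.25) holds exactly for it, apart from rounding error.

```python
def f(x):
    # Illustrative quadratic in three variables (an assumption), with a
    # stationary point at the origin.
    x1, x2, x3 = x
    return x1 ** 2 + 2.0 * x2 ** 2 + 3.0 * x3 ** 2 + x1 * x2

def second_partials(func, x0, h=1e-4):
    """Central-difference estimates of d2f/dx_i dx_j at the point x0."""
    n = len(x0)
    d2 = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                p = list(x0)
                p[i] += si * h
                p[j] += sj * h  # for i == j the two shifts add, giving a step of 2h
                return func(p)
            d2[i][j] = (shifted(1, 1) - shifted(1, -1)
                        - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)
    return d2

def second_order_change(d2, dx):
    """Right-hand side of (5.25): half the double sum over i and j."""
    n = len(dx)
    return 0.5 * sum(d2[i][j] * dx[i] * dx[j] for i in range(n) for j in range(n))

x0 = [0.0, 0.0, 0.0]
dx = [0.1, -0.2, 0.05]
d2 = second_partials(f, x0)
predicted = second_order_change(d2, dx)
actual = f([a + b for a, b in zip(x0, dx)]) - f(x0)
```

For this choice of `f` the values of `predicted` and `actual` agree to within the accuracy of the finite-difference estimates.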
Figure 5.3 The function $f(x, y) = x^3\exp(-x^2 - y^2)$, with its maximum and minimum marked.
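If we define the matrix $\mathsf{M}$ to have elements given by
\[
M_{ij} = \frac{\partial^2 f}{\partial x_i\,\partial x_j},
\]
then we can rewrite (5.25) as
\[
\Delta f = \tfrac{1}{2}\,\Delta\mathbf{x}^{\mathrm{T}}\mathsf{M}\,\Delta\mathbf{x}, \tag{5.26}
\]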
where $\Delta\mathbf{x}$ is the column vector with the $\Delta x_i$ as its components and $\Delta\mathbf{x}^{\mathrm{T}}$ is its transpose. Since $\mathsf{M}$ is real and symmetric it has $n$ real eigenvalues $\lambda_r$ and $n$ orthogonal eigenvectors $\mathbf{e}_r$, which after suitable normalisation satisfy
\[
\mathsf{M}\mathbf{e}_r = \lambda_r\mathbf{e}_r, \qquad \mathbf{e}_r^{\mathrm{T}}\mathbf{e}_s = \delta_{rs},
\]
where the Kronecker delta, written $\delta_{rs}$, equals unity for $r = s$ and equals zero otherwise. These eigenvectors form a basis set for the $n$-dimensional space and we can therefore expand $\Delta\mathbf{x}$ in terms of them, obtaining
\[
\Delta\mathbf{x} = \sum_r a_r\mathbf{e}_r,
\]