DEFINITION 2.1 (INTUITIVE DEFINITION OF LIMIT) The equation
2.4 Properties; Higher-order Partial Derivatives
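First argue that, for $i = 1, \ldots, m$, we have $\lim_{\mathbf{x}\to\mathbf{a}} F_i(\mathbf{x}) = 0$. Next, argue that $\lim_{\mathbf{x}\to\mathbf{a}} F_i(\mathbf{x}) = 0$ implies $\lim_{h\to 0} F_i(\mathbf{a} + h\mathbf{e}_j) = 0$, where $\mathbf{e}_j$ denotes the standard basis vector $(0, \ldots, 1, \ldots, 0)$ for $\mathbf{R}^n$.
(c) Use parts (a) and (b) to show that $a_{ij} = \dfrac{\partial f_i}{\partial x_j}(\mathbf{a})$, where $a_{ij}$ denotes the $ij$th entry of $A$. (Hint: Break into cases where $h > 0$ and where $h < 0$.)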
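Note also that the function $\mathbf{k} = 3\mathbf{g}$ must be differentiable everywhere by part 2 of Proposition 4.1. We can readily check that $D\mathbf{k}(x,y) = 3D\mathbf{g}(x,y)$: We have
\[
\mathbf{k}(x,y) = (3x^2 + 3y^2,\; 3ye^{xy},\; 6x^3 - 21y^5).
\]
Hence,
\[
D\mathbf{k}(x,y) =
\begin{bmatrix}
6x & 6y \\
3y^2e^{xy} & 3e^{xy} + 3xye^{xy} \\
18x^2 & -105y^4
\end{bmatrix}
= 3
\begin{bmatrix}
2x & 2y \\
y^2e^{xy} & e^{xy} + xye^{xy} \\
6x^2 & -35y^4
\end{bmatrix}
= 3D\mathbf{g}(x,y). \quad ◆
\]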
Due to the nature of matrix multiplication, general versions of the product and quotient rules do not exist in any particularly simple form. However, for scalar-valued functions, it is possible to prove the following:
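PROPOSITION 4.2 Let $f, g: X \subseteq \mathbf{R}^n \to \mathbf{R}$ be differentiable at $\mathbf{a} \in X$. Then
1. The product function $fg$ is also differentiable at $\mathbf{a}$, and
\[
D(fg)(\mathbf{a}) = g(\mathbf{a})\,Df(\mathbf{a}) + f(\mathbf{a})\,Dg(\mathbf{a}).
\]
2. If $g(\mathbf{a}) \neq 0$, then the quotient function $f/g$ is differentiable at $\mathbf{a}$, and
\[
D(f/g)(\mathbf{a}) = \frac{g(\mathbf{a})\,Df(\mathbf{a}) - f(\mathbf{a})\,Dg(\mathbf{a})}{g(\mathbf{a})^2}.
\]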
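As a small illustration of part 2 (the functions here are chosen only for this sketch and do not come from the text), take $f(x,y) = x^2$ and $g(x,y) = y$ at a point with $y \neq 0$. The quotient rule and a direct computation should agree:

```latex
\[
Df(x,y) = \begin{bmatrix} 2x & 0 \end{bmatrix}, \qquad
Dg(x,y) = \begin{bmatrix} 0 & 1 \end{bmatrix},
\]
\[
D(f/g)(x,y)
= \frac{y \begin{bmatrix} 2x & 0 \end{bmatrix} - x^2 \begin{bmatrix} 0 & 1 \end{bmatrix}}{y^2}
= \begin{bmatrix} \dfrac{2x}{y} & -\dfrac{x^2}{y^2} \end{bmatrix}.
\]
% Direct check: (f/g)(x,y) = x^2/y, so
\[
\frac{\partial}{\partial x}\!\left(\frac{x^2}{y}\right) = \frac{2x}{y}, \qquad
\frac{\partial}{\partial y}\!\left(\frac{x^2}{y}\right) = -\frac{x^2}{y^2}.
\]
```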
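EXAMPLE 2 If $f(x,y,z) = ze^{xy}$ and $g(x,y,z) = xy + 2yz - xz$, then
\[
(fg)(x,y,z) = (xyz + 2yz^2 - xz^2)e^{xy},
\]
so that
\[
D(fg)(x,y,z) =
\begin{bmatrix}
(yz - z^2)e^{xy} + (xyz + 2yz^2 - xz^2)ye^{xy} \\
(xz + 2z^2)e^{xy} + (xyz + 2yz^2 - xz^2)xe^{xy} \\
(xy + 4yz - 2xz)e^{xy}
\end{bmatrix}^{T}.
\]
Also, we have
\[
Df(x,y,z) = \begin{bmatrix} yze^{xy} & xze^{xy} & e^{xy} \end{bmatrix}
\quad\text{and}\quad
Dg(x,y,z) = \begin{bmatrix} y - z & x + 2z & 2y - x \end{bmatrix},
\]
so that
\begin{align*}
g(x,y,z)\,Df(x,y,z) + f(x,y,z)\,Dg(x,y,z)
&=
\begin{bmatrix}
(xy^2z + 2y^2z^2 - xyz^2)e^{xy} \\
(x^2yz + 2xyz^2 - x^2z^2)e^{xy} \\
(xy + 2yz - xz)e^{xy}
\end{bmatrix}^{T}
+
\begin{bmatrix}
(yz - z^2)e^{xy} \\
(xz + 2z^2)e^{xy} \\
(2yz - xz)e^{xy}
\end{bmatrix}^{T} \\
&= e^{xy}
\begin{bmatrix}
xy^2z + 2y^2z^2 - xyz^2 + yz - z^2 \\
x^2yz + 2xyz^2 - x^2z^2 + xz + 2z^2 \\
xy + 4yz - 2xz
\end{bmatrix}^{T},
\end{align*}
which checks with part 1 of Proposition 4.2. (Note: The matrix transpose is used simply to conserve space on the page.) ◆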
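The agreement found in Example 2 can also be spot-checked numerically. The following is a minimal sketch, not part of the original text: the helper names `grad` and `product_rule_gap`, the step size, and the test point are all assumptions made for illustration, and the derivative matrices are only approximated by central differences.

```python
import math


def f(x, y, z):
    return z * math.exp(x * y)


def g(x, y, z):
    return x * y + 2 * y * z - x * z


def grad(func, p, h=1e-6):
    """Central-difference approximation to the 1 x 3 derivative matrix of func at p."""
    out = []
    for i in range(len(p)):
        fwd, bwd = list(p), list(p)
        fwd[i] += h
        bwd[i] -= h
        out.append((func(*fwd) - func(*bwd)) / (2 * h))
    return out


def product_rule_gap(p):
    """Largest entrywise gap between D(fg)(p) and g(p)Df(p) + f(p)Dg(p)."""
    def fg(x, y, z):
        return f(x, y, z) * g(x, y, z)

    lhs = grad(fg, p)
    df, dg = grad(f, p), grad(g, p)
    rhs = [g(*p) * a + f(*p) * b for a, b in zip(df, dg)]
    return max(abs(l - r) for l, r in zip(lhs, rhs))


point = (0.5, -1.0, 2.0)  # arbitrary test point, chosen only for illustration
gap = product_rule_gap(point)
print(gap)
```

A gap near zero at one point is only evidence of agreement there, of course; the algebra in the example is what establishes the identity everywhere.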
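The product rule in part 1 of Proposition 4.2 is not the most general result possible. Indeed, if $f: X \subseteq \mathbf{R}^n \to \mathbf{R}$ is a scalar-valued function and $\mathbf{g}: X \subseteq \mathbf{R}^n \to \mathbf{R}^m$ is a vector-valued function, then if $f$ and $\mathbf{g}$ are both differentiable at $\mathbf{a} \in X$, so is $f\mathbf{g}$, and the following formula holds (where we view $\mathbf{g}(\mathbf{a})$ as an $m \times 1$ matrix):
\[
D(f\mathbf{g})(\mathbf{a}) = \mathbf{g}(\mathbf{a})\,Df(\mathbf{a}) + f(\mathbf{a})\,D\mathbf{g}(\mathbf{a}).
\]
Partial Derivatives of Higher Order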
Thus far in our study of differentiation, we have been concerned only with partial derivatives of first order. Nonetheless, it is easy to imagine computing second- and third-order partials by iterating the process of differentiating with respect to one variable, while all others are held constant.
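EXAMPLE 3 Let $f(x,y,z) = x^2y + y^2z$. Then the first-order partial derivatives are
\[
\frac{\partial f}{\partial x} = 2xy, \qquad
\frac{\partial f}{\partial y} = x^2 + 2yz, \qquad\text{and}\qquad
\frac{\partial f}{\partial z} = y^2.
\]
The second-order partial derivative with respect to $x$, denoted by $\partial^2 f/\partial x^2$ or $f_{xx}(x,y,z)$, is
\[
\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial}{\partial x}(2xy) = 2y.
\]
Similarly, the second-order partials with respect to $y$ and $z$ are, respectively,
\[
\frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial}{\partial y}(x^2 + 2yz) = 2z
\]
and
\[
\frac{\partial^2 f}{\partial z^2} = \frac{\partial}{\partial z}\left(\frac{\partial f}{\partial z}\right) = \frac{\partial}{\partial z}(y^2) \equiv 0.
\]
There are more second-order partials, however. The mixed partial derivative with respect to first $x$ and then $y$, denoted $\partial^2 f/\partial y\,\partial x$ or $f_{xy}(x,y,z)$, is
\[
\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial}{\partial y}(2xy) = 2x.
\]
There are five more mixed partials for this particular function: $\partial^2 f/\partial x\,\partial y$, $\partial^2 f/\partial z\,\partial x$, $\partial^2 f/\partial x\,\partial z$, $\partial^2 f/\partial z\,\partial y$, and $\partial^2 f/\partial y\,\partial z$. Compute each of them to get a feeling for the process. ◆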
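One way to check hand computations like those in Example 3 is to iterate a numerical partial derivative. The sketch below is an illustration under stated assumptions: the helper `partial`, the step size `h`, and the evaluation point are hypothetical choices, and central differences only approximate the true partials.

```python
def f(x, y, z):
    return x**2 * y + y**2 * z


def partial(func, i, h=1e-3):
    """Return a function approximating the partial derivative of func
    with respect to its i-th argument (central difference)."""
    def d(*p):
        fwd, bwd = list(p), list(p)
        fwd[i] += h
        bwd[i] -= h
        return (func(*fwd) - func(*bwd)) / (2 * h)
    return d


X, Y, Z = 0, 1, 2               # argument positions of x, y, z
point = (1.5, -2.0, 0.5)        # arbitrary evaluation point

f_xx = partial(partial(f, X), X)(*point)  # compare with 2y
f_yy = partial(partial(f, Y), Y)(*point)  # compare with 2z
f_zz = partial(partial(f, Z), Z)(*point)  # compare with 0
f_xy = partial(partial(f, X), Y)(*point)  # x first, then y: compare with 2x
```

The same pattern, with the two indices changed, can be used to check the five remaining mixed partials after computing them by hand.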
In general, if $f: X \subseteq \mathbf{R}^n \to \mathbf{R}$ is a (scalar-valued) function of $n$ variables, the $k$th-order partial derivative with respect to the variables $x_{i_1}, x_{i_2}, \ldots, x_{i_k}$ (in that order), where $i_1, i_2, \ldots, i_k$ are integers in the set $\{1, 2, \ldots, n\}$ (possibly repeated), is the iterated derivative
\[
\frac{\partial^k f}{\partial x_{i_k} \cdots \partial x_{i_2}\,\partial x_{i_1}}
= \frac{\partial}{\partial x_{i_k}} \cdots \frac{\partial}{\partial x_{i_2}} \frac{\partial}{\partial x_{i_1}} \bigl(f(x_1, x_2, \ldots, x_n)\bigr).
\]
Equivalent (and frequently more manageable) notation for this $k$th-order partial is
\[
f_{x_{i_1} x_{i_2} \cdots x_{i_k}}(x_1, x_2, \ldots, x_n).
\]
Note that the order in which we write the variables with respect to which we differentiate is different in the two notations: In the subscript notation, we write the differentiation variables from left to right in the order we differentiate, while in the $\partial$-notation, we write those variables in the opposite order (i.e., from right to left).
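The ordering convention just described can be made concrete with a tiny formatting helper. This is only an illustrative sketch: the function names are hypothetical, the letter `d` stands in for the partial-derivative symbol, and repeated variables are not collapsed into powers.

```python
def subscript_notation(order):
    """Subscript form: variables listed left to right in differentiation order."""
    return "f_" + "".join(order)


def leibniz_notation(order):
    """Fraction form (with 'd' standing in for the partial symbol): the same
    variables appear in the denominator in the opposite order."""
    k = len(order)
    numerator = "d^%d f" % k if k > 1 else "df"
    denominator = " ".join("d" + v for v in reversed(order))
    return numerator + " / " + denominator


# Differentiate first with respect to y, then with respect to w.
order = ["y", "w"]
print(subscript_notation(order))  # f_yw
print(leibniz_notation(order))    # d^2 f / dw dy
```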
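EXAMPLE 4 Let $f(x,y,z,w) = xyz + xy^2w - \cos(x + zw)$. We then have
\[
f_{yw}(x,y,z,w) = \frac{\partial^2 f}{\partial w\,\partial y}
= \frac{\partial}{\partial w}\frac{\partial}{\partial y}\bigl(xyz + xy^2w - \cos(x + zw)\bigr)
= \frac{\partial}{\partial w}(xz + 2xyw) = 2xy,
\]
and
\[
f_{wy}(x,y,z,w) = \frac{\partial^2 f}{\partial y\,\partial w}
= \frac{\partial}{\partial y}\frac{\partial}{\partial w}\bigl(xyz + xy^2w - \cos(x + zw)\bigr)
= \frac{\partial}{\partial y}\bigl(xy^2 + z\sin(x + zw)\bigr) = 2xy. \quad ◆
\]
Although it is generally ill-advised to formulate conjectures based on a single piece of evidence, Example 4 suggests that there might be an outrageously simple relationship among the mixed second partials. Indeed, such is the case, as the next result, due to the 18th-century French mathematician Alexis Clairaut, indicates.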
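THEOREM 4.3 Suppose that $X$ is open in $\mathbf{R}^n$ and $f: X \subseteq \mathbf{R}^n \to \mathbf{R}$ has continuous first- and second-order partial derivatives. Then the order in which we evaluate the mixed second-order partials is immaterial; that is, if $i_1$ and $i_2$ are any two integers between 1 and $n$, then
\[
\frac{\partial^2 f}{\partial x_{i_1}\,\partial x_{i_2}} = \frac{\partial^2 f}{\partial x_{i_2}\,\partial x_{i_1}}.
\]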
A proof of Theorem 4.3 is provided in the addendum to this section. We also suggest a second proof (using integrals!) in Exercise 4 of the Miscellaneous Exercises for Chapter 5.
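Before turning to the proof, the equality of mixed partials can be observed numerically for the function of Example 4. This is a rough sketch only: the helper `partial`, the step size, and the sample point are assumptions introduced here, and nested central differences merely approximate the second-order partials.

```python
import math


def f(x, y, z, w):
    return x * y * z + x * y**2 * w - math.cos(x + z * w)


def partial(func, i, h=1e-3):
    """Central-difference approximation to the partial derivative of func
    with respect to its i-th argument, returned as a new function."""
    def d(*p):
        fwd, bwd = list(p), list(p)
        fwd[i] += h
        bwd[i] -= h
        return (func(*fwd) - func(*bwd)) / (2 * h)
    return d


Y, W = 1, 3                      # argument positions of y and w
p = (0.7, 1.2, -0.4, 2.0)        # arbitrary sample point (x, y, z, w)

f_yw = partial(partial(f, Y), W)(*p)  # y first, then w
f_wy = partial(partial(f, W), Y)(*p)  # w first, then y
expected = 2 * p[0] * p[1]            # the value 2xy found in Example 4
print(f_yw, f_wy, expected)
```

Both approximations should land close to $2xy$ at the sample point, in line with Theorem 4.3; a numerical check at one point illustrates the theorem but does not replace its proof.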
It is natural to speculate about the possibility of an analogue to Theorem 4.3 for $k$th-order mixed partials. Before we state what should be an easily anticipated result, we need some terminology.
DEFINITION 4.4 Assume $X$ is open in $\mathbf{R}^n$. A scalar-valued function $f: X \subseteq \mathbf{R}^n \to \mathbf{R}$ whose partial derivatives up to (and including) order at least $k$ exist and are continuous on $X$ is said to be of class $C^k$. If $f$ has continuous partial derivatives of all orders on $X$, then $f$ is said to be of class $C^\infty$, or smooth. A vector-valued function $\mathbf{f}: X \subseteq \mathbf{R}^n \to \mathbf{R}^m$ is of class $C^k$ (respectively, of class $C^\infty$) if and only if each of its component functions is of class $C^k$ (respectively, $C^\infty$).
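To see that the classes in Definition 4.4 really are different, a one-variable illustration may help. The function below is not taken from the text; it is a standard example, offered here only as a sketch with $n = 1$ and $X = \mathbf{R}$.

```latex
\[
f(x) = x\,|x| =
\begin{cases}
x^2, & x \ge 0, \\
-x^2, & x < 0,
\end{cases}
\qquad
f'(x) = 2|x| .
\]
% f' exists and is continuous on all of R, so f is of class C^1.
% However, f' = 2|x| is not differentiable at x = 0, so f''(0) does not
% exist and f is not of class C^2.
```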
THEOREM 4.5 Let $f: X \subseteq \mathbf{R}^n \to \mathbf{R}$ be a scalar-valued function of class $C^k$. Then the order in which we calculate any $k$th-order partial derivative does not matter: If $(i_1, \ldots, i_k)$ are any $k$ integers (not necessarily distinct) between 1 and $n$, and if $(j_1, \ldots, j_k)$ is any permutation (rearrangement) of these integers, then
\[
\frac{\partial^k f}{\partial x_{i_1} \cdots \partial x_{i_k}} = \frac{\partial^k f}{\partial x_{j_1} \cdots \partial x_{j_k}}.
\]
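EXAMPLE 5 If $f(x,y,z,w) = x^2we^{yz} - ze^{xw} + xyzw$, then you can check that
\[
\frac{\partial^5 f}{\partial x\,\partial w\,\partial z\,\partial y\,\partial x} = 2e^{yz}(yz + 1) = \frac{\partial^5 f}{\partial z\,\partial y\,\partial w\,\partial x^2},
\]
verifying Theorem 4.5 in this case. ◆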
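For readers who want to see the check in Example 5 carried out, the two iterated derivatives can be written step by step. The intermediate expressions below were computed for this illustration and are not in the original text.

```latex
% Left side: differentiate in the order x, y, z, w, x.
\begin{align*}
f_x &= 2xwe^{yz} - zwe^{xw} + yzw, \\
f_{xy} &= 2xwze^{yz} + zw, \\
f_{xyz} &= 2xwe^{yz}(yz + 1) + w, \\
f_{xyzw} &= 2xe^{yz}(yz + 1) + 1, \\
f_{xyzwx} &= 2e^{yz}(yz + 1).
\end{align*}
% Right side: differentiate in the order x, x, w, y, z.
\begin{align*}
f_{xx} &= 2we^{yz} - zw^2e^{xw}, \\
f_{xxw} &= 2e^{yz} - 2zwe^{xw} - xzw^2e^{xw}, \\
f_{xxwy} &= 2ze^{yz}, \\
f_{xxwyz} &= 2e^{yz} + 2yze^{yz} = 2e^{yz}(yz + 1).
\end{align*}
```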
Addendum: Two Technical Proofs
Proof of Part 1 of Proposition 4.1
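Step 1. We show that the matrix of partial derivatives of $\mathbf{h}$ is the sum of those of $\mathbf{f}$ and $\mathbf{g}$. If we write $\mathbf{h}(\mathbf{x})$ as $(h_1(\mathbf{x}), h_2(\mathbf{x}), \ldots, h_m(\mathbf{x}))$ (i.e., in terms of its component functions), then the $ij$th entry of $D\mathbf{h}(\mathbf{a})$ is $\partial h_i/\partial x_j$ evaluated at $\mathbf{a}$. But $h_i(\mathbf{x}) = f_i(\mathbf{x}) + g_i(\mathbf{x})$ by definition of $\mathbf{h}$. Hence,
\[
\frac{\partial h_i}{\partial x_j} = \frac{\partial}{\partial x_j}\bigl(f_i(\mathbf{x}) + g_i(\mathbf{x})\bigr) = \frac{\partial f_i}{\partial x_j} + \frac{\partial g_i}{\partial x_j},
\]
by properties of ordinary differentiation (since all variables except $x_j$ are held constant). Thus,
\[
\frac{\partial h_i}{\partial x_j}(\mathbf{a}) = \frac{\partial f_i}{\partial x_j}(\mathbf{a}) + \frac{\partial g_i}{\partial x_j}(\mathbf{a}),
\]
and, therefore,
\[
D\mathbf{h}(\mathbf{a}) = D\mathbf{f}(\mathbf{a}) + D\mathbf{g}(\mathbf{a}).
\]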
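Step 2. Now that we know the desired matrix of partials exists, we must show that $\mathbf{h}$ really is differentiable; that is, we must establish that
\[
\lim_{\mathbf{x}\to\mathbf{a}} \frac{\|\mathbf{h}(\mathbf{x}) - [\mathbf{h}(\mathbf{a}) + D\mathbf{h}(\mathbf{a})(\mathbf{x}-\mathbf{a})]\|}{\|\mathbf{x}-\mathbf{a}\|} = 0.
\]
As preliminary background, we note that
\begin{align*}
\frac{\|\mathbf{h}(\mathbf{x}) - [\mathbf{h}(\mathbf{a}) + D\mathbf{h}(\mathbf{a})(\mathbf{x}-\mathbf{a})]\|}{\|\mathbf{x}-\mathbf{a}\|}
&= \frac{\|\mathbf{f}(\mathbf{x}) + \mathbf{g}(\mathbf{x}) - [\mathbf{f}(\mathbf{a}) + \mathbf{g}(\mathbf{a}) + D\mathbf{f}(\mathbf{a})(\mathbf{x}-\mathbf{a}) + D\mathbf{g}(\mathbf{a})(\mathbf{x}-\mathbf{a})]\|}{\|\mathbf{x}-\mathbf{a}\|} \\
&= \frac{\|(\mathbf{f}(\mathbf{x}) - [\mathbf{f}(\mathbf{a}) + D\mathbf{f}(\mathbf{a})(\mathbf{x}-\mathbf{a})]) + (\mathbf{g}(\mathbf{x}) - [\mathbf{g}(\mathbf{a}) + D\mathbf{g}(\mathbf{a})(\mathbf{x}-\mathbf{a})])\|}{\|\mathbf{x}-\mathbf{a}\|} \\
&\le \frac{\|\mathbf{f}(\mathbf{x}) - [\mathbf{f}(\mathbf{a}) + D\mathbf{f}(\mathbf{a})(\mathbf{x}-\mathbf{a})]\|}{\|\mathbf{x}-\mathbf{a}\|}
+ \frac{\|\mathbf{g}(\mathbf{x}) - [\mathbf{g}(\mathbf{a}) + D\mathbf{g}(\mathbf{a})(\mathbf{x}-\mathbf{a})]\|}{\|\mathbf{x}-\mathbf{a}\|},
\end{align*}
by the triangle inequality, formula (2) of §1.6. To show that the desired limit equation for $\mathbf{h}$ follows from the definition of the limit, we must show that given any $\epsilon > 0$, we can find a number $\delta > 0$ such that
\[
\text{if } 0 < \|\mathbf{x}-\mathbf{a}\| < \delta, \text{ then }
\frac{\|\mathbf{h}(\mathbf{x}) - [\mathbf{h}(\mathbf{a}) + D\mathbf{h}(\mathbf{a})(\mathbf{x}-\mathbf{a})]\|}{\|\mathbf{x}-\mathbf{a}\|} < \epsilon. \tag{1}
\]
Since $\mathbf{f}$ is given to be differentiable at $\mathbf{a}$, this means that given any $\epsilon_1 > 0$, we can find $\delta_1 > 0$ such that
\[
\text{if } 0 < \|\mathbf{x}-\mathbf{a}\| < \delta_1, \text{ then }
\frac{\|\mathbf{f}(\mathbf{x}) - [\mathbf{f}(\mathbf{a}) + D\mathbf{f}(\mathbf{a})(\mathbf{x}-\mathbf{a})]\|}{\|\mathbf{x}-\mathbf{a}\|} < \epsilon_1. \tag{2}
\]
Similarly, differentiability of $\mathbf{g}$ means that given any $\epsilon_2 > 0$, we can find a $\delta_2 > 0$ such that
\[
\text{if } 0 < \|\mathbf{x}-\mathbf{a}\| < \delta_2, \text{ then }
\frac{\|\mathbf{g}(\mathbf{x}) - [\mathbf{g}(\mathbf{a}) + D\mathbf{g}(\mathbf{a})(\mathbf{x}-\mathbf{a})]\|}{\|\mathbf{x}-\mathbf{a}\|} < \epsilon_2. \tag{3}
\]
Now we're ready to establish statement (1). Suppose $\epsilon > 0$ is given. Let $\delta_1$ and $\delta_2$ be such that (2) and (3) hold with $\epsilon_1 = \epsilon_2 = \epsilon/2$. Take $\delta$ to be the smaller of $\delta_1$ and $\delta_2$. Hence, if $0 < \|\mathbf{x}-\mathbf{a}\| < \delta$, then both statements (2) and (3) hold (with $\epsilon_1 = \epsilon_2 = \epsilon/2$) and, moreover,
\begin{align*}
\frac{\|\mathbf{h}(\mathbf{x}) - [\mathbf{h}(\mathbf{a}) + D\mathbf{h}(\mathbf{a})(\mathbf{x}-\mathbf{a})]\|}{\|\mathbf{x}-\mathbf{a}\|}
&\le \frac{\|\mathbf{f}(\mathbf{x}) - [\mathbf{f}(\mathbf{a}) + D\mathbf{f}(\mathbf{a})(\mathbf{x}-\mathbf{a})]\|}{\|\mathbf{x}-\mathbf{a}\|}
+ \frac{\|\mathbf{g}(\mathbf{x}) - [\mathbf{g}(\mathbf{a}) + D\mathbf{g}(\mathbf{a})(\mathbf{x}-\mathbf{a})]\|}{\|\mathbf{x}-\mathbf{a}\|} \\
&< \epsilon_1 + \epsilon_2 \\
&= \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.
\end{align*}
That is, statement (1) holds, as desired. ■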
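The key inequality in Step 2 can be watched in action on a concrete pair of maps. The sketch below is purely illustrative: the functions `f` and `g`, their hand-coded derivative matrices, the base point, and the direction of approach are all assumptions chosen for this example, not taken from the text.

```python
import math


def f(x, y):
    return (x * y, math.sin(x))


def g(x, y):
    return (x + y**2, math.exp(y))


def Df(x, y):
    return ((y, x), (math.cos(x), 0.0))


def Dg(x, y):
    return ((1.0, 2 * y), (0.0, math.exp(y)))


def h(x, y):
    return tuple(u + v for u, v in zip(f(x, y), g(x, y)))


def Dh(x, y):
    # Step 1 of the proof: the matrix of partials of h is Df + Dg.
    return tuple(tuple(u + v for u, v in zip(r, s))
                 for r, s in zip(Df(x, y), Dg(x, y)))


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def quotient(F, DF, a, x):
    """||F(x) - [F(a) + DF(a)(x - a)]|| / ||x - a||."""
    d = [xi - ai for xi, ai in zip(x, a)]
    Fa, Fx, J = F(*a), F(*x), DF(*a)
    resid = [Fx[i] - Fa[i] - sum(J[i][j] * d[j] for j in range(len(d)))
             for i in range(len(Fx))]
    return norm(resid) / norm(d)


a = (0.3, -0.5)
rows = []
for k in range(1, 6):
    t = 10.0 ** (-k)
    x = (a[0] + t, a[1] - t)     # approach a along one fixed direction
    rows.append((quotient(h, Dh, a, x),
                 quotient(f, Df, a, x),
                 quotient(g, Dg, a, x)))

for qh, qf, qg in rows:
    print(qh, qf + qg)
```

In each printed row the quotient for $\mathbf{h}$ should not exceed the sum of the quotients for $\mathbf{f}$ and $\mathbf{g}$, and all three should shrink as $\mathbf{x}$ approaches $\mathbf{a}$, which is exactly what the $\epsilon/2$ argument exploits.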
Proof of Theorem 4.3 For simplicity of notation only, we'll assume that $f$ is a function of just two variables ($x$ and $y$). Let the point $(a,b) \in \mathbf{R}^2$ be in the interior of some rectangle on which $f_x$, $f_y$, $f_{xx}$, $f_{yy}$, $f_{xy}$, and $f_{yx}$ are all continuous.
Consider the following "difference function" (see Figure 2.55):
\[
D(\Delta x, \Delta y) = f(a + \Delta x, b + \Delta y) - f(a + \Delta x, b) - f(a, b + \Delta y) + f(a, b).
\]
Figure 2.55 To construct the difference function $D$ used in the proof of Theorem 4.3, evaluate $f$ at the four points shown with the signs as indicated: $+$ at $(a, b)$ and $(a + \Delta x, b + \Delta y)$, and $-$ at $(a + \Delta x, b)$ and $(a, b + \Delta y)$.
Our proof depends upon viewing this function in two ways. We first regard $D$ as a difference of vertical differences in $f$:
\begin{align*}
D(\Delta x, \Delta y) &= [f(a + \Delta x, b + \Delta y) - f(a + \Delta x, b)] - [f(a, b + \Delta y) - f(a, b)] \\
&= F(a + \Delta x) - F(a).
\end{align*}
Here we define the one-variable function $F(x)$ to be $f(x, b + \Delta y) - f(x, b)$. As we will see, the mixed second partial of $f$ can be found from two applications of the mean value theorem of one-variable calculus. Since $f$ has continuous partials, it is differentiable. (See Theorem 3.10.) Hence, $F$ is continuous and differentiable, and, thus, the mean value theorem implies that there is some number $c$ between $a$ and $a + \Delta x$ such that
\[
D(\Delta x, \Delta y) = F(a + \Delta x) - F(a) = F'(c)\,\Delta x. \tag{4}
\]
Now $F'(c) = f_x(c, b + \Delta y) - f_x(c, b)$. We again apply the mean value theorem, this time to the function $f_x(c, y)$. (Here, we think of $c$ as constant and $y$ as the variable.) By hypothesis $f_x$ is differentiable since its partial derivatives, $f_{xx}$ and $f_{xy}$, are assumed to be continuous. Consequently, the mean value theorem applies to give us a number $d$ between $b$ and $b + \Delta y$ such that
\[
F'(c) = f_x(c, b + \Delta y) - f_x(c, b) = f_{xy}(c, d)\,\Delta y. \tag{5}
\]
Using equation (5) in equation (4), we have
\[
D(\Delta x, \Delta y) = F'(c)\,\Delta x = f_{xy}(c, d)\,\Delta y\,\Delta x.
\]
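Figure 2.56 Applying the mean value theorem twice. (The figure shows the rectangle $R$ inside $X$, with the point $(c, d)$ in its interior.)
The point $(c, d)$ lies somewhere in the interior of the rectangle $R$ with vertices $(a, b)$, $(a + \Delta x, b)$, $(a, b + \Delta y)$, $(a + \Delta x, b + \Delta y)$, as shown in Figure 2.56. Thus, as $(\Delta x, \Delta y) \to (0, 0)$, we have $(c, d) \to (a, b)$. Hence, it follows that
\[
f_{xy}(c, d) \to f_{xy}(a, b) \quad\text{as}\quad (\Delta x, \Delta y) \to (0, 0),
\]
since $f_{xy}$ is assumed to be continuous. Therefore,
\[
f_{xy}(a, b) = \lim_{(\Delta x, \Delta y)\to(0,0)} f_{xy}(c, d) = \lim_{(\Delta x, \Delta y)\to(0,0)} \frac{D(\Delta x, \Delta y)}{\Delta y\,\Delta x}.
\]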
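On the other hand, we could just as well have written $D$ as a difference of horizontal differences in $f$:
\begin{align*}
D(\Delta x, \Delta y) &= [f(a + \Delta x, b + \Delta y) - f(a, b + \Delta y)] - [f(a + \Delta x, b) - f(a, b)] \\
&= G(b + \Delta y) - G(b).
\end{align*}
Here $G(y) = f(a + \Delta x, y) - f(a, y)$. As before, we can apply the mean value theorem twice to find that there must be another point $(\bar{c}, \bar{d})$ in $R$ such that
\[
D(\Delta x, \Delta y) = G'(\bar{d})\,\Delta y = f_{yx}(\bar{c}, \bar{d})\,\Delta x\,\Delta y.
\]
Therefore,
\[
f_{yx}(a, b) = \lim_{(\Delta x, \Delta y)\to(0,0)} f_{yx}(\bar{c}, \bar{d}) = \lim_{(\Delta x, \Delta y)\to(0,0)} \frac{D(\Delta x, \Delta y)}{\Delta x\,\Delta y}.
\]
Because this is the same limit as that for $f_{xy}(a, b)$ just given, we have established the desired result. ■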
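The central object of this proof, the quotient $D(\Delta x, \Delta y)/(\Delta x\,\Delta y)$, is easy to compute directly. The following sketch uses a sample function and base point that are assumptions made for illustration (they do not appear in the text), with the mixed partial `f_xy` worked out by hand for comparison.

```python
import math


def f(x, y):
    # Hypothetical sample function with continuous second-order partials.
    return x**2 * y**3 + math.sin(x * y)


def f_xy(x, y):
    # Mixed partial of the sample function, computed by hand.
    return 6 * x * y**2 + math.cos(x * y) - x * y * math.sin(x * y)


def D(a, b, dx, dy):
    """The four-point difference function from the proof."""
    return f(a + dx, b + dy) - f(a + dx, b) - f(a, b + dy) + f(a, b)


a, b = 0.8, -0.6
exact = f_xy(a, b)
errors = []
for k in range(1, 5):
    d = 10.0 ** (-k)             # take dx = dy = d and let d shrink
    errors.append(abs(D(a, b, d, d) / (d * d) - exact))

print(errors)
```

The printed errors should decrease as the rectangle shrinks, consistent with the limit established in the proof. Because $D$ is symmetric in the roles of the two differences, the same quotient is also the one that converges to $f_{yx}(a, b)$.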