2.3 The Derivative

Our goal for this section is to define the derivative of a function $\mathbf{f}\colon X \subseteq \mathbf{R}^n \to \mathbf{R}^m$, where $n$ and $m$ are arbitrary positive integers. Predictably, the derivative of a vector-valued function of several variables is a more complicated object than the derivative of a scalar-valued function of a single variable. In addition, the notion of differentiability is quite subtle in the case of a function of more than one variable.

We first define the basic computational tool of partial derivatives. After doing so, we can begin to understand differentiability via the geometry of tangent planes to surfaces. Finally, we generalize these relatively concrete ideas to higher dimensions.

Partial Derivatives

Recall that if $F\colon X \subseteq \mathbf{R} \to \mathbf{R}$ is a scalar-valued function of one variable, then the derivative of $F$ at a number $a \in X$ is

$$F'(a) = \lim_{h \to 0} \frac{F(a+h) - F(a)}{h}. \tag{1}$$

Moreover, $F$ is said to be differentiable at $a$ precisely when the limit in equation (1) exists.

DEFINITION 3.1 Suppose $f\colon X \subseteq \mathbf{R}^n \to \mathbf{R}$ is a scalar-valued function of $n$ variables. Let $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ denote a point of $\mathbf{R}^n$. A partial function $F$ with respect to the variable $x_i$ is a one-variable function obtained from $f$ by holding all variables constant except $x_i$. That is, we set $x_j$ equal to a constant $a_j$ for $j \neq i$. Then the partial function in $x_i$ is defined by

$$F(x_i) = f(a_1, a_2, \ldots, x_i, \ldots, a_n).$$

EXAMPLE 1 If $f(x,y) = (x^2 - y^2)/(x^2 + y^2)$, then the partial functions with respect to $x$ are given by

$$F(x) = f(x, a_2) = \frac{x^2 - a_2^2}{x^2 + a_2^2},$$

where $a_2$ may be any constant. If, for example, $a_2 = 0$, then the partial function is

$$F(x) = f(x, 0) = \frac{x^2}{x^2} \equiv 1.$$

Geometrically, this partial function is nothing more than the restriction of $f$ to the horizontal line $y = 0$. Note that since the origin is not in the domain of $f$, 0 should not be taken to be in the domain of $F$. (See Figure 2.46.)

Figure 2.46 The function $f$ of Example 1 is defined on $\mathbf{R}^2 - \{(0,0)\}$, while its partial function $F$ along $y = 0$ is defined on the $x$-axis minus the origin.

REMARK In practice, we usually do not go to the notational trouble of explicitly replacing the $x_j$'s ($j \neq i$) by constants when working with partial functions.

Instead, we make a mental note that the partial function is obtained by allowing only one variable to vary, while all the other variables are held fixed.

DEFINITION 3.2 The partial derivative of $f$ with respect to $x_i$ is the (ordinary) derivative of the partial function with respect to $x_i$. That is, the partial derivative with respect to $x_i$ is $F'(x_i)$, in the notation of Definition 3.1. Standard notations for the partial derivative of $f$ with respect to $x_i$ are

$$\frac{\partial f}{\partial x_i}, \quad D_{x_i} f(x_1, \ldots, x_n), \quad \text{and} \quad f_{x_i}(x_1, \ldots, x_n).$$

Symbolically, we have

$$\frac{\partial f}{\partial x_i} = \lim_{h \to 0} \frac{f(x_1, \ldots, x_i + h, \ldots, x_n) - f(x_1, \ldots, x_n)}{h}. \tag{2}$$

Figure 2.47 Visualizing the partial derivative $\dfrac{\partial f}{\partial x}(a,b)$.

By definition, the partial derivative is the (instantaneous) rate of change of $f$ when all variables, except the specified one, are held fixed. In the case where $f$ is a (scalar-valued) function of two variables, we can understand

$$\frac{\partial f}{\partial x}(a,b)$$

geometrically as the slope at the point $(a, b, f(a,b))$ of the curve obtained by intersecting the surface $z = f(x,y)$ with the plane $y = b$, as shown in Figure 2.47.

Similarly,

$$\frac{\partial f}{\partial y}(a,b)$$

is the slope at $(a, b, f(a,b))$ of the curve formed by the intersection of $z = f(x,y)$ and $x = a$, shown in Figure 2.48.

Figure 2.48 Visualizing the partial derivative $\dfrac{\partial f}{\partial y}(a,b)$.

EXAMPLE 2 For the most part, partial derivatives are quite easy to compute, once you become adept at treating variables like constants. If

$$f(x,y) = x^2 y + \cos(x+y),$$

then we have

$$\frac{\partial f}{\partial x} = 2xy - \sin(x+y).$$

(Imagine $y$ to be a constant throughout the differentiation process.) Also

$$\frac{\partial f}{\partial y} = x^2 - \sin(x+y).$$

(Imagine $x$ to be a constant.) Similarly, if $g(x,y) = xy/(x^2+y^2)$, then, from the quotient rule of ordinary calculus, we have

$$g_x(x,y) = \frac{(x^2+y^2)\,y - xy(2x)}{(x^2+y^2)^2} = \frac{y(y^2 - x^2)}{(x^2+y^2)^2},$$

and

$$g_y(x,y) = \frac{(x^2+y^2)\,x - xy(2y)}{(x^2+y^2)^2} = \frac{x(x^2 - y^2)}{(x^2+y^2)^2}.$$
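The formulas of Example 2 can be sanity-checked numerically by approximating the limit in equation (2) with a small step. The following is only a sketch: the helper name `partial`, the central-difference scheme, the step size, and the sample point are all illustrative assumptions and do not come from the text.

```python
import math

def partial(func, point, i, h=1e-6):
    """Central-difference estimate of the partial derivative of func
    with respect to variable number i (0-based) at the given point."""
    up = list(point)
    down = list(point)
    up[i] += h      # vary only the i-th variable ...
    down[i] -= h    # ... while all the others are held fixed
    return (func(*up) - func(*down)) / (2 * h)

def f(x, y):
    return x**2 * y + math.cos(x + y)

def g(x, y):
    return x * y / (x**2 + y**2)

# Closed-form partial derivatives as computed in Example 2.
def f_x(x, y): return 2 * x * y - math.sin(x + y)
def f_y(x, y): return x**2 - math.sin(x + y)
def g_x(x, y): return y * (y**2 - x**2) / (x**2 + y**2) ** 2
def g_y(x, y): return x * (x**2 - y**2) / (x**2 + y**2) ** 2

point = (1.3, -0.7)  # arbitrary sample point away from the origin

errors = [
    abs(partial(f, point, 0) - f_x(*point)),
    abs(partial(f, point, 1) - f_y(*point)),
    abs(partial(g, point, 0) - g_x(*point)),
    abs(partial(g, point, 1) - g_y(*point)),
]
max_error = max(errors)
print(max_error)
```

The printed discrepancy should be tiny, reflecting only the truncation and rounding error of the difference quotient. Such a check supports the hand computation but does not replace it.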

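Note that, of course, neither $g$ nor its partial derivatives are defined at $(0,0)$.

EXAMPLE 3 Occasionally, it is necessary to appeal explicitly to limits to evaluate partial derivatives. Suppose $f\colon \mathbf{R}^2 \to \mathbf{R}$ is defined by

$$f(x,y) = \begin{cases} \dfrac{3x^2 y - y^3}{x^2 + y^2} & \text{if } (x,y) \neq (0,0), \\[1ex] 0 & \text{if } (x,y) = (0,0). \end{cases}$$

Then, for $(x,y) \neq (0,0)$, we have

$$\frac{\partial f}{\partial x} = \frac{8xy^3}{(x^2+y^2)^2} \quad \text{and} \quad \frac{\partial f}{\partial y} = \frac{3x^4 - 6x^2y^2 - y^4}{(x^2+y^2)^2}.$$

But what should $\dfrac{\partial f}{\partial x}(0,0)$ and $\dfrac{\partial f}{\partial y}(0,0)$ be? To find out, we return to Definition 3.2 of the partial derivatives:

$$\frac{\partial f}{\partial x}(0,0) = \lim_{h \to 0} \frac{f(0+h, 0) - f(0,0)}{h} = \lim_{h \to 0} \frac{0 - 0}{h} = 0,$$

and

$$\frac{\partial f}{\partial y}(0,0) = \lim_{h \to 0} \frac{f(0, 0+h) - f(0,0)}{h} = \lim_{h \to 0} \frac{-h - 0}{h} = \lim_{h \to 0} (-1) = -1.$$

Tangency and Differentiability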

If $F\colon X \subseteq \mathbf{R} \to \mathbf{R}$ is a scalar-valued function of one variable, then to have $F$ differentiable at a number $a \in X$ means precisely that the graph of the curve $y = F(x)$ has a tangent line at the point $(a, F(a))$. (See Figure 2.49.) Moreover, this tangent line is given by the equation

$$y = F(a) + F'(a)(x - a). \tag{3}$$

If we define the function $H(x)$ to be $F(a) + F'(a)(x - a)$ (i.e., $H(x)$ is the right side of equation (3) that gives the equation for the tangent line), then $H$ has two properties:

1. $H(a) = F(a)$
2. $H'(a) = F'(a)$.

In other words, the line defined by $y = H(x)$ passes through the point $(a, F(a))$ and has the same slope at $(a, F(a))$ as the curve defined by $y = F(x)$. (Hence, the term “tangent line.”)

Figure 2.49 The tangent line to $y = F(x)$ at $x = a$ has equation $y = F(a) + F'(a)(x - a)$.

Now suppose $f\colon X \subseteq \mathbf{R}^2 \to \mathbf{R}$ is a scalar-valued function of two variables, where $X$ is open in $\mathbf{R}^2$. Then the graph of $f$ is a surface. What should the tangent plane to the graph of $z = f(x,y)$ at the point $(a, b, f(a,b))$ be? Geometrically, the situation is as depicted in Figure 2.50. From our earlier observations, we know that the partial derivative $f_x(a,b)$ is the slope of the line tangent at the point $(a, b, f(a,b))$ to the curve obtained by intersecting the surface $z = f(x,y)$ with the plane $y = b$. (See Figure 2.51.) This means that if we travel along this tangent line, then for every unit change in the positive $x$-direction, there’s a change of $f_x(a,b)$ units in the $z$-direction. Hence, by using formula (1) of §1.2, the tangent line is given in vector parametric form as

$$\mathbf{l}_1(t) = (a, b, f(a,b)) + t\,(1, 0, f_x(a,b)).$$

Thus, a vector parallel to this tangent line is

$$\mathbf{u} = \mathbf{i} + f_x(a,b)\,\mathbf{k}.$$

Figure 2.50 The plane tangent to $z = f(x,y)$ at $(a, b, f(a,b))$.

Figure 2.51 The tangent plane at $(a, b, f(a,b))$ contains the lines tangent to the curves formed by intersecting the surface $z = f(x,y)$ by the planes $x = a$ and $y = b$.

Similarly, the partial derivative $f_y(a,b)$ is the slope of the line tangent at the point $(a, b, f(a,b))$ to the curve obtained by intersecting the surface $z = f(x,y)$ with the plane $x = a$. (Again see Figure 2.51.) Consequently, the tangent line is given by

$$\mathbf{l}_2(t) = (a, b, f(a,b)) + t\,(0, 1, f_y(a,b)),$$

so a vector parallel to this tangent line is

$$\mathbf{v} = \mathbf{j} + f_y(a,b)\,\mathbf{k}.$$

Both of the aforementioned tangent lines must be contained in the plane tangent to $z = f(x,y)$ at $(a, b, f(a,b))$, if one exists. Hence, a vector $\mathbf{n}$ normal to the tangent plane must be perpendicular to both $\mathbf{u}$ and $\mathbf{v}$. Therefore, we may take $\mathbf{n}$ to be

$$\mathbf{n} = \mathbf{u} \times \mathbf{v} = -f_x(a,b)\,\mathbf{i} - f_y(a,b)\,\mathbf{j} + \mathbf{k}.$$

Now, use equation (1) of §1.5 to find that the equation for the tangent plane (that is, the plane through $(a, b, f(a,b))$ with normal $\mathbf{n}$) is

$$(-f_x(a,b), -f_y(a,b), 1) \cdot (x - a,\; y - b,\; z - f(a,b)) = 0$$

or, equivalently,

$$-f_x(a,b)(x - a) - f_y(a,b)(y - b) + z - f(a,b) = 0.$$

By rewriting this last equation, we have shown the following result:

THEOREM 3.3 If the graph of $z = f(x,y)$ has a tangent plane at $(a, b, f(a,b))$, then that tangent plane has equation

$$z = f(a,b) + f_x(a,b)(x - a) + f_y(a,b)(y - b). \tag{4}$$

Note that if we define the function $h(x,y)$ to be equal to $f(a,b) + f_x(a,b)(x - a) + f_y(a,b)(y - b)$ (i.e., $h(x,y)$ is the right side of equation (4)), then $h$ has the following properties:

1. $h(a,b) = f(a,b)$
2. $\dfrac{\partial h}{\partial x}(a,b) = \dfrac{\partial f}{\partial x}(a,b)$ and $\dfrac{\partial h}{\partial y}(a,b) = \dfrac{\partial f}{\partial y}(a,b)$.

In other words, $h$ and its partial derivatives agree with those of $f$ at $(a,b)$.

It is tempting to think that the surface $z = f(x,y)$ has a tangent plane at $(a, b, f(a,b))$ as long as you can make sense of equation (4), that is, as long as the partial derivatives $f_x(a,b)$ and $f_y(a,b)$ exist. Indeed, this would be analogous to the one-variable situation where the existence of the derivative and the existence of the tangent line mean exactly the same thing. However, it is possible for a function of two variables to have well-defined partial derivatives (so that equation (4) makes sense) yet not have a tangent plane.

EXAMPLE 4 Let $f(x,y) = \bigl||x| - |y|\bigr| - |x| - |y|$ and consider the surface defined by the graph of $z = f(x,y)$ shown in Figure 2.52. The partial derivatives of $f$ at the origin may be calculated from Definition 3.2 as

$$f_x(0,0) = \lim_{h \to 0} \frac{f(0+h, 0) - f(0,0)}{h} = \lim_{h \to 0} \frac{\bigl||h|\bigr| - |h|}{h} = \lim_{h \to 0} 0 = 0$$

and

$$f_y(0,0) = \lim_{h \to 0} \frac{f(0, 0+h) - f(0,0)}{h} = \lim_{h \to 0} \frac{\bigl|-|h|\bigr| - |h|}{h} = \lim_{h \to 0} 0 = 0.$$

(Indeed, the partial functions $F(x) = f(x,0)$ and $G(y) = f(0,y)$ are both identically zero and, thus, have zero derivatives.) Consequently, if the surface in question has a tangent plane at the origin, then equation (4) tells us that it has equation $z = 0$. But there is no geometric sense in which the surface $z = f(x,y)$ has a tangent plane at the origin. If we think of a tangent plane as the geometric limit of planes that pass through the point of tangency and two other “moving” points on the surface as those two points approach the point of tangency, then Figure 2.52 shows that there is no uniquely determined limiting plane.

Figure 2.52 If two points approach $(0,0,0)$ while remaining on one face of the surface described in Example 4, the limiting plane they and $(0,0,0)$ determine is different from the one determined by letting the two points approach $(0,0,0)$ while remaining on another face.

Example 4 shows that the existence of a tangent plane to the graph of $z = f(x,y)$ is a stronger condition than the existence of partial derivatives. It turns out that such a stronger condition is more useful in that theorems from the calculus of functions of a single variable carry over to the context of functions of several variables. What we must do now is find a suitable analytic definition of differentiability that captures this idea. We begin by looking at the definition of the one-variable derivative with fresh eyes.

By replacing the quantity $a + h$ by the variable $x$, the limit equation in formula (1) may be rewritten as

$$F'(a) = \lim_{x \to a} \frac{F(x) - F(a)}{x - a}.$$

This is equivalent to the equation

$$\lim_{x \to a} \frac{F(x) - F(a)}{x - a} - F'(a) = 0.$$

The quantity $F'(a)$ does not depend on $x$ and therefore may be brought inside the limit. We thus obtain the equation

$$\lim_{x \to a} \left[ \frac{F(x) - F(a)}{x - a} - F'(a) \right] = 0.$$

Finally, some easy algebra enables us to conclude that the function $F$ is differentiable at $a$ if there is a number $F'(a)$ such that

$$\lim_{x \to a} \frac{F(x) - [F(a) + F'(a)(x - a)]}{x - a} = 0. \tag{5}$$

What have we learned from writing equation (5)? Note that the expression in brackets in the numerator of the limit expression in equation (5) is the function $H(x)$ that was used to define the tangent line to $y = F(x)$ at $(a, F(a))$. Thus, we may rewrite equation (5) as

$$\lim_{x \to a} \frac{F(x) - H(x)}{x - a} = 0.$$

For the limit above to be zero, we certainly must have that the limit of the numerator is zero. But since the limit of the denominator is also zero, we can say even more, namely, that the difference between the $y$-values of the graph of $F$ and of its tangent line must approach zero faster than $x$ approaches $a$. This is what is meant when we say that “$H$ is a good linear approximation to $F$ near $a$.” (See Figure 2.53.) Geometrically, it means that, near the point of tangency, the graph of $y = F(x)$ is approximately straight like the graph of $y = H(x)$.

Figure 2.53 If $F$ is differentiable at $a$, the vertical distance between $F(x)$ and $H(x)$ must approach zero faster than the horizontal distance between $x$ and $a$ does.

If we now pass to the case of a scalar-valued function $f(x,y)$ of two variables, then to say that $z = f(x,y)$ has a tangent plane at $(a, b, f(a,b))$ (i.e., that $f$ is differentiable at $(a,b)$) should mean that the vertical distance between the graph of $f$ and the “candidate” tangent plane given by

$$z = h(x,y) = f(a,b) + f_x(a,b)(x - a) + f_y(a,b)(y - b)$$

must approach zero faster than the point $(x,y)$ approaches $(a,b)$. (See Figure 2.54.) In other words, near the point of tangency, the graph of $z = f(x,y)$ is approximately flat just like the graph of $z = h(x,y)$. We can capture this geometric idea with the following formal definition of differentiability:

DEFINITION 3.4 Let $X$ be open in $\mathbf{R}^2$ and $f\colon X \subseteq \mathbf{R}^2 \to \mathbf{R}$ be a scalar-valued function of two variables. We say that $f$ is differentiable at $(a,b) \in X$ if the partial derivatives $f_x(a,b)$ and $f_y(a,b)$ exist and if the function

$$h(x,y) = f(a,b) + f_x(a,b)(x - a) + f_y(a,b)(y - b)$$

is a good linear approximation to $f$ near $(a,b)$, that is, if

$$\lim_{(x,y) \to (a,b)} \frac{f(x,y) - h(x,y)}{\|(x,y) - (a,b)\|} = 0.$$

Moreover, if $f$ is differentiable at $(a,b)$, then the equation $z = h(x,y)$ defines the tangent plane to the graph of $f$ at the point $(a, b, f(a,b))$. If $f$ is differentiable at all points of its domain, then we simply say that $f$ is differentiable.

Figure 2.54 If $f$ is differentiable at $(a,b)$, the distance between $f(x,y)$ and $h(x,y)$ must approach zero faster than the distance between $(x,y)$ and $(a,b)$ does.
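The quotient in Definition 3.4 can be explored numerically before working with it by hand. The sketch below evaluates the quotient along one approach path for two functions: a hypothetical smooth sample, $f(x,y) = xy$ at $(1,2)$, which is not taken from the text, and the function of Example 4 at the origin. The helper name `quotient`, the chosen path, and the step sizes are assumptions made for illustration. Sampling a single path can reveal a failure of the limit, but it cannot by itself establish that the limit is zero.

```python
import math

def quotient(f, fx, fy, a, b, x, y):
    """Return (f(x,y) - h(x,y)) / ||(x,y) - (a,b)||, where h is the
    candidate tangent plane built from the supplied partials fx, fy."""
    h = f(a, b) + fx * (x - a) + fy * (y - b)
    return (f(x, y) - h) / math.hypot(x - a, y - b)

def smooth(x, y):
    # Hypothetical sample function (not from the text); its partial
    # derivatives are y and x, so they equal 2 and 1 at the point (1, 2).
    return x * y

def folded(x, y):
    # The function of Example 4; both partials at the origin equal 0.
    return abs(abs(x) - abs(y)) - abs(x) - abs(y)

steps = [10.0 ** -k for k in range(1, 6)]

# Approach (1, 2) along the direction (t, t).
smooth_q = [quotient(smooth, 2.0, 1.0, 1.0, 2.0, 1.0 + t, 2.0 + t) for t in steps]

# Approach (0, 0) along the line y = x.
folded_q = [quotient(folded, 0.0, 0.0, 0.0, 0.0, t, t) for t in steps]

for t, qs, qf in zip(steps, smooth_q, folded_q):
    print(f"t={t:.0e}  smooth: {qs:.3e}  folded: {qf:.6f}")
```

Along this path the quotient for the smooth sample shrinks in proportion to $t$, while the quotient for the function of Example 4 stays at a fixed nonzero value, so for that function the limit in Definition 3.4 cannot be zero.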

EXAMPLE 5 Let us return to the function $f(x,y) = \bigl||x| - |y|\bigr| - |x| - |y|$ of Example 4. We already know that the partial derivatives $f_x(0,0)$ and $f_y(0,0)$ exist and equal zero. Thus, the function $h$ of Definition 3.4 is the zero function. Consequently, $f$ will be differentiable at $(0,0)$ just in case

$$\lim_{(x,y) \to (0,0)} \frac{f(x,y) - h(x,y)}{\|(x,y) - (0,0)\|} = \lim_{(x,y) \to (0,0)} \frac{f(x,y)}{\|(x,y)\|} = \lim_{(x,y) \to (0,0)} \frac{\bigl||x| - |y|\bigr| - |x| - |y|}{\sqrt{x^2 + y^2}}$$

is zero. However, it is not hard to see that the limit in question fails to exist. Along the line $y = 0$, we have

$$\frac{f(x,y)}{\|(x,y)\|} = \frac{\bigl||x| - 0\bigr| - |x| - |0|}{\sqrt{x^2}} = \frac{0}{|x|} = 0,$$

but along the line $y = x$, we have

$$\frac{f(x,y)}{\|(x,y)\|} = \frac{\bigl||x| - |x|\bigr| - |x| - |x|}{\sqrt{x^2 + x^2}} = \frac{-2|x|}{\sqrt{2}\,|x|} = -\sqrt{2}.$$

Hence, $f$ fails to be differentiable at $(0,0)$ and has no tangent plane at $(0,0,0)$.

The limit condition in Definition 3.4 can be difficult to apply in practice. Fortunately, the following result, which we will not prove, simplifies matters in many instances. Recall from Definition 2.3 that the phrase “a neighborhood of a point $P$ in a set $X$” just means an open set containing $P$ and contained in $X$.

THEOREM 3.5 Suppose $X$ is open in $\mathbf{R}^2$. If $f\colon X \to \mathbf{R}$ has continuous partial derivatives in a neighborhood of $(a,b)$ in $X$, then $f$ is differentiable at $(a,b)$.

A proof of a more general result (Theorem 3.10) is provided in the addendum to this section.

EXAMPLE 6 Let $f(x,y) = x^2 + 2y^2$. Then $\partial f/\partial x = 2x$ and $\partial f/\partial y = 4y$, both of which are continuous functions on all of $\mathbf{R}^2$. Thus, Theorem 3.5 implies that $f$ is differentiable everywhere. The surface $z = x^2 + 2y^2$ must therefore have a tangent plane at every point. At the point $(2,-1)$, for example, this tangent plane is given by the equation

$$z = 6 + 4(x - 2) - 4(y + 1)$$

(or, equivalently, by $4x - 4y - z = 6$).
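As a small check of Example 6, equation (4) can be turned into code directly. This is a sketch only: the function name `tangent_plane` and the sample points are illustrative choices, and the partial derivatives are the ones computed by hand above.

```python
def f(x, y):
    return x**2 + 2 * y**2

def f_x(x, y):
    return 2 * x

def f_y(x, y):
    return 4 * y

def tangent_plane(a, b):
    """Return the function h(x, y) = f(a,b) + f_x(a,b)(x-a) + f_y(a,b)(y-b)
    from equation (4)."""
    c, p, q = f(a, b), f_x(a, b), f_y(a, b)
    return lambda x, y: c + p * (x - a) + q * (y - b)

h = tangent_plane(2, -1)

# Compare with the equivalent form 4x - 4y - z = 6, i.e. z = 4x - 4y - 6.
samples = [(0, 0), (2, -1), (3.5, 1.25), (-1, 4)]
agree = all(abs(h(x, y) - (4 * x - 4 * y - 6)) < 1e-12 for x, y in samples)
print(h(2, -1), agree)
```

At the point of tangency the plane and the surface share the value $f(2,-1) = 6$, and the two forms of the plane's equation agree at every sample point.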

While we’re on the subject of continuity and differentiability, the next result is the multivariable analogue of a familiar theorem about functions of one variable.

THEOREM 3.6 If $f\colon X \subseteq \mathbf{R}^2 \to \mathbf{R}$ is differentiable at $(a,b)$, then it is continuous at $(a,b)$.

EXAMPLE 7 Let the function $f\colon \mathbf{R}^2 \to \mathbf{R}$ be defined by

$$f(x,y) = \begin{cases} \dfrac{x^2 y^2}{x^4 + y^4} & \text{if } (x,y) \neq (0,0), \\[1ex] 0 & \text{if } (x,y) = (0,0). \end{cases}$$

The function $f$ is not continuous at the origin, since $\lim_{(x,y) \to (0,0)} f(x,y)$ does not exist. (However, $f$ is continuous everywhere else in $\mathbf{R}^2$.) By Theorem 3.6, $f$ therefore cannot be differentiable at the origin. Nonetheless, the partial derivatives of $f$ do exist at the origin, and we have

$$f(x,0) = \frac{0}{x^4 + 0} \equiv 0 \implies \frac{\partial f}{\partial x}(0,0) = 0,$$

and

$$f(0,y) = \frac{0}{0 + y^4} \equiv 0 \implies \frac{\partial f}{\partial y}(0,0) = 0,$$

since the partial functions are constant. Thus, we see that if we want something like Theorem 3.6 to be true, the existence of partial derivatives alone is not enough.
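The behavior described in Example 7 can be observed numerically. The sketch below samples $f$ along two lines through the origin and forms the difference quotients of Definition 3.2 at the origin. The choice of the line $y = x$ as the second path and the step sizes are assumptions made for illustration; the text itself does not say which paths show that the limit fails to exist.

```python
def f(x, y):
    # The function of Example 7, with the value 0 assigned at the origin.
    if x == 0 and y == 0:
        return 0.0
    return x**2 * y**2 / (x**4 + y**4)

steps = [10.0 ** -k for k in range(1, 8)]

# Values of f as (x, y) approaches the origin along two different lines.
along_axis = [f(t, 0.0) for t in steps]   # the line y = 0
along_diag = [f(t, t) for t in steps]     # the line y = x

# Difference quotients for the two partial derivatives at the origin.
fx_quotients = [(f(h, 0.0) - f(0.0, 0.0)) / h for h in steps]
fy_quotients = [(f(0.0, h) - f(0.0, 0.0)) / h for h in steps]

print(along_axis[-1], along_diag[-1])
print(fx_quotients[-1], fy_quotients[-1])
```

The values along $y = 0$ are all 0, while those along $y = x$ stay at $1/2$, so no single limit exists at the origin. The difference quotients for both partial derivatives are nevertheless identically 0.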

Differentiability in General

It is not difficult now to see how to generalize Definition 3.4 to three (or more) variables: For a scalar-valued function of three variables to be differentiable at a point $(a,b,c)$, we must have that (i) the three partial derivatives exist at $(a,b,c)$ and (ii) the function $h\colon \mathbf{R}^3 \to \mathbf{R}$ defined by

$$h(x,y,z) = f(a,b,c) + f_x(a,b,c)(x - a) + f_y(a,b,c)(y - b) + f_z(a,b,c)(z - c)$$

is a good linear approximation to $f$ near $(a,b,c)$. In other words, (ii) means that

$$\lim_{(x,y,z) \to (a,b,c)} \frac{f(x,y,z) - h(x,y,z)}{\|(x,y,z) - (a,b,c)\|} = 0.$$

The passage from three variables to arbitrarily many is now straightforward.

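DEFINITION 3.7 Let $X$ be open in $\mathbf{R}^n$ and $f\colon X \to \mathbf{R}$ be a scalar-valued function; let $\mathbf{a} = (a_1, a_2, \ldots, a_n) \in X$. We say that $f$ is differentiable at $\mathbf{a}$ if all the partial derivatives $f_{x_i}(\mathbf{a})$, $i = 1, \ldots, n$, exist and if the function $h\colon \mathbf{R}^n \to \mathbf{R}$ defined by

$$h(\mathbf{x}) = f(\mathbf{a}) + f_{x_1}(\mathbf{a})(x_1 - a_1) + f_{x_2}(\mathbf{a})(x_2 - a_2) + \cdots + f_{x_n}(\mathbf{a})(x_n - a_n) \tag{6}$$

is a good linear approximation to $f$ near $\mathbf{a}$, meaning that

$$\lim_{\mathbf{x} \to \mathbf{a}} \frac{f(\mathbf{x}) - h(\mathbf{x})}{\|\mathbf{x} - \mathbf{a}\|} = 0.$$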

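We can use vector and matrix notation to rewrite things a bit. Define the gradient of a scalar-valued function $f\colon X \subseteq \mathbf{R}^n \to \mathbf{R}$ to be the vector

$$\nabla f(\mathbf{x}) = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right).$$

Consequently,

$$\nabla f(\mathbf{a}) = (f_{x_1}(\mathbf{a}), f_{x_2}(\mathbf{a}), \ldots, f_{x_n}(\mathbf{a})).$$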

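Alternatively, we can use matrix notation and define the derivative of $f$ at $\mathbf{a}$, denoted $Df(\mathbf{a})$, to be the row matrix whose entries are the components of $\nabla f(\mathbf{a})$; that is,

$$Df(\mathbf{a}) = \begin{bmatrix} f_{x_1}(\mathbf{a}) & f_{x_2}(\mathbf{a}) & \cdots & f_{x_n}(\mathbf{a}) \end{bmatrix}.$$

Then, by identifying the vector $\mathbf{x} - \mathbf{a}$ with the $n \times 1$ column matrix whose entries are the components of $\mathbf{x} - \mathbf{a}$, we have

$$\nabla f(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}) = Df(\mathbf{a})(\mathbf{x} - \mathbf{a}) = \begin{bmatrix} f_{x_1}(\mathbf{a}) & f_{x_2}(\mathbf{a}) & \cdots & f_{x_n}(\mathbf{a}) \end{bmatrix} \begin{bmatrix} x_1 - a_1 \\ x_2 - a_2 \\ \vdots \\ x_n - a_n \end{bmatrix}$$

$$= f_{x_1}(\mathbf{a})(x_1 - a_1) + f_{x_2}(\mathbf{a})(x_2 - a_2) + \cdots + f_{x_n}(\mathbf{a})(x_n - a_n).$$

Hence, vector notation allows us to rewrite equation (6) quite compactly as

$$h(\mathbf{x}) = f(\mathbf{a}) + \nabla f(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}).$$

Thus, to say that $h$ is a good linear approximation to $f$ near $\mathbf{a}$ in equation (6) means that

$$\lim_{\mathbf{x} \to \mathbf{a}} \frac{f(\mathbf{x}) - [f(\mathbf{a}) + \nabla f(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a})]}{\|\mathbf{x} - \mathbf{a}\|} = 0. \tag{7}$$

Compare equation (7) with equation (5). Differentiability of functions of one and several variables should really look very much the same to you. It is worth noting that the analogues of Theorems 3.5 and 3.6 hold in the case of $n$ variables.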

The gradient of a function is an extremely important construction, and we consider it in greater detail in §2.6.

You may be wondering what, if any, geometry is embedded in this general notion of differentiability. Recall that the graph of the function $f\colon X \subseteq \mathbf{R}^n \to \mathbf{R}$ is the hypersurface in $\mathbf{R}^{n+1}$ given by the equation $x_{n+1} = f(x_1, x_2, \ldots, x_n)$. (See equation (2) of §2.1.) If $f$ is differentiable at $\mathbf{a}$, then the hypersurface determined by the graph has a tangent hyperplane at $(\mathbf{a}, f(\mathbf{a}))$ given by the equation

$$x_{n+1} = h(x_1, x_2, \ldots, x_n) = f(\mathbf{a}) + \nabla f(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}) = f(\mathbf{a}) + Df(\mathbf{a})(\mathbf{x} - \mathbf{a}). \tag{8}$$

Compare equation (8) with equation (3) for the tangent line to the curve $y = F(x)$ at $(a, F(a))$. Although we cannot visualize the graph of a function of more than two variables, nonetheless, we can use vector notation to lend real meaning to tangency in $n$ dimensions.

EXAMPLE 8 Before we drown in a sea of abstraction and generalization, let’s do some concrete computation. An example of an “$n$-dimensional paraboloid” in $\mathbf{R}^{n+1}$ is given by the equation

$$x_{n+1} = x_1^2 + x_2^2 + \cdots + x_n^2,$$

that is, by the graph of the function $f(x_1, \ldots, x_n) = x_1^2 + x_2^2 + \cdots + x_n^2$. We have

$$\frac{\partial f}{\partial x_i} = 2x_i, \quad i = 1, 2, \ldots, n,$$

so that

$$\nabla f(x_1, \ldots, x_n) = (2x_1, 2x_2, \ldots, 2x_n).$$

Note that the partial derivatives of $f$ are continuous everywhere. Hence, the $n$-dimensional version of Theorem 3.5 tells us that $f$ is differentiable everywhere. In particular, $f$ is differentiable at the point $(1, 2, \ldots, n)$,

$$\nabla f(1, 2, \ldots, n) = (2, 4, \ldots, 2n),$$

and

$$Df(1, 2, \ldots, n) = \begin{bmatrix} 2 & 4 & \cdots & 2n \end{bmatrix}.$$

Thus, the paraboloid has a tangent hyperplane at the point $(1, 2, \ldots, n, 1^2 + 2^2 + \cdots + n^2)$ whose equation is given by equation (8):

$$\begin{aligned}
x_{n+1} &= (1^2 + 2^2 + \cdots + n^2) + \begin{bmatrix} 2 & 4 & \cdots & 2n \end{bmatrix} \begin{bmatrix} x_1 - 1 \\ x_2 - 2 \\ \vdots \\ x_n - n \end{bmatrix} \\
&= (1^2 + 2^2 + \cdots + n^2) + 2(x_1 - 1) + 4(x_2 - 2) + \cdots + 2n(x_n - n) \\
&= (1^2 + 2^2 + \cdots + n^2) + 2x_1 + 4x_2 + \cdots + 2n x_n - (2 \cdot 1 + 4 \cdot 2 + \cdots + 2n \cdot n) \\
&= 2x_1 + 4x_2 + \cdots + 2n x_n - (1^2 + 2^2 + \cdots + n^2) \\
&= \sum_{i=1}^{n} 2i\, x_i - \frac{n(n+1)(2n+1)}{6}.
\end{aligned}$$

(The formula $1^2 + 2^2 + \cdots + n^2 = n(n+1)(2n+1)/6$ is a well-known identity, encountered when you first learned about the definite integral. It’s straightforward to prove using mathematical induction.)
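The closed form obtained in Example 8 can be compared with equation (8) for a specific dimension. The sketch below takes $n = 4$ as an arbitrary illustrative choice; the names `tangent_hyperplane` and `closed_form` and the test point are hypothetical and not part of the text.

```python
def f(x):
    """The paraboloid function f(x_1, ..., x_n) = x_1^2 + ... + x_n^2."""
    return sum(xi**2 for xi in x)

def grad_f(x):
    """Gradient of f: (2 x_1, ..., 2 x_n)."""
    return [2 * xi for xi in x]

def tangent_hyperplane(a):
    """Return h(x) = f(a) + grad f(a) . (x - a), as in equation (8)."""
    fa, ga = f(a), grad_f(a)
    return lambda x: fa + sum(g * (xi - ai) for g, xi, ai in zip(ga, x, a))

def closed_form(x):
    """Right side of the final line of Example 8:
    sum of 2*i*x_i minus n(n+1)(2n+1)/6."""
    n = len(x)
    return sum(2 * i * xi for i, xi in enumerate(x, start=1)) - n * (n + 1) * (2 * n + 1) // 6

n = 4                          # illustrative dimension
a = list(range(1, n + 1))      # the point (1, 2, ..., n)
h = tangent_hyperplane(a)

x = [3, -1, 0, 5]              # arbitrary test point with integer entries
print(h(x), closed_form(x))
print(h(a), f(a))              # the hyperplane touches the graph at a
```

Because all quantities here are integers, the two expressions agree exactly, and the hyperplane takes the value $f(\mathbf{a}) = 30$ at the point of tangency.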

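At last we’re ready to take a look at differentiability in the most general setting of all. Let $X$ be open in $\mathbf{R}^n$ and let $\mathbf{f}\colon X \to \mathbf{R}^m$ be a vector-valued function of $n$ variables. We define the matrix of partial derivatives of $\mathbf{f}$, denoted $D\mathbf{f}$, to be the $m \times n$ matrix whose $ij$th entry is $\partial f_i/\partial x_j$, where $f_i\colon X \subseteq \mathbf{R}^n \to \mathbf{R}$ is the $i$th component function of $\mathbf{f}$. That is,

$$D\mathbf{f}(x_1, x_2, \ldots, x_n) = \begin{bmatrix}
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\[2ex]
\dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_n} \\[1ex]
\vdots & \vdots & \ddots & \vdots \\[1ex]
\dfrac{\partial f_m}{\partial x_1} & \dfrac{\partial f_m}{\partial x_2} & \cdots & \dfrac{\partial f_m}{\partial x_n}
\end{bmatrix}.$$

The $i$th row of $D\mathbf{f}$ is nothing more than $Df_i$, and the entries of $Df_i$ are precisely the components of the gradient vector $\nabla f_i$. (Indeed, in the case where $m = 1$, $\nabla f$ and $Df$ mean exactly the same thing.)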

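EXAMPLE 9 Suppose $\mathbf{f}\colon \mathbf{R}^3 \to \mathbf{R}^2$ is given by $\mathbf{f}(x,y,z) = (x \cos y + z,\; xy)$. Then we have

$$D\mathbf{f}(x,y,z) = \begin{bmatrix} \cos y & -x \sin y & 1 \\ y & x & 0 \end{bmatrix}.$$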
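The matrix in Example 9 can be cross-checked by estimating each entry $\partial f_i/\partial x_j$ with a difference quotient. This is a sketch under stated assumptions: the helper `jacobian_numeric`, its central-difference scheme, the step size, and the sample point are illustrative and not part of the text.

```python
import math

def f(x, y, z):
    """The map of Example 9, returned as a tuple of its two components."""
    return (x * math.cos(y) + z, x * y)

def jacobian_analytic(x, y, z):
    """The 2 x 3 matrix of partial derivatives stated in Example 9."""
    return [[math.cos(y), -x * math.sin(y), 1.0],
            [y, x, 0.0]]

def jacobian_numeric(func, point, h=1e-6):
    """Estimate the m x n matrix of partials by central differences."""
    m = len(func(*point))
    n = len(point)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        up = list(point)
        down = list(point)
        up[j] += h       # perturb only the j-th variable
        down[j] -= h
        f_up, f_down = func(*up), func(*down)
        for i in range(m):
            J[i][j] = (f_up[i] - f_down[i]) / (2 * h)
    return J

point = (1.5, 0.4, -2.0)   # arbitrary sample point
A = jacobian_analytic(*point)
N = jacobian_numeric(f, point)
max_diff = max(abs(A[i][j] - N[i][j]) for i in range(2) for j in range(3))
print(max_diff)
```

Row $i$ of the estimated matrix approximates the gradient of the $i$th component function, matching the description of $D\mathbf{f}$ as an $m \times n$ matrix with $m = 2$ and $n = 3$ here.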

We generalize equation (7) and Definition 3.7 in an obvious way to make the following definition:

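DEFINITION 3.8 (GRAND DEFINITION OF DIFFERENTIABILITY) Let $X \subseteq \mathbf{R}^n$ be open, let $\mathbf{f}\colon X \to \mathbf{R}^m$, and let $\mathbf{a} \in X$. We say that $\mathbf{f}$ is differentiable at $\mathbf{a}$ if $D\mathbf{f}(\mathbf{a})$ exists and if the function $\mathbf{h}\colon \mathbf{R}^n \to \mathbf{R}^m$ defined by

$$\mathbf{h}(\mathbf{x}) = \mathbf{f}(\mathbf{a}) + D\mathbf{f}(\mathbf{a})(\mathbf{x} - \mathbf{a})$$

is a good linear approximation to $\mathbf{f}$ near $\mathbf{a}$. That is, we must have

$$\lim_{\mathbf{x} \to \mathbf{a}} \frac{\|\mathbf{f}(\mathbf{x}) - \mathbf{h}(\mathbf{x})\|}{\|\mathbf{x} - \mathbf{a}\|} = \lim_{\mathbf{x} \to \mathbf{a}} \frac{\|\mathbf{f}(\mathbf{x}) - [\mathbf{f}(\mathbf{a}) + D\mathbf{f}(\mathbf{a})(\mathbf{x} - \mathbf{a})]\|}{\|\mathbf{x} - \mathbf{a}\|} = 0.$$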

Some remarks are in order. First, the reason for having the vector length appearing in the numerator in the limit equation in Definition 3.8 is so that there is a quotient of real numbers of which we can take a limit. (Definition 3.7 concerns scalar-valued functions only, so there is automatically a quotient of real numbers.) Second, the term $D\mathbf{f}(\mathbf{a})(\mathbf{x} - \mathbf{a})$ in the definition of $\mathbf{h}$ should be interpreted as the product of the $m \times n$ matrix $D\mathbf{f}(\mathbf{a})$ and the $n \times 1$ column matrix

$$\begin{bmatrix} x_1 - a_1 \\ x_2 - a_2 \\ \vdots \\ x_n - a_n \end{bmatrix}.$$

Because of the consistency of our definitions, the following results should not surprise you:

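THEOREM 3.9 If $\mathbf{f}\colon X \subseteq \mathbf{R}^n \to \mathbf{R}^m$ is differentiable at $\mathbf{a}$, then it is continuous at $\mathbf{a}$.

THEOREM 3.10 If $\mathbf{f}\colon X \subseteq \mathbf{R}^n \to \mathbf{R}^m$ is such that, for $i = 1, \ldots, m$ and $j = 1, \ldots, n$, all $\partial f_i/\partial x_j$ exist and are continuous in a neighborhood of $\mathbf{a}$ in $X$, then $\mathbf{f}$ is differentiable at $\mathbf{a}$.

THEOREM 3.11 A function $\mathbf{f}\colon X \subseteq \mathbf{R}^n \to \mathbf{R}^m$ is differentiable at $\mathbf{a} \in X$ (in the sense of Definition 3.8) if and only if each of its component functions $f_i\colon X \subseteq \mathbf{R}^n \to \mathbf{R}$, $i = 1, \ldots, m$, is differentiable at $\mathbf{a}$ (in the sense of Definition 3.7).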

The proofs of Theorems 3.9, 3.10, and 3.11 are provided in the addendum to this section. Note that Theorems 3.10 and 3.11 frequently make it a straightforward matter to check that a function is differentiable: Just look at the partial derivatives of the component functions and verify that they are continuous. Thus, in many (but not all) circumstances, we can avoid working directly with the limit in Definition 3.8.

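EXAMPLE 10 The function $\mathbf{g}\colon \mathbf{R}^3 - \{(0,0,0)\} \to \mathbf{R}^3$ given by

$$\mathbf{g}(x,y,z) = \left( \frac{3}{x^2 + y^2 + z^2},\; xy,\; xz \right)$$

has

$$D\mathbf{g}(x,y,z) = \begin{bmatrix}
\dfrac{-6x}{(x^2+y^2+z^2)^2} & \dfrac{-6y}{(x^2+y^2+z^2)^2} & \dfrac{-6z}{(x^2+y^2+z^2)^2} \\[2ex]
y & x & 0 \\
z & 0 & x
\end{bmatrix}.$$

Each of the entries of this matrix is continuous over $\mathbf{R}^3 - \{(0,0,0)\}$. Hence, by Theorem 3.10, $\mathbf{g}$ is differentiable over its entire domain.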

What Is a Derivative?

Although we have defined quite carefully what it means for a function to be differentiable, the derivative itself has really taken a “backseat” in the preceding discussion. It is time to get some perspective on the concept of the derivative.

In the case of a (differentiable) scalar-valued function of a single variable, $f\colon X \subseteq \mathbf{R} \to \mathbf{R}$, the derivative $f'(a)$ is simply a real number, the slope of the tangent line to the graph of $f$ at the point $(a, f(a))$. From a more sophisticated (and slightly less geometric) point of view, the derivative $f'(a)$ is the number such that the function

$$h(x) = f(a) + f'(a)(x - a)$$

is a good linear approximation to $f(x)$ for $x$ near $a$. (And, of course, $y = h(x)$ is the equation of the tangent line.)

If a function $f\colon X \subseteq \mathbf{R}^n \to \mathbf{R}$ of $n$ variables is differentiable, there must exist $n$ partial derivatives $\partial f/\partial x_1, \ldots, \partial f/\partial x_n$. These partial derivatives form the components of the gradient vector $\nabla f$ (or the entries of the $1 \times n$ matrix $Df$). It