Diﬀerential Calculus for Functions of Several Variables

Diﬀerential Calculus for Functions of Several

104 Diﬀerential calculus for functions of several variables

• For the norms introduced above the inequalitiesx_∞≤ x₂≤ x₁are valid;x denotes an arbitrary norm, usually the Euclidian normx₂.

• A point x is called an interior point of the set M ⊂IRⁿ if there exists a neighbourhoodU_ε(x) contained inM. The set of all interior points of M is called the interior of M and denoted by int M. A point x is called an accumulation pointof the setM if every neighbourhoodU_ε(x) contains points ofM diﬀerent fromx.

• A setM is said to beopenif intM =M, it is calledclosedif it contains any of its accumulation points.

• A set M ⊂ IRⁿ is called bounded if there exists a number C such that x ≤C for allx∈M.

Limit and continuity Sequences of points

Apoint sequence {x_k} ⊂IRⁿ is a mapping from IN to IRⁿ. The components of the elementsx_k of the sequence are denoted byx_i⁽^k⁾,i= 1, . . . , n.

x= lim

k→∞xk ⇐⇒ lim

k→∞xk−x= 0 – convergence of the point sequence {xk} to the limit pointx

• A point sequence{x_k} converges to the limit point xif and only if any sequence{x_i⁽^k⁾},i= 1, . . . , n, converges to thei-th componentx_i ofx.

Continuity

A number a∈IR is called thelimit of the function f at the point x₀ if for any point sequence{xk} converging tox₀ such thatxk =x₀ andxk ∈D_f the relation lim

k→∞f(x_k) =ais true. Notation: lim x→x0

f(x) =a.

• A functionf is calledcontinuous at the pointx₀∈D_fif it has a limit atx₀ (i. e. if for any point sequence converging tox₀the sequence of corresponding function values converges to one and the same value) and this value is equal to the function value atx₀:

xlim→x0

f(x) =f(x₀) ⇐⇒ lim

k→∞f(x_k) =f(x₀) ∀ {x_k} with x_k→x₀

• Equivalent formulation:f iscontinuous at the pointx₀ if, for any number ε > 0, there exists a number δ >0 such that |f(x)−f(x₀)|< ε provided thatx−x₀< δ.

• If a function f is continuous for allx ∈D_f, then it is calledcontinuous onD_f.

Diﬀerentiation of functions of several variables 105

• If the functions f and g are continuous on their domains D_f and D_g, respectively, then the functionsf±g,f·g, andf

g are continuous onD_f∩D_g, the latter being continuous only for valuesxwithg(x)= 0.

Homogeneous functions

f(λx₁, . . . , λx_n) =λ^α·f(x₁, . . . , x_n) ∀λ≥0

– f homogeneous of degreeα≥0 f(x₁, . . . , λxi, . . . , xn) =λ^αⁱf(x₁, . . . , xn) ∀λ≥0

– f partially homogeneous of degreeα_i≥0 α= 1: linearly homogeneous

α >1: superlinearly homogeneous α <1: sublinearly homogeneous

• For linearly homogeneous functions a proportional increase of variables causes a proportional increase of the function value. This is the reason why these functions are also called CES (= constant elasticity of substitution) functions.

Diﬀerentiation of functions of several variables Notion of diﬀerentiability

The functionf :D_f →IR, D_f ⊂IRⁿ, is called (totally) diﬀerentiable at the pointx₀if there exists a vectorg(x₀)such that

lim

∆x_→0

f(x₀+∆x)−f(x₀)−g(x₀) ∆x

∆x = 0

• If such a vectorg(x₀)exists, then it is called thegradient and denoted by

∇f(x₀) or gradf(x₀). The functionf is said to bediﬀerentiable onD_f if it is diﬀerentiable at all pointsx∈D_f.

Partial derivatives

If forf : D_f → IR, D_f ⊂IRⁿ, at the point x₀ = (x⁰₁, . . . , x⁰_n) there exists the limit lim

∆xi→0

f(x⁰₁, . . . , x⁰_i₋₁, x⁰_i +∆x_i, x⁰_i₊₁, . . . , x⁰_n)−f(x⁰₁, . . . , x⁰_n)

∆x_i ,

then it is called the (ﬁrst-order) partial derivative of the function f with respect to the variablexi at the pointx₀ and is denoted by ∂f

∂x_i

x=x0

, ∂y

∂x_i, f_x_i(x₀), or∂_x_if.

106 Diﬀerential calculus for functions of several variables

• If the function f has partial derivatives with respect to all variables at every point x ∈ D_f, then f is called partially diﬀerentiable. In the case if all partial derivatives are continuous functions, f is said to be continuously partially diﬀerentiable.

• When calculating the partial derivatives, all variables to which we do not differentiate are considered as constant. Then the corresponding rules of differentiation for functions of one variable (especially the rules for the differentiation of a constant summand and a constant factor, p. 64, 65) are to be applied.

Gradient

If the functionf :D_f→IR,D_f ⊂IRⁿ, is continuously partially diﬀerentiable onD_f, then it is also totally diﬀerentiable there, where the gradient is the column vector formed from the partial derivatives:

∇f(x) =

∂f(x)

∂x₁ , . . . , ∂f(x)

∂x_n

– gradient of the function f at the pointx(also denoted by gradf(x))

• If the functionf is totally diﬀerentiable, then for thedirectional derivative

f(x;r) = lim

t↓0

f(x+tr)−f(x) t

(which exists in this case for arbitrary directionsr∈IRⁿ), the representation f(x;r) =∇f(x) r holds, and∇f(x) is the direction of steepest ascent of f at the pointx.

• The gradient∇f(x₀) is orthogonal to the level line off to the levelf(x₀), so that (forn= 2) the tangent to the level line or (forn >2) the tangential (hyper)plane to the set{x|f(x) =f(x₀)}at the point x₀ has the equation

∇f(x₀) (x−x₀) = 0. Directional derivatives in tangential direction to a level line (forn= 2) have the value zero, so that in linear approximation the function value is constant in these directions.

Chain rule

Let the functions u_k = g_k(x₁, . . . , x_n), k = 1, . . . , m of n variables as well as the function f of m variables be totally diﬀerentiable at the points x = (x₁, . . . , x_n) and u = (u₁, . . . , u_m) , respectively. Then the compos- ite function F(x₁, . . . , x_n) =f(g₁(x₁, . . . , x_n), . . . , g_m(x₁, . . . , x_n)) is totally diﬀerentiable at the pointx, where

Diﬀerentiation of functions of several variables 107

∇F(x) =G(x) ∇f(u) ⇐⇒

⎛

⎜⎝ Fx₁(x)

... Fxn(x)

⎞

⎟⎠=

⎛

⎜⎝

∂_x₁g₁(x) . . . ∂_x₁g_m(x) . . . .

∂_x_ng₁(x) . . . ∂_x_ng_m(x)

⎞

⎟⎠

⎛

⎜⎝ fu₁(u)

... fum(u)

⎞

⎟⎠

∂F(x)

∂xi

= m k=1

∂f

∂uk

(g(x))·∂g_k

∂xi

(x) – componentwise notation

Special case m=n= 2; function f(u, v) withu=u(x, y),v=v(x, y):

∂f

∂x = ∂f

∂u· ∂u

∂x+∂f

∂v ·∂v

∂x

∂f

∂y = ∂f

∂u ·∂u

∂y +∂f

∂v ·∂v

∂y

• The matrix G(x) is called thefunctional matrix or Jacobian matrix of the system of functions{g₁, . . . , gm}.

Higher partial derivatives

The partial derivatives are again functions and thus have (under suitable assumptions) partial derivatives.

∂²f(x)

∂xi∂xj

=f_x_i_x_j(x) = ∂

∂xj

∂f(x)

∂xi

– second-order partial derivatives

∂³f(x)

∂x_i∂x_j∂x_k =f_x_i_x_j_x_k(x) = ∂

∂x_k

∂²f(x)

∂x_i∂x_j

– third-order partial derivatives

Schwarz’s theorem (on commutativity of diﬀerentiation). If the partial derivativesfxixj andfxjxi are continuous in a neighbourhood of the pointx, then the following relations hold: fxixj(x) =fxjxi(x).

• Generalization: If the partial derivatives ofk-th order exist and are continuous, then the order of diﬀerentiation does not play any role when calculating the partial derivatives.

Hessian matrix

H_f(x) =

⎛

⎜⎜

⎝

f_x₁_x₁(x) f_x₁_x₂(x) . . . f_x₁_x_n(x) f_x₂_x₁(x) f_x₂_x₂(x) . . . f_x₂_x_n(x) . . . . f_x_n_x₁(x) f_x_n_x₂(x) . . . f_x_n_x_n(x)

⎞

⎟⎟

⎠ –

Hessian matrix of the twice partially diﬀer- entiable functionf at the pointx

• Under the assumptions of Schwarz’s theorem the Hessian matrix is sym- metric.

108 Diﬀerential calculus for functions of several variables

Total diﬀerential

If the function f :D_f →IR, D_f ⊂IRⁿ, is totally diﬀerentiable at the point x₀ ( p. 105), then the following relation holds:

∆f(x₀) =f(x₀+∆x)−f(x₀) =∇f(x₀) ∆x+o(∆x)

Here o(·) is Landau’s symbol with the property lim

∆x→0

o(∆x) ∆x = 0.

The total diﬀerential of the functionf at the pointx₀

∇f(x₀) ∆x= ∂f

∂x₁(x₀) dx₁+. . .+ ∂f

∂x_n(x₀) dx_n

describes the main increase of the function value if the increment of the n components of the independent variables is dx_i,i= 1, . . . , n(linear approximation); dx_i – diﬀerentials,∆x_i – (small) ﬁnite increments:

∆f(x)≈ⁿ

i=1

∂f

∂xi

(x)·∆x_i

Equation of the tangent plane

If the functionf :D_f →IR,D_f ⊂IRⁿ, is diﬀerentiable at the pointx₀, then its graph possesses atangent (hyper)plane at (x₀, f(x₀)) (linear approximation), which has the equation

∇f(x₀)

−1

x−x₀ y−f(x₀)

= 0 or y=f(x₀) +∇f(x₀) (x−x₀).

Partial elasticities

If the functionf :D_f →IR,D_f ⊂IRⁿ, is partially diﬀerentiable, then the di- mensionless quantityε_f,x_i(x) (partial elasticity) describes approximately the relative increase of the function value dependent from the relative increment of thei-th componentx_i:

ε_f,x_i(x) =f_x_i(x) x_i f(x)

i-th partial elasticity of the function f at the point x

Unconstrained extreme value problems 109

Relations involving partial elasticities n

i=1

x_i·∂f(x)

∂x_i =α·f(x₁, . . . , x_n) – Euler’s homogeneity relation;

f homogeneous of degreeα ε_f,x₁(x) +. . .+ε_f,x_n(x) =α – sum of partial elasticities

= degree of homogeneity

ε(x) =

⎛

⎜⎜

⎝

ε_f₁_,x₁(x) . . . ε_f₁_,x_n(x) ε_f₂_,x₁(x) . . . ε_f₂_,x_n(x) . . . . ε_f_m_,x₁(x) . . . ε_f_m_,x_n(x)

⎞

⎟⎟

⎠ – matrix of elasticities of the functionsf₁, . . . , f_m

• The quantities ε_f_i_,x_j(x) are called direct elasticities for i = j and cross elasticities fori=j.

Unconstrained extreme value problems

Given a suﬃciently often (partially) diﬀerentiable function f : D_f → IR, D_f ⊂IRⁿ. Find local extreme points x₀ of f (p. 46); assume that x₀ is an interior point ofD_f.

Necessary conditions for extrema

x₀local extreme point =⇒ ∇f(x₀) =0 ⇐⇒ f_x_i(x₀) = 0, i= 1, . . . , n x₀local minimum point =⇒ ∇f(x₀) =0∧H_f(x₀) positive semideﬁnite x₀local maximum point =⇒ ∇f(x₀) =0∧Hf(x₀) negative semideﬁnite

• Points x₀ with ∇f(x₀) = 0 are called stationary points of the function f. If in any neighbourhood of the stationary pointx₀ there are pointsx, y such thatf(x)< f(x₀)< f(y), then x₀ is said to be asaddle point of the functionf. A saddle point fails to be an extreme point.

• Boundary points ofD_fand points where the functionf is nondiﬀerentiable are to be considered separately (e. g. by analysing the function values of points in a neighbourhood ofx₀). For the notion of (semi-) deﬁniteness of a matrix p. 121.

Suﬃcient conditions for extrema

∇f(x₀) =0 ∧ Hf(x₀) positive deﬁnite =⇒ x₀ local minimum point

∇f(x₀) =0 ∧ H_f(x₀) negative deﬁnite =⇒ x₀ local maximum point

∇f(x₀) =0 ∧ H_f(x₀) indeﬁnite =⇒ x₀ saddle point

110 Diﬀerential calculus for functions of several variables

Special case n= 2 f(x) =f(x₁, x₂):

∇f(x₀) =0 ∧ A>0 ∧ f_x₁_x₁(x₀)>0 =⇒ x₀local minimum point

∇f(x₀) =0 ∧ A>0 ∧ f_x₁_x₁(x₀)<0 =⇒ x₀local maximum point

∇f(x₀) =0 ∧ A<0 =⇒ x₀saddle point

Here A = detH_f(x₀) = f_x₁_x₁(x₀)·f_x₂_x₂(x₀)−[f_x₁_x₂(x₀)]². For A = 0 a statement concerning the kind of the stationary pointx₀ cannot be made.

Constrained extrem value problems

Given the once or twice continuously (partially) diﬀerentiable functionsf : D→IR, g_i:D→IR, i= 1, . . . , m < n,D⊂IRⁿ, and letx= (x₁, . . . , x_n) . Find the local extreme points of the constrained extreme value problem

f(x) −→ max/min

g₁(x) = 0, . . . , g_m(x) = 0 (C)

• The setG={x∈D|g₁(x) = 0, . . . , g_m(x) = 0}is called theset of feasible points of problem (C).

• Let the regularity condition rank G = m be satisﬁed, where the m× n-matrix G denotes the functional matrix of the system of functions {g₁, . . . , g_m}, and the m linearly independent columns of G are numbered byi₁, . . . , i_m, the remaining columns beingi_m₊₁, . . . , i_n.

Elimination method

1. Eliminate the variablesxij,j= 1, . . . , m, from the constraintsgi(x) = 0, i= 1, . . . , m, of the problem (C): xij = ˜gij(xim+1, . . . , xin) .

2. Substitutex_i_j,j= 1, . . . , m, in the functionf:f(x) = ˜f(x_i_m₊₁, . . . , x_i_n).

3. Find the stationary points of ˜f (havingn−mcomponents) and deter- mine the kind of extremum ( conditions on p. 109).

4. Calculate the remainingmcomponents xij,j= 1, . . . , m, according to 1. in order to obtain stationary points of (C).

• All statements concerning the kind of extrema of ˜f remain in force with respect to problem (C).

Constrained extreme value problems 111

Lagrange’s method of multipliers

1. Assign to any constraintg_i(x) = 0 one (for the time being unknown) Lagrange multiplier λ_i ∈IR,i= 1, . . . , m.

2. Write down the Lagrange function associated with (C), where λ = (λ₁, . . . , λ_m) :

L(x,λ) =f(x) + m i=1

λ_ig_i(x) .

3. Find the stationary points (x₀,λ₀) of the functionL(x,λ) with respect to variablesx and λ from the (generally speaking, nonlinear) system of equations

Lxi(x,λ) = 0, i= 1, . . . , n; Lλi(x,λ) =gi(x) = 0, i= 1, . . . , m The pointsx₀ are then stationary for (C).

4. If then×n-matrix∇xx² L(x₀,λ₀) (x-part of the Hessian ofL) is positive deﬁnite over the setT={z∈IRⁿ | ∇g_i(x₀) z= 0, i= 1, . . . , m}, i. e.

z ∇xx² L(x₀,λ₀)z>0 ∀z ∈T, z=0,

thenx₀ yields a local minimum point for (C). In case of negative deﬁ- niteness of∇xx² L(x₀,λ₀),x₀is a local maximum point.

Economic interpretation of Lagrange multipliers Let the extreme pointx₀ of the (perturbed) problem

f(x)→max/min;

g_i(x)−b_i= 0, i= 1, . . . , m (C_b)

be unique for b = b₀, and let λ₀ = (λ⁰₁, . . . , λ⁰_m) be the vector of La- grange multipliers associated with x₀. Let, in addition, the regularity condition rankG =m (see p. 110) be fulﬁlled. Finally, let f^∗(b) denote the optimal value of problem (C_b) depending on the vector of the right-hand side b= (b₁, . . . , b_m) . Then

∂f^∗

∂b_i(b₀) =−λ⁰_i,

i. e., −λ⁰_i describes (approximately) the inﬂuence of the i-th component of the right-hand side on the change of the optimal value of problem (C_b).

112 Diﬀerential calculus for functions of several variables

Least squares method

Given the pairs (xi, yi), i = 1, . . . , N (xi – points of measurement or time, yi – measured values). Find a trend (or ansatz) function y = f(x,a) ap- proximating the measured values as good as possible, where the vectora= (a₁, . . . , a_M) contains theM parame- ters of the ansatz function to be deter-

mined in an optimal manner. x

r r r r

r r r

f(x,a) =a₁+a₂x

• The symbols [z_i] = N i=1

z_iare denoted asGaussian brackets.

S= N i=1

(f(xi,a)−yi)²−→min – error sum of squares to be minimized

N i=1

(f(x_i,a)−y_i)·∂f(x_i,a)

∂aj

= 0 – necessary conditions of minima (normal equations),j= 1,2, . . . , M

• The minimum conditions result from the relations _∂a^∂S

j = 0 and depend on the concrete form of the ansatz functionf. More general ansatz functions of the kindf(x,a) withx= (x₁, . . . , x_n) lead to analogous equations.

Some types of ansatz functions

f(x, a₁, a₂) =a₁+a₂x – linear function f(x, a₁, a₂, a₃) =a₁+a₂x+a₃x² – quadratic function f(x,a) =

M j=1

a_j·g_j(x) – generalized linear function

• In the above cases alinear system of normal equationsis obtained:

linear ansatz function quadratic ansatz function a₁·N +a₂·[x_i] = [y_i]

a₁·[x_i] +a₂·[x_i²] = [x_iy_i]

a₁·N +a₂·[x_i] +a₃·[x_i²] = [y_i] a₁·[x_i] +a₂·[x_i²] +a₃·[x_i³] = [x_iy_i] a₁·[x_i²] +a₂·[x_i³] +a₃·[x_i⁴] = [x_i²y_i] Explicit solution for linear ansatz functions

a₁= [x_i²]·[y_i]−[x_iy_i]·[x_i]

N·[x_i²]−[xi]² , a₂= N·[x_iy_i]−[x_i]·[y_i] N·[x_i²]−[xi]²

Propagation of errors 113

Simpliﬁcations

• By means of the transformation x_i = x_i−_N¹[x_i] the system of normal equations can be simpliﬁed since in this case [x_i] = 0.

• For the exponential ansatz y = f(x) = a₁·e^a²^x the transformation T(y) = lny leads (forf(x)>0) to a linear system of normal equations.

• For thelogistic functionf(x) =a·(1 +be⁻^cx)⁻¹(a, b, c >0) with known a the transformation ^a_y = be⁻^cx =⇒ Y = ln^a⁻_y^y = lnb−cx leads to a linear system of normal equations, when settinga₁= lnb,a₂=−c.

Propagation of errors

The propagation of errors investigates the inﬂuence of errors of the independent variables of a function on the result of function value calculation.

Notation

exact values – y, x₁, . . . , x_n withy=f(x) =f(x₁, . . . , x_n) approximate values – ˜y,x˜₁, . . . ,x˜n, where ˜y=f(˜x) =f(˜x₁, . . . ,x˜n) absolute errors – δy= ˜y−y, δx_i= ˜x_i−x_i, i= 1, . . . , n absolute error bounds – |δy| ≤∆y, |δx_i| ≤∆x_i, i= 1, . . . , n relative errors – δy

y , δx_i

x_i , i= 1, . . . , n relative error bounds –

δy y

≤∆y

|y|, δx_i

≤∆x_i

|xi|, i= 1, . . . , n

• If the function f is totally diﬀerentiable, then for the propagation of the errorsδx_iof the independent variables onto the absolute error of the function f one has:

∆y≈ ∂f(˜x)

∂x₁

∆x₁+. . .+ ∂f(˜x)

∂x_n ∆x_n

−bound for the absolute error off(˜x)

∆y

|y| ≈ x˜₁

y ·∂f(˜x)

∂x₁

· ∆x₁

|x₁| +. . .+ x˜n

y ·∂f(˜x)

∂x_n

·∆xn

|x_n|

−bound for the relative error off(˜x)

114 Diﬀerential calculus for functions of several variables

Economic applications

Cobb-Douglas production function

y=f(x) =c·xâ₁¹·xâ₂²·. . .·xâ_nⁿ x_i – input of thei-th factor

(c, ai, xi≥0) y – output

• The Cobb-Douglas function is homogeneous of degreer=a₁+. . .+a_n. Due to the relationf_x_i(x) =_x^aⁱ

if(x), i. e.ε_f,x_i(x) =a_i, the factor powersa_i are also referred to as(partial) production elasticities.

Marginal rate of substitution

When considering the level line to a production functiony=f(x₁, . . . , x_n) with respect to the levely₀(isoquant), one can ask the question by how many units the variablex_ihas (approximately) to be changed in order to substitute one unit of the k-th input factor under equal output and unchanged values of the remaining variables. Under certain assumptions an implicit function xk =ϕ(xi) is deﬁned (implicit function), the derivative of which is denoted as themarginal rate of substitution:

ϕ(xi) =−f_x_i(x) f_x_k(x)

marginal rate of substitution (of the factorkby the factori) Sensitivity of the price of a call option

The Black-Scholes formula P_call=P·Φ(d₁)−S·e⁻^iT ·Φ(d₂) withd₁= ¹

σ√ T

ln^P_S +T· i+^σ²

andd₂=d₁−σ√

T describes the price P_callof a call option on a share in dependence on the inputsP (actual price of the underlying share),S(strike price),i(riskless rate of interest, continuously compounded),T (remaining term of the option),σ² (variance per period of the share yield), where Φ is the distribution function of the standardized normal distribution andϕits density:ϕ(x) = √¹

2π·e⁻^x22 .

The change of the call price under a change ∆x_i of the i-th input (while keeping the remaining inputs ﬁxed) can be estimated with the help of the partial diﬀerential ∂P_call

∂x_i ·∆xi, where e. g.

∆=∂P_call

∂P =Φ(d₁)>0 – Delta; sensitivity of the call price w.r.t.

a change of share priceP Λ= P_call

∂σ =P·ϕ(d₁)·√

T >0 – Lambda; sensitivity of the call price w.r.t. a change of volatilityσ

Dalam dokumen Bernd Luderer · Volker Nollau Klaus Vetters (Halaman 111-123)