

Now $\det(A) > 0$ by (4.3.2) and $\det(L_{n-1}) > 0$. Therefore, from (4.3.4), there exists exactly one $\alpha > 0$ giving $LL^H = A$.

The decomposition $A = LL^H$ can be determined in a manner similar to the methods given in Section 4.1. If it is assumed that all $l_{ij}$ are known for $j \le k-1$, then as defining equations for $l_{kk}$ and $l_{ik}$, $i \ge k+1$, we have from $A = LL^H$

(4.3.5)    $a_{kk} = |l_{k1}|^2 + |l_{k2}|^2 + \cdots + |l_{kk}|^2,$

           $a_{ik} = l_{i1}\bar{l}_{k1} + l_{i2}\bar{l}_{k2} + \cdots + l_{ik}\bar{l}_{kk}.$

For a real A, the following algorithm results:

for i := 1 step 1 until n do
for j := i step 1 until n do
begin x := a[i, j];
  for k := i - 1 step -1 until 1 do
    x := x - a[j, k] × a[i, k];
  if i = j then
    begin if x ≤ 0 then goto fail;
      p[i] := 1/sqrt(x)
    end
  else a[j, i] := x × p[i]
end i, j;

Note that only the information from the upper triangular portion of A is used. The lower triangular matrix L is stored in the lower triangular portion of A, with the exception of the diagonal elements of L, whose reciprocals are stored in p.

This method is due to Cholesky. During the course of the computation, n square roots must be taken. Theorem (4.3.3) assures us that the arguments of these square roots will be positive. About $n^3/6$ operations (multiplications and additions) are needed beyond the n square roots. Finally, note as an important implication of (4.3.5) that

(4.3.6)    $|l_{kj}| \le \sqrt{a_{kk}}, \qquad j = 1, \ldots, k, \quad k = 1, \ldots, n.$

That is, the elements of L cannot grow too large.
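To make the procedure easy to experiment with, here is a minimal Python sketch (an addition, not from the text) that mirrors the ALGOL program above: only the upper triangle of the array is read, the off-diagonal elements of L overwrite the strictly lower triangle, and the reciprocals of the diagonal elements of L are collected in a vector p. The function name and the test matrix are illustrative choices.

import numpy as np

def cholesky_in_place(a):
    """Cholesky decomposition A = L L^T of a real symmetric positive definite
    matrix, following the storage scheme of the ALGOL program above."""
    a = np.array(a, dtype=float)   # work on a copy
    n = a.shape[0]
    p = np.zeros(n)                # reciprocals of the diagonal of L
    for i in range(n):
        for j in range(i, n):
            x = a[i, j]            # only the upper triangle of A is read
            for k in range(i):
                x -= a[j, k] * a[i, k]
            if i == j:
                if x <= 0.0:       # cannot happen for positive definite A
                    raise ValueError("matrix is not positive definite")
                p[i] = 1.0 / np.sqrt(x)
            else:
                a[j, i] = x * p[i] # l_ji is stored below the diagonal
    return a, p

# usage: rebuild L and check that L L^T reproduces A
A = np.array([[4.0, 2.0, 2.0],
              [2.0, 5.0, 3.0],
              [2.0, 3.0, 6.0]])
a, p = cholesky_in_place(A)
L = np.tril(a, -1) + np.diag(1.0 / p)
assert np.allclose(L @ L.T, A)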

4.4 Error Bounds

For an approximate solution $\tilde{x}$ of a system of linear equations $Ax = b$ there arises the question of how the accuracy of $\tilde{x}$ is judged. In order to measure the error $\tilde{x} - x$, we have to have a means of measuring the "size" of a vector. To do this, a norm

(4.4.1)    $\|x\|$

is introduced on $\mathbb{C}^n$; that is, a function $\|\cdot\| : \mathbb{C}^n \to \mathbb{R}$ which assigns to each vector $x \in \mathbb{C}^n$ a real value $\|x\|$ serving as a measure for the "size" of $x$. The function must have the following properties:

(4.4.2)
    (a) $\|x\| > 0$ for all $x \in \mathbb{C}^n$, $x \ne 0$ (positivity),
    (b) $\|\alpha x\| = |\alpha|\, \|x\|$ for all $\alpha \in \mathbb{C}$, $x \in \mathbb{C}^n$ (homogeneity),
    (c) $\|x + y\| \le \|x\| + \|y\|$ for all $x, y \in \mathbb{C}^n$ (triangle inequality).

In the following we use only the norms

(4.4.3)    $\|x\|_2 := \sqrt{x^H x} = \sqrt{\textstyle\sum_{i=1}^{n} |x_i|^2}$  (Euclidean norm),

           $\|x\|_\infty := \max_i |x_i|$  (maximum norm).

The norm properties (a), (b), (c) are easily verified.

For each norm $\|\cdot\|$ the inequality

(4.4.4)    $\|x - y\| \ge \big|\, \|x\| - \|y\| \,\big|$ for all $x, y \in \mathbb{C}^n$

holds. From (4.4.2c) it follows that

$\|x\| = \|(x - y) + y\| \le \|x - y\| + \|y\|,$

and consequently $\|x - y\| \ge \|x\| - \|y\|$. By interchanging the roles of $x$ and $y$ and using (4.4.2b), it follows that

$\|x - y\| = \|y - x\| \ge \|y\| - \|x\|,$

and hence (4.4.4).
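As a small numerical illustration (an addition, not part of the text), the following Python sketch evaluates the two norms of (4.4.3) for a pair of sample vectors and checks the triangle inequality (4.4.2c) and the inequality (4.4.4); the vectors are arbitrary.

import numpy as np

x = np.array([3.0, -4.0, 1.0])
y = np.array([1.0,  2.0, -2.0])

euclidean = lambda v: np.sqrt(np.sum(np.abs(v) ** 2))   # ||v||_2, cf. (4.4.3)
maximum   = lambda v: np.max(np.abs(v))                 # ||v||_inf, cf. (4.4.3)

for norm in (euclidean, maximum):
    assert norm(x + y) <= norm(x) + norm(y) + 1e-12       # triangle inequality (4.4.2c)
    assert norm(x - y) >= abs(norm(x) - norm(y)) - 1e-12  # inequality (4.4.4)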

It is easy to establish the following:

(4.4.5) Theorem. Each norm $\|\cdot\|$ on $\mathbb{R}^n$ (or $\mathbb{C}^n$) is a uniformly continuous function with respect to the metric $\rho(x, y) = \max_i |x_i - y_i|$ on $\mathbb{R}^n$ ($\mathbb{C}^n$).

PROOF. From (4.4.4) it follows that

$\big|\, \|x + h\| - \|x\| \,\big| \le \|h\|.$

Now $h = \sum_{i=1}^{n} h_i e_i$, where $h = (h_1, \ldots, h_n)^T$ and the $e_i$ are the usual coordinate (unit) vectors of $\mathbb{R}^n$ ($\mathbb{C}^n$). Therefore

$\|h\| \le \sum_{i=1}^{n} |h_i|\, \|e_i\| \le \max_i |h_i| \sum_{j=1}^{n} \|e_j\| = M \max_i |h_i|$

with $M := \sum_{j=1}^{n} \|e_j\|$. Hence, for each $\varepsilon > 0$ and all $h$ satisfying $\max_i |h_i| \le \varepsilon / M$, the inequality

$\big|\, \|x + h\| - \|x\| \,\big| \le \varepsilon$

holds. That is, $\|\cdot\|$ is uniformly continuous. □

This result is used to show:

(4.4.6) Theorem. All norms on $\mathbb{R}^n$ ($\mathbb{C}^n$) are equivalent in the following sense: for each pair of norms $p_1(x)$, $p_2(x)$ there are positive constants $m$ and $M$ satisfying

$m\, p_2(x) \le p_1(x) \le M\, p_2(x)$ for all $x$.

PROOF. We will prove this only in the case that $p_2(x) := \|x\|_\infty = \max_i |x_i|$. The general case follows easily from this special result. The set

$S = \{\, x \in \mathbb{C}^n \mid \max_i |x_i| = 1 \,\}$

is a compact set in $\mathbb{C}^n$. Since $p_1(x)$ is continuous by (4.4.5), $M := \max_{x \in S} p_1(x) > 0$ and $m := \min_{x \in S} p_1(x) > 0$ exist. Thus, for all $y \ne 0$, since $y / \|y\|_\infty \in S$, it follows that

$m \le p_1\!\left( \frac{y}{\|y\|_\infty} \right) = \frac{1}{\|y\|_\infty}\, p_1(y) \le M,$

and therefore $m \|y\|_\infty \le p_1(y) \le M \|y\|_\infty$. □
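For the concrete pair $p_1 = \|\cdot\|_2$, $p_2 = \|\cdot\|_\infty$ on $\mathbb{R}^n$, the constants of Theorem (4.4.6) can be taken as $m = 1$ and $M = \sqrt{n}$. The following sketch (an added illustration, not from the text) checks these inequalities on random vectors.

import numpy as np

rng = np.random.default_rng(0)
n = 7
for _ in range(1000):
    x = rng.standard_normal(n)
    norm_2   = np.sqrt(np.sum(x ** 2))   # p1(x) = ||x||_2
    norm_inf = np.max(np.abs(x))         # p2(x) = ||x||_inf
    # m * p2(x) <= p1(x) <= M * p2(x) with m = 1, M = sqrt(n)
    assert norm_inf <= norm_2 <= np.sqrt(n) * norm_inf + 1e-12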

For matrices as well, $A \in M(m, n)$ of fixed dimensions, norms $\|A\|$ can be introduced. In analogy to (4.4.2), the properties

$\|A\| > 0$ for all $A \ne 0$, $A \in M(m, n)$,
$\|\alpha A\| = |\alpha|\, \|A\|$,
$\|A + B\| \le \|A\| + \|B\|$

are required. The matrix norm $\|\cdot\|$ is said to be consistent with the vector norms $\|\cdot\|_a$ on $\mathbb{C}^n$ and $\|\cdot\|_b$ on $\mathbb{C}^m$ if

$\|Ax\|_b \le \|A\|\, \|x\|_a$ for all $x \in \mathbb{C}^n$, $A \in M(m, n)$.

A matrix norm $\|\cdot\|$ for square matrices $A \in M(n, n)$ is called submultiplicative if

$\|AB\| \le \|A\|\, \|B\|$ for all $A, B \in M(n, n)$.

Frequently used matrix norms are

(4.4.7a)    $\|A\| = \max_i \sum_{k=1}^{n} |a_{ik}|$  (row-sum norm),

(4.4.7b)    $\|A\| = \left( \sum_{i,k=1}^{n} |a_{ik}|^2 \right)^{1/2}$  (Schur norm),

(4.4.7c)    $\|A\| = \max_{i,k} |a_{ik}|$.

(a) and (b) are submultiplicative; (c) is not; (b) is consistent with the Euclidean vector norm. Given a vector norm $\|x\|$, a corresponding matrix norm for square matrices, the subordinate matrix norm, can be defined by

(4.4.8)    $\operatorname{lub}(A) := \max_{x \ne 0} \frac{\|Ax\|}{\|x\|}.$

Such a matrix norm is consistent with the vector norm $\|\cdot\|$ used to define it:

(4.4.9)    $\|Ax\| \le \operatorname{lub}(A)\, \|x\|.$

Obviously $\operatorname{lub}(A)$ is the smallest of all the matrix norms $\|A\|$ which are consistent with the vector norm $\|x\|$:

$\|Ax\| \le \|A\|\, \|x\|$ for all $x$ $\;\Longrightarrow\;$ $\operatorname{lub}(A) \le \|A\|.$

Each subordinate norm $\operatorname{lub}(\cdot)$ is submultiplicative:

$\operatorname{lub}(AB) = \max_{x \ne 0} \frac{\|ABx\|}{\|x\|} = \max_{x \ne 0} \frac{\|A(Bx)\|}{\|Bx\|} \cdot \frac{\|Bx\|}{\|x\|} \le \max_{y \ne 0} \frac{\|Ay\|}{\|y\|} \cdot \max_{x \ne 0} \frac{\|Bx\|}{\|x\|} = \operatorname{lub}(A)\operatorname{lub}(B),$

and furthermore $\operatorname{lub}(I) = \max_{x \ne 0} \|Ix\| / \|x\| = 1$.

(4.4.9) shows that $\operatorname{lub}(A)$ is the greatest magnification which a vector may attain under the mapping determined by $A$: it shows how much $\|Ax\|$, the norm of an image point, can exceed $\|x\|$, the norm of a source point.

EXAMPLE.

(a) For the maximum norm $\|x\|_\infty = \max_\nu |x_\nu|$ the subordinate matrix norm is the row-sum norm

$\operatorname{lub}(A) = \max_i \sum_{k=1}^{n} |a_{ik}|.$

(b) Associated with the Euclidean norm $\|x\|_2 = \sqrt{x^H x}$ we have the subordinate matrix norm

$\operatorname{lub}(A) = \sqrt{\lambda_{\max}(A^H A)},$

which is expressed in terms of the largest eigenvalue $\lambda_{\max}(A^H A)$ of the matrix $A^H A$. With regard to this matrix norm, we note that

(4.4.10)    $\operatorname{lub}(U) = 1$

for unitary matrices $U$, that is, for matrices defined by $U^H U = I$.
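Both subordinate norms in this example are easy to evaluate numerically. The sketch below (added here as an illustration; NumPy is assumed) computes $\operatorname{lub}(A)$ for the maximum norm as the row-sum norm, computes $\operatorname{lub}(A)$ for the Euclidean norm from the largest eigenvalue of $A^H A$, checks the consistency relation (4.4.9) on random vectors, and verifies (4.4.10) for an orthogonal matrix obtained from a QR factorization.

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

lub_inf = np.max(np.sum(np.abs(A), axis=1))              # row-sum norm (4.4.7a)
lub_2   = np.sqrt(np.max(np.linalg.eigvalsh(A.T @ A)))   # sqrt of lambda_max(A^H A)

for _ in range(100):                                     # consistency (4.4.9)
    x = rng.standard_normal(4)
    assert np.max(np.abs(A @ x)) <= lub_inf * np.max(np.abs(x)) + 1e-12
    assert np.linalg.norm(A @ x) <= lub_2 * np.linalg.norm(x) + 1e-12

Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))         # an orthogonal (unitary) matrix
assert np.isclose(np.sqrt(np.max(np.linalg.eigvalsh(Q.T @ Q))), 1.0)   # (4.4.10)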

In the following we assume that $\|x\|$ is an arbitrary vector norm and $\|A\|$ is a consistent submultiplicative matrix norm. Specifically, we can always take the subordinate norm $\operatorname{lub}(A)$ as $\|A\|$ if we want to obtain particularly good estimates in the results below. We shall show how norms can be used to bound the influence due to changes in $A$ and $b$ on the solution $x$ of a linear equation system $Ax = b$.

If the solution $x + \Delta x$ corresponds to the right-hand side $b + \Delta b$,

$A(x + \Delta x) = b + \Delta b,$

then the relation $\Delta x = A^{-1}\, \Delta b$ follows from $A\, \Delta x = \Delta b$, as does the bound

(4.4.11)    $\|\Delta x\| \le \|A^{-1}\|\, \|\Delta b\|.$

For the relative change $\|\Delta x\| / \|x\|$, the bound

(4.4.12)    $\frac{\|\Delta x\|}{\|x\|} \le \|A\|\, \|A^{-1}\|\, \frac{\|\Delta b\|}{\|b\|} = \operatorname{cond}(A)\, \frac{\|\Delta b\|}{\|b\|}$

follows from $\|b\| = \|Ax\| \le \|A\|\, \|x\|$. In this estimate, $\operatorname{cond}(A) := \|A\|\, \|A^{-1}\|$ (in the special case of subordinate norms, $\operatorname{cond}(A) := \operatorname{lub}(A)\operatorname{lub}(A^{-1})$); this so-called condition of $A$ is a measure of the sensitivity of the relative error in the solution to changes in the right-hand side $b$. Since $AA^{-1} = I$, $\operatorname{cond}(A)$ satisfies

$\operatorname{lub}(I) = 1 \le \operatorname{lub}(A)\operatorname{lub}(A^{-1}) \le \|A\|\, \|A^{-1}\| = \operatorname{cond}(A).$
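A short numerical experiment (an addition, not from the text) makes (4.4.12) concrete: perturb the right-hand side of a small system and compare the observed relative change in the solution with $\operatorname{cond}(A)\,\|\Delta b\|/\|b\|$, here using the maximum norm and its subordinate row-sum matrix norm. The matrix and the perturbation are arbitrary illustrative choices.

import numpy as np

A  = np.array([[10.0, 7.0], [7.0, 5.0]])        # a mildly ill-conditioned matrix
b  = np.array([17.0, 12.0])
db = np.array([ 0.01, -0.01])                   # perturbation of the right-hand side

lub_inf = lambda M: np.max(np.sum(np.abs(M), axis=1))   # row-sum norm, subordinate to ||.||_inf
cond = lub_inf(A) * lub_inf(np.linalg.inv(A))           # cond(A) = ||A|| ||A^-1||

x  = np.linalg.solve(A, b)
dx = np.linalg.solve(A, b + db) - x

lhs = np.max(np.abs(dx)) / np.max(np.abs(x))            # ||dx|| / ||x||
rhs = cond * np.max(np.abs(db)) / np.max(np.abs(b))     # cond(A) ||db|| / ||b||
assert lhs <= rhs + 1e-9                                # bound (4.4.12)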

The relation (4.4.11) can be interpreted as follows: if $\tilde{x}$ is an approximate solution to $Ax = b$ with residual

$r(\tilde{x}) = b - A\tilde{x} = A(x - \tilde{x}),$

then $\tilde{x}$ is the exact solution of $A\tilde{x} = b - r(\tilde{x})$, and the estimate

(4.4.13)    $\|\tilde{x} - x\| \le \|A^{-1}\|\, \|r(\tilde{x})\|$

must hold for the error $\Delta x = \tilde{x} - x$.

Next, in order to investigate the influence of changes in the matrix A upon the solution x of Ax = b, we establish the following

(4.4.14) Lemma. If $F$ is an $n \times n$ matrix with $\|F\| < 1$, then $(I + F)^{-1}$ exists and satisfies

$\|(I + F)^{-1}\| \le \frac{1}{1 - \|F\|}.$

PROOF. From (4.4.4) the inequality

$\|(I + F)x\| = \|x + Fx\| \ge \|x\| - \|Fx\| \ge (1 - \|F\|)\, \|x\|$

follows for all $x$. From $1 - \|F\| > 0$ it follows that $\|(I + F)x\| > 0$ if $x \ne 0$; that is, $(I + F)x = 0$ has only the trivial solution $x = 0$, and $I + F$ is nonsingular.

Using the abbreviation $C := (I + F)^{-1}$, it follows that

$1 = \|I\| = \|(I + F)C\| = \|C + FC\| \ge \|C\| - \|C\|\, \|F\| = \|C\|\,(1 - \|F\|) > 0,$

from which we have the desired result. □
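As an illustration of the lemma (added here, not part of the text), the following sketch checks the bound $\|(I + F)^{-1}\| \le 1/(1 - \|F\|)$ for random matrices $F$ scaled so that $\|F\| = 0.9 < 1$, using the row-sum norm, which is submultiplicative.

import numpy as np

lub_inf = lambda M: np.max(np.sum(np.abs(M), axis=1))   # row-sum norm

rng = np.random.default_rng(2)
n = 5
for _ in range(100):
    F = rng.standard_normal((n, n))
    F *= 0.9 / lub_inf(F)                               # scale so that ||F|| = 0.9 < 1
    bound = 1.0 / (1.0 - lub_inf(F))                    # right-hand side of (4.4.14)
    assert lub_inf(np.linalg.inv(np.eye(n) + F)) <= bound + 1e-10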

We can now show:

(4.4.15) Theorem. Let $A$ be a nonsingular $n \times n$ matrix, $B = A(I + F)$, $\|F\| < 1$, and let $x$ and $\Delta x$ be defined by $Ax = b$, $B(x + \Delta x) = b$. It follows that

$\frac{\|\Delta x\|}{\|x\|} \le \frac{\|F\|}{1 - \|F\|},$

as well as

$\frac{\|\Delta x\|}{\|x\|} \le \frac{\operatorname{cond}(A)}{1 - \operatorname{cond}(A)\, \dfrac{\|B - A\|}{\|A\|}} \cdot \frac{\|B - A\|}{\|A\|}, \qquad \text{if } \operatorname{cond}(A)\, \frac{\|B - A\|}{\|A\|} < 1.$

PROOF. $B^{-1}$ exists from (4.4.14), and

$\Delta x = B^{-1}b - A^{-1}b = B^{-1}(A - B)A^{-1}b, \qquad x = A^{-1}b,$

so that

$\frac{\|\Delta x\|}{\|x\|} \le \|B^{-1}(A - B)\| = \|-(I + F)^{-1}A^{-1}AF\| \le \|(I + F)^{-1}\|\, \|F\| \le \frac{\|F\|}{1 - \|F\|}.$

Since $F = A^{-1}(B - A)$ and $\|F\| \le \|A^{-1}\|\, \|A\|\, \|B - A\| / \|A\|$, the rest of the theorem follows. □

According to Theorem (4.4.15), cond(A) also measures the sensitivity of the solution x of Ax = b to changes in the matrix A.
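The second bound of Theorem (4.4.15) can also be checked numerically. The following sketch (an added illustration, not part of the text) perturbs $A$ to $B = A + \delta A$, solves both systems, and compares the observed $\|\Delta x\|/\|x\|$ with the bound, again in the maximum norm; the matrices are arbitrary illustrative choices.

import numpy as np

lub_inf = lambda M: np.max(np.sum(np.abs(M), axis=1))   # row-sum norm

A  = np.array([[10.0, 7.0], [7.0, 5.0]])
b  = np.array([17.0, 12.0])
dA = np.array([[1e-3, -1e-3], [2e-3, 1e-3]])            # perturbation of A
B  = A + dA

cond = lub_inf(A) * lub_inf(np.linalg.inv(A))
rel  = lub_inf(B - A) / lub_inf(A)                      # ||B - A|| / ||A||
assert cond * rel < 1.0                                 # hypothesis of the theorem

x  = np.linalg.solve(A, b)
dx = np.linalg.solve(B, b) - x
lhs = np.max(np.abs(dx)) / np.max(np.abs(x))
rhs = cond * rel / (1.0 - cond * rel)
assert lhs <= rhs + 1e-12                               # second bound of (4.4.15)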

If the relations

$C = (I + F)^{-1} = B^{-1}A, \qquad F = A^{-1}B - I$

are taken into account, it follows from (4.4.14) that

$\|B^{-1}A\| \le \frac{1}{1 - \|I - A^{-1}B\|}.$

By interchanging $A$ and $B$, it follows immediately from $A^{-1} = A^{-1}BB^{-1}$ that

(4.4.16)    $\|A^{-1}\| \le \frac{\|B^{-1}\|}{1 - \|I - B^{-1}A\|}.$

In particular, the residual estimate (4.4.13) leads to the bound

(4.4.17)    $\|\tilde{x} - x\| \le \frac{\|B^{-1}\|}{1 - \|I - B^{-1}A\|}\, \|r(\tilde{x})\|, \qquad r(\tilde{x}) = b - A\tilde{x},$

where $B^{-1}$ is an approximate inverse to $A$ with $\|I - B^{-1}A\| < 1$.

The estimates obtained up to this point show the significance of the quantity $\operatorname{cond}(A)$ for determining the influence of changes in the given data on the solution. These estimates give bounds on the error $\tilde{x} - x$, but the evaluation of the bounds requires at least an approximate knowledge of the inverse $A^{-1}$ of $A$. The estimates to be discussed next, due to Prager and Oettli (1964), are based upon another principle and do not require any knowledge of $A^{-1}$.

The results are obtained through the following considerations:

Usually the given data $A_0$, $b_0$ of an equation system $A_0 x = b_0$ are inexact, being tainted, for example, by measurement errors $\Delta A$, $\Delta b$. Hence, it is reasonable to accept an approximate solution $\tilde{x}$ to the system $A_0 x = b_0$ as "correct" if $\tilde{x}$ is the exact solution of a "neighboring" equation system $Ax = b$ with

(4.4.18)    $A \in \mathfrak{A} := \{\, A \mid |A - A_0| \le \Delta A \,\}, \qquad b \in \mathfrak{B} := \{\, b \mid |b - b_0| \le \Delta b \,\}.$

The notation used here is

$|A| = (|\alpha_{ik}|)$, where $A = (\alpha_{ik})$, and $|b| = (|\beta_1|, \ldots, |\beta_n|)^T$, where $b = (\beta_1, \ldots, \beta_n)^T$,

and the relation $\le$ between vectors and matrices is to be understood as holding componentwise. Prager and Oettli prove:

(4.4.19) Theorem. Let $\Delta A \ge 0$, $\Delta b \ge 0$, and let $\mathfrak{A}$, $\mathfrak{B}$ be defined by (4.4.18). Associated with any approximate solution $\tilde{x}$ of the system $A_0 x = b_0$ there is a matrix $A \in \mathfrak{A}$ and a vector $b \in \mathfrak{B}$ satisfying $A\tilde{x} = b$ if and only if

(4.4.20)    $|r(\tilde{x})| \le \Delta A\, |\tilde{x}| + \Delta b,$

where $r(\tilde{x}) := b_0 - A_0\tilde{x}$ is the residual of $\tilde{x}$.

PROOF.

(1) We assume first that $A\tilde{x} = b$ holds for some $A \in \mathfrak{A}$, $b \in \mathfrak{B}$. Then it follows from

$A = A_0 + \delta A$, where $|\delta A| \le \Delta A$, and $b = b_0 + \delta b$, where $|\delta b| \le \Delta b$,

that

$|r(\tilde{x})| = |b_0 - A_0\tilde{x}| = |b - \delta b - (A - \delta A)\tilde{x}| = |{-\delta b} + (\delta A)\tilde{x}| \le |\delta b| + |\delta A|\, |\tilde{x}| \le \Delta b + \Delta A\, |\tilde{x}|.$

(2) On the other hand, suppose that (4.4.20) holds, and let $r$ and $s$ stand for the vectors

$r := r(\tilde{x}) = (\rho_1, \ldots, \rho_n)^T, \qquad s := \Delta b + \Delta A\, |\tilde{x}| = (\sigma_1, \ldots, \sigma_n)^T \ge 0,$

where $A_0 = (\alpha_{ij})$, $b_0 = (\beta_1, \ldots, \beta_n)^T$, $\Delta A = (\Delta\alpha_{ij})$, $\Delta b = (\Delta\beta_1, \ldots, \Delta\beta_n)^T$, and $\tilde{x} = (\xi_1, \ldots, \xi_n)^T$. Then set

$\delta\alpha_{ij} := \rho_i\, \Delta\alpha_{ij}\, \operatorname{sign}(\xi_j) / \sigma_i, \qquad \delta\beta_i := -\rho_i\, \Delta\beta_i / \sigma_i,$

where $\rho_i / \sigma_i := 0$ if $\sigma_i = 0$. From (4.4.20) it follows that $|\rho_i / \sigma_i| \le 1$, and consequently

$A = A_0 + \delta A \in \mathfrak{A}, \qquad b = b_0 + \delta b \in \mathfrak{B},$

as well as the following for $i = 1, 2, \ldots, n$:

$\rho_i = \beta_i - \sum_{j=1}^{n} \alpha_{ij}\,\xi_j = \left( \Delta\beta_i + \sum_{j=1}^{n} \Delta\alpha_{ij}\, |\xi_j| \right) \frac{\rho_i}{\sigma_i} = -\delta\beta_i + \sum_{j=1}^{n} \delta\alpha_{ij}\,\xi_j,$

or

$\sum_{j=1}^{n} (\alpha_{ij} + \delta\alpha_{ij})\,\xi_j = \beta_i + \delta\beta_i,$

that is, $A\tilde{x} = b$, which was to be shown. □

The criterion expressed in Theorem (4.4.19) permits us to draw conclusions about the fitness of a solution from the smallness of its residual. For example, if all components of $A_0$ and $b_0$ have the same relative accuracy $\varepsilon$, then (4.4.19) is satisfied if

$|A_0\tilde{x} - b_0| \le \varepsilon\, (|b_0| + |A_0|\,|\tilde{x}|).$

From this inequality, the smallest $\varepsilon$ can be computed for which a given $\tilde{x}$ can still be accepted as a usable solution.
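This computation is immediate to carry out. The sketch below (an added illustration; the function name and data are hypothetical choices) determines, for a given approximate solution $\tilde{x}$, the smallest $\varepsilon$ with $|A_0\tilde{x} - b_0| \le \varepsilon\,(|b_0| + |A_0|\,|\tilde{x}|)$ componentwise; $\tilde{x}$ is then acceptable in the sense of (4.4.19) whenever this $\varepsilon$ does not exceed the relative accuracy of the data.

import numpy as np

def smallest_epsilon(A0, b0, x_approx):
    """Smallest eps with |A0 @ x - b0| <= eps * (|b0| + |A0| @ |x|) componentwise."""
    residual = np.abs(A0 @ x_approx - b0)
    scale = np.abs(b0) + np.abs(A0) @ np.abs(x_approx)
    # where scale[i] == 0, the criterion additionally requires residual[i] == 0
    with np.errstate(divide="ignore", invalid="ignore"):
        ratios = np.where(scale > 0, residual / scale,
                          np.where(residual > 0, np.inf, 0.0))
    return np.max(ratios)

A0 = np.array([[10.0, 7.0], [7.0, 5.0]])
b0 = np.array([17.0, 12.0])
x_approx = np.array([1.01, 0.99])        # a hypothetical approximate solution of A0 x = b0
print(smallest_epsilon(A0, b0, x_approx))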

4.5 Roundoff-Error Analysis for Gaussian Elimination
