will include the limit $\tau_0$ between them. This observation yields a convenient stopping criterion.
Example. The exact value of the integral

$\int_0^{\pi/2} 5\,(e^{\pi} - 2)^{-1}\, e^{2x} \cos x \, dx$

is 1. Using the polynomial extrapolation method of Romberg, and carrying 12 digits, we obtain for $T_{ik}$, $U_{ik}$, $0 \le i \le 6$, $0 \le k \le 3$, the values given in the following table.
i   T_{i0}              T_{i1}              T_{i2}              T_{i3}
0   0.185 755 068 924
1   0.724 727 335 089   0.904 384 757 145
2   0.925 565 035 158   0.992 510 935 182   0.998 386 013 717
3   0.981 021 630 069   0.999 507 161 706   0.999 973 576 808   0.999 998 776 222
4   0.995 232 017 388   0.999 968 813 161   0.999 999 589 925   1.000 000 002 83
5   0.998 806 537 974   0.999 998 044 836   0.999 999 993 614   1.000 000 000 02
6   0.999 701 542 775   0.999 999 877 709   0.999 999 999 901   1.000 000 000 00
i   U_{i0}              U_{i1}              U_{i2}              U_{i3}
0   1.263 699 601 26
1   1.126 402 735 23    1.080 637 113 22
2   1.036 478 224 98    1.006 503 388 23    1.001 561 139 90
3   1.009 442 404 71    1.000 430 464 62    1.000 025 603 04    1.000 001 229 44
4   1.002 381 058 56    1.000 027 276 51    1.000 000 397 30    0.999 999 997 211
5   1.000 596 547 58    1.000 001 710 58    1.000 000 006 19    0.999 999 999 978
6   1.000 149 217 14    1.000 000 107 00    1.000 000 000 09    1.000 000 000 00
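The $T_{ik}$ tableau can be reproduced with a few lines of code. The following Python sketch (ours, not from the text) assumes the standard Romberg recurrence $T_{i,k} = T_{i,k-1} + (T_{i,k-1} - T_{i-1,k-1})/(4^k - 1)$ on trapezoidal sums over $2^i$ subintervals; the midpoint-based $U_{ik}$ tableau can be generated analogously.

```python
import math

def f(x):
    # integrand of the example: 5 (e^pi - 2)^(-1) e^(2x) cos x
    return 5.0 / (math.exp(math.pi) - 2.0) * math.exp(2.0 * x) * math.cos(x)

def romberg(f, a, b, m):
    """Triangular Romberg tableau T[i][k], 0 <= k <= i <= m."""
    T = [[0.0] * (m + 1) for _ in range(m + 1)]
    T[0][0] = 0.5 * (b - a) * (f(a) + f(b))
    for i in range(1, m + 1):
        n = 2 ** i
        h = (b - a) / n
        # refined trapezoidal sum: reuse T[i-1][0], add only the new midpoints
        T[i][0] = 0.5 * T[i - 1][0] + h * sum(
            f(a + (2 * j - 1) * h) for j in range(1, n // 2 + 1))
        for k in range(1, i + 1):
            T[i][k] = T[i][k - 1] + (T[i][k - 1] - T[i - 1][k - 1]) / (4 ** k - 1)
    return T

T = romberg(f, 0.0, 0.5 * math.pi, 6)
for i in range(7):
    print(i, ["%.12f" % T[i][k] for k in range(min(i, 3) + 1)])
```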
The conditions (3.6.1) are met, for instance, if $\omega(x)$ is positive and continuous on a finite interval $[a, b]$. Condition (3.6.1c) is equivalent to $\int_a^b \omega(x)\,dx > 0$ [see Exercise 14].
We will again examine integration rules of the type
(3.6.2) $\tilde I(f) := \sum_{i=1}^{n} w_i f(x_i).$
The Newton-Cotes formulas [see Section 3.1] are of this form, but there the abscissas $x_i$ were required to form a uniform partition of the interval $[a, b]$.
In this section, we relax this restriction and try to choose the $x_i$ as well as the $w_i$ so as to maximize the order of the integration method, that is, to maximize the degree for which all polynomials are exactly integrated by (3.6.2). We will see that this is possible and leads to a class of well-defined so-called Gaussian integration rules or Gaussian quadrature formulas [see for instance Stroud and Secrest (1966)]. These Gaussian integration rules will be shown to be unique and of order $2n - 1$; also $w_i > 0$ and $a < x_i < b$ for $i = 1, \ldots, n$. In order to establish these results and to determine the exact form of the Gaussian integration rules, we need some basic facts about orthogonal polynomials. We introduce the notation
$\bar\Pi_j := \{\, p \mid p(x) = x^j + a_1 x^{j-1} + \cdots + a_j \,\}$

for the set of normed real polynomials of degree $j$, and, as before, we denote by

$\Pi_j := \{\, p \mid \text{degree}(p) \le j \,\}$

the linear space of all real polynomials whose degree does not exceed $j$. In addition, we define the scalar product
$(f, g) := \int_a^b \omega(x)\, f(x)\, g(x)\, dx$

on the linear space $L^2[a, b]$ of all functions for which the integral

$(f, f) = \int_a^b \omega(x)\, f(x)^2\, dx$

is well defined and finite. The functions $f, g \in L^2[a, b]$ are called orthogonal if $(f, g) = 0$. The following theorem establishes the existence of a sequence of mutually orthogonal polynomials, the system of orthogonal polynomials associated with the weight function $\omega(x)$.
(3.6.3) Theorem. There exist polynomials $p_j \in \bar\Pi_j$, $j = 0, 1, 2, \ldots$, such that

(3.6.4) $(p_i, p_k) = 0$ for $i \neq k$.

These polynomials are uniquely defined by the recursions

(3.6.5a) $p_0(x) \equiv 1$,
(3.6.5b) $p_{i+1}(x) \equiv (x - \delta_{i+1})\, p_i(x) - \gamma_{i+1}^2\, p_{i-1}(x)$ for $i \ge 0$,

where $p_{-1}(x) :\equiv 0$ and⁷

(3.6.6a) $\delta_{i+1} := (x p_i, p_i)/(p_i, p_i)$ for $i \ge 0$,
(3.6.6b) $\gamma_{i+1}^2 := \begin{cases} 1 & \text{for } i = 0, \\ (p_i, p_i)/(p_{i-1}, p_{i-1}) & \text{for } i \ge 1. \end{cases}$
Proof. The polynomials can be constructed recursively by a technique known as Gram-Schmidt orthogonalization. Clearly $p_0(x) \equiv 1$. Suppose then, as an induction hypothesis, that all orthogonal polynomials with the above properties have been constructed for $j \le i$ and have been shown to be unique. We proceed to show that there exists a unique polynomial $p_{i+1} \in \bar\Pi_{i+1}$ with

(3.6.7) $(p_{i+1}, p_j) = 0$ for $j \le i$,

and that this polynomial satisfies (3.6.5b). Any polynomial $p_{i+1} \in \bar\Pi_{i+1}$ can be written uniquely in the form

$p_{i+1}(x) \equiv (x - \delta_{i+1})\, p_i(x) + c_{i-1} p_{i-1}(x) + c_{i-2} p_{i-2}(x) + \cdots + c_0 p_0(x),$

because its leading coefficient and those of the polynomials $p_j$, $j \le i$, have value 1. Since $(p_j, p_k) = 0$ for all $j, k \le i$ with $j \neq k$, (3.6.7) holds if and only if

(3.6.8a) $(p_{i+1}, p_i) = (x p_i, p_i) - \delta_{i+1}(p_i, p_i) = 0$,
(3.6.8b) $(p_{i+1}, p_{j-1}) = (x p_{j-1}, p_i) + c_{j-1}(p_{j-1}, p_{j-1}) = 0$ for $j \le i$.
The condition (3.6.1c), with $p_i^2$ and $p_{j-1}^2$, respectively, in the role of the nonnegative polynomial, rules out $(p_i, p_i) = 0$ and $(p_{j-1}, p_{j-1}) = 0$ for $1 \le j \le i$. Therefore, the equations (3.6.8) can be solved uniquely; (3.6.8a) gives (3.6.6a). By the induction hypothesis,

$p_j(x) \equiv (x - \delta_j)\, p_{j-1}(x) - \gamma_j^2\, p_{j-2}(x)$

for $j \le i$. From this, by solving for $x\,p_{j-1}(x)$, we have $(x p_{j-1}, p_i) = (p_j, p_i)$ for $j \le i$, so that, in view of (3.6.8),

$c_{j-1} = -\dfrac{(p_j, p_i)}{(p_{j-1}, p_{j-1})} = \begin{cases} -\gamma_{i+1}^2 & \text{for } j = i, \\ 0 & \text{for } j < i. \end{cases}$

Thus (3.6.5b) has been established for $i + 1$.
⁷ $x\,p_i$ denotes the polynomial with values $x\,p_i(x)$ for all $x$.
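The proof is constructive: once the scalar product can be evaluated, the recursion (3.6.5)/(3.6.6) can be carried out numerically. The following Python sketch (our illustration; it evaluates $(f, g)$ by adaptive quadrature, which is adequate only for small $i$ because of rounding) builds the coefficients $\delta_{i+1}$, $\gamma_{i+1}^2$ and the monic polynomials $p_i$ for a given weight function $\omega$.

```python
import numpy as np
from scipy.integrate import quad

def sp(f, g, omega, a, b):
    # scalar product (f, g) := integral_a^b omega(x) f(x) g(x) dx
    return quad(lambda x: omega(x) * f(x) * g(x), a, b)[0]

def recursion_coefficients(omega, a, b, n):
    """delta_{i+1} and gamma_{i+1}^2 of (3.6.6) for i = 0, ..., n-1."""
    x_poly = np.polynomial.Polynomial([0.0, 1.0])   # the polynomial x
    p_prev = np.polynomial.Polynomial([0.0])        # p_{-1} := 0
    p_cur = np.polynomial.Polynomial([1.0])         # p_0  := 1
    deltas, gamma2s = [], []
    for i in range(n):
        norm2 = sp(p_cur, p_cur, omega, a, b)
        delta = sp(x_poly * p_cur, p_cur, omega, a, b) / norm2   # (3.6.6a)
        gamma2 = 1.0 if i == 0 else norm2 / sp(p_prev, p_prev, omega, a, b)
        deltas.append(delta)
        gamma2s.append(gamma2)
        # (3.6.5b): p_{i+1} = (x - delta_{i+1}) p_i - gamma_{i+1}^2 p_{i-1}
        p_prev, p_cur = p_cur, (x_poly - delta) * p_cur - gamma2 * p_prev
    return deltas, gamma2s

# omega == 1 on [-1, 1] yields the monic Legendre polynomials:
print(recursion_coefficients(lambda x: 1.0, -1.0, 1.0, 4))
# the deltas are 0 by symmetry; the gamma2s come out as 1, 1/3, 4/15, 9/35
```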
Every polynomial $p \in \Pi_k$ is clearly representable as a linear combination of the orthogonal polynomials $p_i$, $i \le k$. We thus have:

(3.6.9) Corollary. $(p, p_n) = 0$ for all $p \in \Pi_{n-1}$.
(3.6.10) Theorem. The roots $x_i$, $i = 1, \ldots, n$, of $p_n$ are real and simple. They all lie in the open interval $(a, b)$.
Proof. Consider those roots of $p_n$ which lie in $(a, b)$ and which are of odd multiplicity, that is, at which $p_n$ changes sign:

$a < x_1 < \cdots < x_l < b.$

The polynomial

$q(x) := \prod_{j=1}^{l} (x - x_j) \in \bar\Pi_l$

is such that the polynomial $p_n(x)\, q(x)$ does not change sign in $[a, b]$, so that

$(p_n, q) = \int_a^b \omega(x)\, p_n(x)\, q(x)\, dx \neq 0$

by (3.6.1c). Thus degree$(q) = l = n$ must hold, as otherwise $(p_n, q) = 0$ by Corollary (3.6.9).
Next we have the
(3.6.11) Theorem. The $n \times n$ matrix

$A := \begin{pmatrix} p_0(t_1) & \ldots & p_0(t_n) \\ \vdots & & \vdots \\ p_{n-1}(t_1) & \ldots & p_{n-1}(t_n) \end{pmatrix}$

is nonsingular for mutually distinct arguments $t_i$, $i = 1, \ldots, n$.
Proof. Assume $A$ is singular. Then there is a vector $c^T = (c_0, \ldots, c_{n-1})$, $c \neq 0$, with $c^T A = 0$. The polynomial

$q(x) := \sum_{i=0}^{n-1} c_i p_i(x),$

with degree$(q) < n$, then has the $n$ distinct roots $t_1, \ldots, t_n$ and must vanish identically. Since the polynomials $p_i(\cdot)$ are linearly independent, $q(x) \equiv 0$ implies the contradiction $c = 0$.
Theorem (3.6.11) shows that the interpolation problem of finding a function of the form

$p(x) \equiv \sum_{i=0}^{n-1} c_i p_i(x)$

with $p(t_i) = f_i$, $i = 1, 2, \ldots, n$, is always solvable. The condition of the theorem is known as the Haar condition. Any sequence of functions $p_0, p_1, \ldots$ which satisfies the Haar condition is said to form a Chebyshev system. Theorem (3.6.11) states that sequences of orthogonal polynomials are Chebyshev systems.
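For a concrete case the Haar condition is easy to check numerically. A small sketch (ours; it uses the classical Legendre polynomials, which differ from the $p_i$ of (3.6.18) below only by constant factors and hence form the same Chebyshev system):

```python
import numpy as np
from numpy.polynomial import legendre

n = 5
t = np.array([-0.9, -0.3, 0.1, 0.4, 0.8])    # arbitrary mutually distinct t_i
# row k holds the values of the k-th Legendre polynomial at t_1, ..., t_n
A = np.array([legendre.legval(t, [0.0] * k + [1.0]) for k in range(n)])
print(np.linalg.det(A))    # nonzero: A is nonsingular, the Haar condition holds
```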
Now we arrive at the main result of this section.
(3.6.12) Theorem.

(a) Let $x_1, \ldots, x_n$ be the roots of the $n$th orthogonal polynomial $p_n(x)$, and let $w_1, \ldots, w_n$ be the solution of the (nonsingular) system of equations

(3.6.13) $\sum_{i=1}^{n} p_k(x_i)\, w_i = \begin{cases} (p_0, p_0) & \text{if } k = 0, \\ 0 & \text{if } k = 1, 2, \ldots, n-1. \end{cases}$

Then $w_i > 0$ for $i = 1, 2, \ldots, n$, and

(3.6.14) $\int_a^b \omega(x)\, p(x)\, dx = \sum_{i=1}^{n} w_i\, p(x_i)$

holds for all polynomials $p \in \Pi_{2n-1}$. The positive numbers $w_i$ are called "weights".

(b) Conversely, if the numbers $w_i$, $x_i$, $i = 1, \ldots, n$, are such that (3.6.14) holds for all $p \in \Pi_{2n-1}$, then the $x_i$ are the roots of $p_n$ and the weights $w_i$ satisfy (3.6.13).

(c) It is not possible to find numbers $x_i$, $w_i$, $i = 1, \ldots, n$, such that (3.6.14) holds for all polynomials $p \in \Pi_{2n}$.
Proof. By Theorem (3.6.10), the roots $x_i$, $i = 1, \ldots, n$, of $p_n$ are real and mutually distinct numbers in the open interval $(a, b)$. The matrix

(3.6.15) $A := \begin{pmatrix} p_0(x_1) & \ldots & p_0(x_n) \\ \vdots & & \vdots \\ p_{n-1}(x_1) & \ldots & p_{n-1}(x_n) \end{pmatrix}$

is nonsingular by Theorem (3.6.11), so that the system of equations (3.6.13) has a unique solution.
Consider an arbitrary polynomial $p \in \Pi_{2n-1}$. It can be written in the form

(3.6.16) $p(x) \equiv p_n(x)\, q(x) + r(x),$

where $q$, $r$ are polynomials in $\Pi_{n-1}$, which we can express as linear combinations of orthogonal polynomials

$q(x) \equiv \sum_{k=0}^{n-1} \alpha_k p_k(x), \qquad r(x) \equiv \sum_{k=0}^{n-1} \beta_k p_k(x).$

Since $p_0(x) \equiv 1$, it follows from (3.6.16) and Corollary (3.6.9) that

$\int_a^b \omega(x)\, p(x)\, dx = (p_n, q) + (r, p_0) = \beta_0 (p_0, p_0).$

On the other hand, by (3.6.16) [since $p_n(x_i) = 0$] and by (3.6.13),

$\sum_{i=1}^{n} w_i\, p(x_i) = \sum_{i=1}^{n} w_i\, r(x_i) = \sum_{k=0}^{n-1} \beta_k \sum_{i=1}^{n} w_i\, p_k(x_i) = \beta_0 (p_0, p_0).$

Thus (3.6.14) is satisfied.
We observe that

(3.6.17) If $w_i$, $x_i$, $i = 1, \ldots, n$, are such that (3.6.14) holds for all polynomials $p \in \Pi_{2n-1}$, then $w_i > 0$ for $i = 1, \ldots, n$.

This is readily verified by applying (3.6.14) to the polynomials

$\bar p_j(x) := \prod_{h=1,\, h \neq j}^{n} (x - x_h)^2 \in \Pi_{2n-2}, \qquad j = 1, \ldots, n,$

and noting that

$0 < \int_a^b \omega(x)\, \bar p_j(x)\, dx = \sum_{i=1}^{n} w_i\, \bar p_j(x_i) = w_j \prod_{h=1,\, h \neq j}^{n} (x_j - x_h)^2$

by (3.6.1c). This completes the proof of (3.6.12a).
We prove (3.6.12c) next. Assume that $w_i$, $x_i$, $i = 1, \ldots, n$, are such that (3.6.14) even holds for all polynomials $p \in \Pi_{2n}$. Then

$\bar p(x) :\equiv \prod_{j=1}^{n} (x - x_j)^2 \in \Pi_{2n}$

contradicts this claim, since by (3.6.1c)

$0 < \int_a^b \omega(x)\, \bar p(x)\, dx = \sum_{i=1}^{n} w_i\, \bar p(x_i) = 0.$

This proves (3.6.12c).
To prove (3.6.12b), suppose that $w_i$, $x_i$, $i = 1, \ldots, n$, are such that (3.6.14) holds for all $p \in \Pi_{2n-1}$. Note that the abscissas $x_i$ must be mutually distinct, since otherwise we could formulate the same integration rule using only $n - 1$ of the abscissas $x_i$, contradicting (3.6.12c).
Applying (3.6.14) to the orthogonal polynomials $p = p_k$, $k = 0, \ldots, n-1$, themselves, we find

$\sum_{i=1}^{n} w_i\, p_k(x_i) = \int_a^b \omega(x)\, p_k(x)\, dx = (p_k, p_0) = \begin{cases} (p_0, p_0) & \text{if } k = 0, \\ 0 & \text{if } 1 \le k \le n-1. \end{cases}$

In other words, the weights $w_i$ must satisfy (3.6.13).
Applying (3.6.14) to $p(x) :\equiv p_k(x)\, p_n(x)$, $k = 0, \ldots, n-1$, gives by (3.6.9)

$0 = (p_k, p_n) = \sum_{i=1}^{n} w_i\, p_n(x_i)\, p_k(x_i), \qquad k = 0, \ldots, n-1.$

In other words, the vector $c := (w_1 p_n(x_1), \ldots, w_n p_n(x_n))^T$ solves the homogeneous system of equations $Ac = 0$ with $A$ the matrix (3.6.15). Since the abscissas $x_i$, $i = 1, \ldots, n$, are mutually distinct, the matrix $A$ is nonsingular by Theorem (3.6.11). Therefore $c = 0$ and $w_i p_n(x_i) = 0$ for $i = 1, \ldots, n$. Since $w_i > 0$ by (3.6.17), we have $p_n(x_i) = 0$, $i = 1, \ldots, n$. This completes the proof of (3.6.12b).
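Parts (a) and (c) of the theorem can be observed numerically. The sketch below (ours) uses NumPy's built-in Gauss-Legendre rule, which is the case $\omega(x) \equiv 1$, $[a, b] = [-1, 1]$ treated next: an $n$-point rule reproduces $\int_{-1}^{1} x^k\,dx$ exactly for $k \le 2n - 1$, and a genuine error appears first at $k = 2n$.

```python
import numpy as np

n = 3
x, w = np.polynomial.legendre.leggauss(n)     # abscissas and weights
for k in range(2 * n + 1):
    approx = np.sum(w * x ** k)
    exact = 0.0 if k % 2 else 2.0 / (k + 1)   # integral of x^k over [-1, 1]
    print(k, abs(approx - exact))
# the error is at rounding level for k = 0, ..., 5 and jumps to about
# 0.046 for k = 6 = 2n, in agreement with (3.6.12c)
```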
For the most common weight function $\omega(x) :\equiv 1$ and the interval $[-1, 1]$, the results of Theorem (3.6.12) are due to Gauss. The corresponding orthogonal polynomials are [see Exercise 16]

(3.6.18) $p_k(x) := \dfrac{k!}{(2k)!}\, \dfrac{d^k}{dx^k}(x^2 - 1)^k, \qquad k = 0, 1, \ldots.$

Indeed, $p_k \in \bar\Pi_k$, and integration by parts establishes $(p_i, p_k) = 0$ for $i \neq k$. Up to a factor, the polynomials (3.6.18) are the Legendre polynomials. In the following table we give some values for $w_i$, $x_i$ in this important special case. For further values see the National Bureau of Standards Handbook of Mathematical Functions [Abramowitz and Stegun (1964)].
n    w_i                                   x_i
1    w_1 = 2                               x_1 = 0
2    w_1 = w_2 = 1                         x_2 = -x_1 = 0.577 350 2692...
3    w_1 = w_3 = 5/9                       x_3 = -x_1 = 0.774 596 6692...
     w_2 = 8/9                             x_2 = 0
4    w_1 = w_4 = 0.347 854 8451...         x_4 = -x_1 = 0.861 136 3116...
     w_2 = w_3 = 0.652 145 1549...         x_3 = -x_2 = 0.339 981 0436...
5    w_1 = w_5 = 0.236 926 8851...         x_5 = -x_1 = 0.906 179 8459...
     w_2 = w_4 = 0.478 628 6705...         x_4 = -x_2 = 0.538 469 3101...
     w_3 = 128/225 = 0.568 888 8889...     x_3 = 0
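These values are also produced by standard library routines; for instance (our check, using NumPy's implementation of precisely this $\omega \equiv 1$, $[-1, 1]$ case):

```python
import numpy as np

x, w = np.polynomial.legendre.leggauss(5)
print(x)   # -0.906 179..., -0.538 469..., 0, 0.538 469..., 0.906 179...
print(w)   # 0.236 926..., 0.478 628..., 0.568 888..., 0.478 628..., 0.236 926...
```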
Other important cases which lead to Gaussian integration rules are listed in the following table:

[a, b]          ω(x)                Orthogonal polynomials
[-1, 1]         (1 - x^2)^{-1/2}    T_n(x), Chebyshev polynomials
[0, ∞)          e^{-x}              L_n(x), Laguerre polynomials
(-∞, ∞)         e^{-x^2}            H_n(x), Hermite polynomials
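Library routines exist for these cases as well. A brief sketch (ours; `laggauss` and `hermgauss` are NumPy's Gauss-Laguerre and Gauss-Hermite rules) integrates $f(x) = x^2$ against the respective weights:

```python
import numpy as np

# Gauss-Laguerre: integral_0^inf e^(-x) f(x) dx ~= sum w_i f(x_i)
x, w = np.polynomial.laguerre.laggauss(5)
print(np.sum(w * x ** 2))    # 2.0, the exact value Gamma(3)

# Gauss-Hermite: integral_(-inf)^(inf) e^(-x^2) f(x) dx ~= sum w_i f(x_i)
x, w = np.polynomial.hermite.hermgauss(5)
print(np.sum(w * x ** 2))    # 0.8862... = sqrt(pi)/2
```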
We have characterized the quantities $w_i$, $x_i$ which enter the Gaussian integration rules for given weight functions, but we have yet to discuss methods for their actual calculation. We will examine this problem under the assumption that the coefficients $\delta_i$, $\gamma_i$ of the recursion (3.6.5) are given. Golub and Welsch (1969) and Gautschi (1968, 1970) discuss the much harder problem of finding the coefficients $\delta_i$, $\gamma_i$.
The theory of orthogonal polynomials ties in with the theory of real tridiagonal matrices

(3.6.19) $J_n = \begin{pmatrix} \delta_1 & \gamma_2 & & \\ \gamma_2 & \delta_2 & \ddots & \\ & \ddots & \ddots & \gamma_n \\ & & \gamma_n & \delta_n \end{pmatrix}$

and their principal submatrices

$J_j := \begin{pmatrix} \delta_1 & \gamma_2 & & \\ \gamma_2 & \delta_2 & \ddots & \\ & \ddots & \ddots & \gamma_j \\ & & \gamma_j & \delta_j \end{pmatrix}.$
Such matrices will be studied in Sections 5.5, 5.6, and 6.6.1. In Section 5.5 it will be seen that the characteristic polynomials $p_j(x) = \det(xI - J_j)$ of the $J_j$ satisfy the recursions (3.6.5) with the matrix elements $\delta_j$, $\gamma_j$ as the coefficients. Therefore, $p_n$ is the characteristic polynomial of the tridiagonal matrix $J_n$. Consequently we have
(3.6.20) Theorem. The roots $x_i$, $i = 1, \ldots, n$, of the $n$th orthogonal polynomial $p_n$ are the eigenvalues of the tridiagonal matrix $J_n$ in (3.6.19).
The bisection method of Section 5.6, the QR method of Section 6.6.6, and others are available to calculate the eigenvalues of these tridiagonal matrices.
With respect to the weights $w_i$, we have [Szegő (1959), Golub and Welsch (1969)]:

(3.6.21) Theorem. Let $v^{(i)} := (v_1^{(i)}, \ldots, v_n^{(i)})^T$ be an eigenvector of $J_n$ (3.6.19) for the eigenvalue $x_i$, $J_n v^{(i)} = x_i v^{(i)}$. Suppose $v^{(i)}$ is scaled in such a way that

$v^{(i)T} v^{(i)} = (p_0, p_0) = \int_a^b \omega(x)\, dx.$

Then the weights are given by

$w_i = (v_1^{(i)})^2, \qquad i = 1, \ldots, n.$
Proof. We verify that the vector

$\tilde v^{(i)} = (\rho_0 p_0(x_i),\, \rho_1 p_1(x_i),\, \ldots,\, \rho_{n-1} p_{n-1}(x_i))^T$, where

$\rho_j := 1/(\gamma_1 \gamma_2 \cdots \gamma_{j+1})$ for $j = 0, 1, \ldots, n-1$,

is an eigenvector of $J_n$ for the eigenvalue $x_i$: $J_n \tilde v^{(i)} = x_i \tilde v^{(i)}$. By (3.6.5), for any $x$,

$\delta_1 \rho_0 p_0(x) + \gamma_2 \rho_1 p_1(x) = \delta_1 p_0(x) + p_1(x) = x\, p_0(x) = x\, \rho_0 p_0(x).$

For $j = 2, \ldots, n-1$, similarly

$\gamma_j \rho_{j-2} p_{j-2}(x) + \delta_j \rho_{j-1} p_{j-1}(x) + \gamma_{j+1} \rho_j p_j(x) = \rho_{j-1}\big[\gamma_j^2 p_{j-2}(x) + \delta_j p_{j-1}(x) + p_j(x)\big] = x\, \rho_{j-1} p_{j-1}(x),$

and finally

$\gamma_n \rho_{n-2} p_{n-2}(x) + \delta_n \rho_{n-1} p_{n-1}(x) = \rho_{n-1}\big[\gamma_n^2 p_{n-2}(x) + \delta_n p_{n-1}(x)\big] = x\, \rho_{n-1} p_{n-1}(x) - \rho_{n-1} p_n(x),$

so that

$\gamma_n \rho_{n-2} p_{n-2}(x_i) + \delta_n \rho_{n-1} p_{n-1}(x_i) = x_i\, \rho_{n-1} p_{n-1}(x_i)$

holds, provided $p_n(x_i) = 0$.
Since $\rho_j \neq 0$, $j = 0, 1, \ldots, n-1$, the system of equations (3.6.13) for the $w_i$ is equivalent to

(3.6.22) $(\tilde v^{(1)}, \ldots, \tilde v^{(n)})\, w = (p_0, p_0)\, e_1$ with $w = (w_1, \ldots, w_n)^T$, $e_1 = (1, 0, \ldots, 0)^T$.

Eigenvectors of symmetric matrices for distinct eigenvalues are orthogonal. Therefore, multiplying (3.6.22) by $\tilde v^{(i)T}$ from the left yields

$(\tilde v^{(i)T} \tilde v^{(i)})\, w_i = (p_0, p_0)\, \tilde v_1^{(i)}.$

Since $\rho_0 = 1$ and $p_0(x) \equiv 1$, we have $\tilde v_1^{(i)} = 1$. Thus

(3.6.23) $(\tilde v^{(i)T} \tilde v^{(i)})\, w_i = (p_0, p_0).$

Using again the fact that $\tilde v_1^{(i)} = 1$, we find $v_1^{(i)} \tilde v^{(i)} = v^{(i)}$, and multiplying (3.6.23) by $(v_1^{(i)})^2$ gives

$(v^{(i)T} v^{(i)})\, w_i = (v_1^{(i)})^2 (p_0, p_0).$

Since $v^{(i)T} v^{(i)} = (p_0, p_0)$ by hypothesis, we obtain $w_i = (v_1^{(i)})^2$.

If the QR method is employed for determining the eigenvalues of $J_n$, then the calculation of the first components $v_1^{(i)}$ of the eigenvectors $v^{(i)}$ is readily included in that algorithm: calculating the abscissas $x_i$ and the weights $w_i$ can be done concurrently [Golub and Welsch (1969)].
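Theorems (3.6.20) and (3.6.21) together constitute the Golub-Welsch approach: build the symmetric tridiagonal matrix $J_n$ from the recursion coefficients, compute its eigenvalues and the first components of its eigenvectors, and read off the $x_i$ and $w_i$. A minimal sketch (ours; it uses a general symmetric eigensolver rather than the specially adapted QR method of Golub and Welsch, and takes the $\delta_i$, $\gamma_i$ as given, as assumed in the text):

```python
import numpy as np

def gauss_rule(delta, gamma, mu0):
    """Abscissas and weights from the recursion coefficients of (3.6.5).

    delta = (delta_1, ..., delta_n), gamma = (gamma_2, ..., gamma_n),
    mu0 = (p_0, p_0) = integral of omega(x) dx.
    """
    J = np.diag(delta) + np.diag(gamma, 1) + np.diag(gamma, -1)
    x, V = np.linalg.eigh(J)        # eigenvalues, orthonormal eigenvectors
    # Theorem (3.6.21): after scaling v^T v = mu0, w_i = (v_1^(i))^2
    return x, mu0 * V[0, :] ** 2

# example: omega == 1 on [-1, 1]; monic Legendre, gamma_{k+1}^2 = k^2/(4k^2 - 1)
n = 5
k = np.arange(1, n)
x, w = gauss_rule(np.zeros(n), np.sqrt(k * k / (4.0 * k * k - 1.0)), 2.0)
print(x)    # the abscissas of the n = 5 row of the table above
print(w)    # the corresponding weights
```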
Finally, we will estimate the error of Gaussian integration:
(3.6.24) Theorem. If $f \in C^{2n}[a, b]$, then

$\int_a^b \omega(x) f(x)\, dx - \sum_{i=1}^{n} w_i f(x_i) = \dfrac{f^{(2n)}(\xi)}{(2n)!}\, (p_n, p_n)$

for some $\xi \in (a, b)$.
Proof. Consider the solution $h \in \Pi_{2n-1}$ of the Hermite interpolation problem [see Section 2.1.5]

$h(x_i) = f(x_i), \quad h'(x_i) = f'(x_i), \qquad i = 1, 2, \ldots, n.$

Since degree$(h) < 2n$,

$\int_a^b \omega(x)\, h(x)\, dx = \sum_{i=1}^{n} w_i\, h(x_i) = \sum_{i=1}^{n} w_i\, f(x_i)$
by Theorem (3.6.12). Therefore, the error term has the integral representation

$\int_a^b \omega(x) f(x)\, dx - \sum_{i=1}^{n} w_i f(x_i) = \int_a^b \omega(x)\,(f(x) - h(x))\, dx.$
By Theorem (2.1.5.9), and since the $x_i$ are the roots of $p_n(x) \in \bar\Pi_n$,

$f(x) - h(x) = \dfrac{f^{(2n)}(\zeta)}{(2n)!}\, (x - x_1)^2 \cdots (x - x_n)^2 = \dfrac{f^{(2n)}(\zeta)}{(2n)!}\, p_n^2(x)$

for some $\zeta = \zeta(x)$ in the interval $I[x, x_1, \ldots, x_n]$ spanned by $x$ and $x_1, \ldots, x_n$. Next,

$\dfrac{f^{(2n)}(\zeta(x))}{(2n)!} = \dfrac{f(x) - h(x)}{p_n^2(x)}$

is continuous on $[a, b]$, so that the mean-value theorem of integral calculus applies:

$\int_a^b \omega(x)(f(x) - h(x))\, dx = \dfrac{1}{(2n)!} \int_a^b \omega(x)\, f^{(2n)}(\zeta(x))\, p_n^2(x)\, dx = \dfrac{f^{(2n)}(\xi)}{(2n)!}\, (p_n, p_n)$

for some $\xi \in (a, b)$.
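For $\omega(x) \equiv 1$ on $[-1, 1]$ the factor $(p_n, p_n)$ can be made explicit: the monic polynomial (3.6.18) equals $2^n (n!)^2/(2n)!$ times the classical Legendre polynomial $P_n$, and $\int_{-1}^{1} P_n^2\,dx = 2/(2n+1)$, so $(p_n, p_n) = 2^{2n+1}(n!)^4 / \big((2n+1)((2n)!)^2\big)$. The sketch below (our illustration) compares the resulting bound from (3.6.24) with the actual error for $f(x) = e^x$, where $|f^{(2n)}(\xi)| \le e$ on $[-1, 1]$:

```python
import math
import numpy as np

def pn_norm2(n):
    # (p_n, p_n) for the monic Legendre polynomial, omega == 1 on [-1, 1]
    return (2.0 ** (2 * n + 1) * math.factorial(n) ** 4
            / ((2 * n + 1) * math.factorial(2 * n) ** 2))

exact = math.e - 1.0 / math.e         # integral of e^x over [-1, 1]
for n in range(1, 6):
    x, w = np.polynomial.legendre.leggauss(n)
    err = abs(exact - np.sum(w * np.exp(x)))
    bound = math.e * pn_norm2(n) / math.factorial(2 * n)
    print(n, err, bound)              # err stays below bound for every n
```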
Comparing the various integration rules (Newton-Cotes formulas, extrapolation methods, Gaussian integration), we find that, computational efforts being equal, Gaussian integration yields the most accurate results.
If only one knew ahead of time how to choose $n$ so as to achieve a specified accuracy for any given integral, then Gaussian integration would be clearly superior to other methods. Unfortunately, it is frequently not possible to use the error formula (3.6.24) for this purpose, because the $2n$th derivative is difficult to estimate. For this reason, one will usually apply Gaussian integration for increasing values of $n$ until successive approximate values agree within the specified accuracy. Since the function values calculated for $n$ cannot be reused for $n + 1$ (at least not in the classical case $\omega(x) \equiv 1$), the apparent advantages of Gaussian integration as compared with extrapolation methods are soon lost. There have been attempts to remedy this situation [e.g., Kronrod (1965)]. A collection of FORTRAN programs is given in Piessens et al. (1983).
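In outline, the strategy of increasing $n$ until two successive approximations agree might look as follows (our sketch; the tolerance and the integrand are only illustrative, and note that all $n$ function values are discarded when passing to $n + 1$):

```python
import numpy as np

def gauss_until_converged(f, tol=1e-12, n_max=50):
    """n-point Gauss-Legendre for n = 1, 2, ... until two successive
    approximations agree to within tol."""
    prev = None
    for n in range(1, n_max + 1):
        x, w = np.polynomial.legendre.leggauss(n)
        val = float(np.sum(w * f(x)))   # none of the old f-values is reusable
        if prev is not None and abs(val - prev) <= tol:
            return val, n
        prev = val
    raise RuntimeError("no agreement reached up to n_max")

print(gauss_until_converged(lambda x: np.exp(x) * np.cos(x)))
```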