Directory UMM :Data Elmu:jurnal:A:Advances In Water Resources:Vol23.Issue8.2000:

(1)

A least-squares penalty method algorithm for inverse problems of

steady-state aquifer models

Hailong Li

*

_{, Qichang Yang}

Institute of Biomathematics, Anshan Normal College, Anshan 114005, People's Republic of China

Abstract

Based on the generalized Gauss±Newton method, a new algorithm to minimize the objective function of the penalty method in (Bentley LR. Adv Wat Res 1993;14:137±48) for inverse problems of steady-state aquifer models is proposed. Through detailed analysis of the ``built-in'' but irregular weighting eects of the coecient matrix on the residuals on the discrete governing equations, a so-called scaling matrix is introduced to improve the great irregular weighting eects of these residuals adaptively in every Gauss± Newton iteration. Numerical results demonstrate that if the scaling matrix equals the identity matrix (i.e., the irregular weighting eects of the coecient matrix are not balanced), our algorithm does not perform well, e.g., the computation cost is higher than that of the traditional method, and what is worse is the calculations fail to converge for some initial values of the unknown parameters. This poor situation takes a favourable turn dramatically if the scaling matrix is slightly improved and a simple preconditioning technique is adopted: For naturally chosen simple diagonal forms of the scaling matrix and the preconditioner, the method performs well and gives accurate results with low computational cost just like the traditional methods, and improvements are obtained on: (1) widening the range of the initial values of the unknown parameters within which the minimizing iterations can converge, (2) re-ducing the computational cost in every Gauss±Newton iteration, (3) improving the irregular weighting eects of the coecient matrix of the discrete governing equations. Consequently, the example inverse problem in Bentley (loc. cit.) is solved with the same accuracy, less computational eort and without the regularization term containing prior information on the unknown parameters. Moreover, numerical example shows that this method can solve the inverse problem of the quasilinear Boussinesq equation almost as fast as the linear one.

In every Gauss±Newton iteration of our algorithm, one needs to solve a linear least-squares system about the corrections of both the parameters and the groundwater heads on all the discrete nodes only once. In comparison, every Gauss±Newton iteration of the traditional method has to solve the discrete governing equations as many times as one plus the number of unknown parameters or head observation wells (Yeh WW-G. Wat Resour Res 1986;22:95±108).

All these facts demonstrate the potential of the algorithm to solve inverse problems of more complicated non-linear aquifer models naturally and quickly on the basis of ®nding suitable forms of the scaling matrix and the preconditioner. Ó _{2000 Elsevier}

Keywords: Penalty method; Inverse problem; Scaling matrix; Preconditioner; Aquifer models; LSQR algorithm; Parameter identi®cation; Gauss±Newton method; Finite element method

1. Introduction

The traditional least-square methods for distributed parameter identi®cation of groundwater models mini-mize a quadratic objective function with respect to the parameters subject to the discrete governing equations of an elliptic or parabolic aquifer model (see e.g., [4±

6,8,9,14±16,21]). This means that the discrete equations of the aquifer models are regarded as an exact descrip-tion of the real aquifer. But in practice there exist not only measurement errors of heads but also errors of the conceptual models, of the discretization of the models, of the determination of boundary conditions and source terms etc. Hence the concept of measurement error must be generalized such that it includes also the other errors mentioned above. Such a generalization of the measurement error concept is particularly important in least-squares and maximum likelihood methods of groundwater inverse problems. For this reason, exact enforcement of the discrete governing equations may

www.elsevier.com/locate/advwatres

*

Corresponding author. Present address: Department of Earth Sciences, The University of Hong Kong, Hong Kong SAR, People's Republic of China. Tel.: +852-2857-8278; fax: +852-2517-6912.

E-mail addresses:[email protected], [email protected] (H. Li).

(2)

not be necessarily appropriate, and may actually de-grade the head and parameter estimation [2,15]. This is the ®rst motivation of this paper.

Bentley [2] formed a penalty method least-squares objective function by adding a general positive de®nite quadratic form of the discrete governing equation re-siduals into the traditional objective function and min-imized the objective function without any constraints using a ``damped direction-search and linesearch al-gorithm''. Then he suggested at the end of his paper on p. 147 that

Convergence, as well as selecting an optimum solu-tion technique, remain topics for future investiga-tion.

Hence, it is necessary to ®nd an eective algorithm to minimize the penalty method objective function. This is the second motivation of this paper.

The algorithm proposed in this paper can be brie¯y introduced as follows.

First, dierent from [2], residuals of the discrete governing equations are scaled by multiplying them with a ``scaling matrix'' determined adaptively by the current coecient matrix of the discrete governing equations. With the scaled residuals, a similar penalty method objective function to that in [2] is formed. Detailed nu-merical analysis shows that the discrete governing equation residuals are not distributed uniformly on the discrete nodes due to the wide value range of the discrete governing equation coecients and their ``built-in'' but

Notation

A _n_n_{coecient matrix of the discrete}

aquifer model (see (9), (10a)±(10c)).

C _n_n_p_n_n_p _precondition

matrix.

D_c2 _{a diagonal form of}C _{de®ned by (37)}

and (38).

H the piezometric head or watertable vector (see (5)).

hx;y the piezometric head or watertable,m.

J _n_n_m_n_p_n_n_p_{Jacobian matrix}

de®ned by (32) and (39).

Jg weighted square sum of the discrete

governing equation residuals (see (15) and (20)).

Jgmp the objective function of the penalty

method de®ned in (19).

Jm weighted square sum of the head

observation residuals (see (13) and (17)).

Jp weighted square sum of the

par-ameter prior information residuals (see (14) and (18)).

Jt traditional objective function de®ned

in (16).

n the number of the discrete nodes. nm the number of observation wells.

np the number of the unknown

par-ameters (see (4a), (4b) and (11)).

P _n_n_p Jacobian matrix de®ned in (39), (41) and (42).

P np-dimensional unknown parameter

(log-transimissivity) vector de®ned by (11).

Rg n-dimensional discrete governing

equation residual vector de®ned by (15).

Rgm an nm-dimensional vector composed

of the components of Rg whose

in-dices equal those of the nodes co-incident with the head observation wells.

Rgr annÿnm-dimensional vector

com-posed of the rest of the components ofRg except those ofRgm.

Rm nm-dimensional head observation

residual vector de®ned by (13). Tx;yandTi unknown transmissivity, m2_{/day (see}

(1), (3), (4a), (4b) and (11)).

U _n_nscaling matrix (see (15)).

U_E a diagonal form of U de®ned by (25).

U₁ a diagonal form ofUde®ned by (24).

W block diagonal matrix Diagk2_gW_g_; W_m_;_k_pW_p_.

W_g _n_n_{weighting matrix of the residual}

vectorRg.

W_m _n_m_n_m _{weighting matrix of the}

residual vectorRm.

W_p _n_p_n_p _weighting _matrix _of _the

parameter prior information residual vectorRp.

Yi 16i6np log-transimissivity (YilogTi).

Cd andCn the Dirichlet and Neumann

boundaries.

DP the dierence vector between the computed and true values of the parameter vectorP.

k2_g weighting factor of the penalty term Jg(see (19) and (20)).

kp weighting factor of the regularization

termJp (see (16) and (18)).

v nnp-dimensional vector de®ned byHT_;_PTT

(3)

irregular weighting eects on these residuals. This ir-regularity of the residuals can be eciently improved by using the above mentioned scaling matrix.

Second, based on the generalized Gauss±Newton method [11,12,21], every iteration minimizing the pen-alty method objective function is formulated with a linear least-squares system whose unknowns are the corrections of both the unknown parameters and the groundwater heads on all the discrete nodes. The num-ber of linear equations contained in this linear system has to be more than or at least equal to the number of unknowns in it. So the number of observation wells has to be more than or at least equal to the number of un-known parameters when there is no regularization term containing prior information on the unknown par-ameters in the objective function.

Finally, a suitable algorithm is chosen to solve this linear least-squares system which is usually ill-con-ditioned. There are several sources of this ill-conditioned state, e.g., the ill-posedness of the inverse aquifer problems themselves, the wide value range of the dis-crete governing equation coecients mainly caused by the non-uniformity of the triangulation and the wide range of the transimissivity values, the deterioration of the condition number of the least-squares systems due to large weighting factor kg of the penalty term used to

enforce the discrete governing equations, etc. Moreover, this linear least-squares system is a large sparse system incorporating the discrete governing equations. In order to overcome these diculties, an iterative least-squares QR factorization (LSQR) algorithm for sparse linear least-squares problems [17,18] is chosen to solve it. At the same time, a simple but eective preconditioner is implemented to improve the ill-conditioned state.

Numerical results demonstrate that if either the dis-crete governing equation residuals are not scaled or no preconditioning technique is adopted, then the penalty method algorithm performs poorly due to the ill-con-ditioned state, e.g., the computation cost is higher than that of the traditional method, and what is worse is, the calculations fail to converge for some initial values of the unknown parameters. But the situation can be im-proved signi®cantly by some naturally chosen diagonal forms of the scaling matrix and the preconditioner.

Due to the sparse iterative data structure of the LSQR algorithm, the size of the discrete governing equations of our algorithm can be as large as that of the traditional methods.

In every Gauss±Newton iteration of our algorithm, one needs to solve the linear least-squares system only once. In comparison, every Gauss±Newton iteration of the traditional methods has to solve the discrete gov-erning equations as many times as one plus the number of unknown parameters if one uses the parameter per-turbation and numerical dierence quotient method, or as many times as one plus the number of head

obser-vation wells if one uses the state adjoint sensitivity equation method (variatonal method) [21]. The total number of iterations required to minimize the penalty method objective function is less than or equal to the number required for the two traditional methods tested. All these facts demonstrate the potential of this method to solve inverse problems of more complicated non-linear aquifer models naturally and quickly on the basis of ®nding suitable forms of the scaling matrix and the preconditioner.

The paper is organized as follows. In the next part, the penalty method objective function and its minimiz-ing algorithm based on the Gauss±Newton method are given in detail both for a linear aquifer model and a non-linear model described by the elliptic Boussinesq equa-tion. In Section 3, two numerical examples are given. One of the examples is taken from [2,6]. The other is the elliptic Boussinesq equation describing an uncon®ned aquifer [1,19]. Through these numerical examples, the eects and utility of both the scaling matrix and the preconditioner are illustrated. At last, the conclusions are drawn on the basis of the former sections and problems that remain to investigate are stated.

2. Algorithm

2.1. Objective function of penalty method

The inverse problems of typical groundwater ¯ow equations in con®ned and uncon®ned aquifers will be used as examples to illustrate the algorithm of the least-squares penalty method. Consider two-dimensional steady ¯ow described by

ÿr sTrh ÿQBhÿhr 0; x;y 2X 1

in inhomogeneous, isotropic aquifers subject to bound-ary conditions

hhdx;y; x;y 2Cd; 2

sToh

o_nqx;y; x;y 2Cn; 3

where

X: Groundwater ¯ow region. It is a given bounded horizontal domain inx;yplane.

Cd and Cn: The Dirichlet and Neumann boundaries

ofX, respectively.Cd[CnoX:

x;yand r: Spatial variables and the gradient operator.

hx;y: Groundwater head of a con®ned aquifer or watertable of an uncon®ned aquifer. Unknown. Only ®nite measurement values at given observation wells are available.

(4)

problem. For uncon®ned aquifer s hÿhbx;y

=M [1,19], wherehb denotes the given elevation of

the aquifer bottom and it has the same datum plane ash, andMis a positive constant describing the av-erage thickness of the uncon®ned aquifer. In this, case (1) represents the elliptic Boussinesq equation and (1)±(3) becomes a non-linear problem.

Tx;y: Transmissivity of the aquifer. Unknown. In this paper it is assumed thatTis piecewise constant in X, i.e., there arenp given subdomains (zones) Xi

whereTi16i6npare unknown positive constants. Qx;y: Recharge term determined by factors such as source, sink, precipitation and evapotranspira-tion etc.

Bx;yand hr: Speci®ed leakage parameters. We

assume B60 for the sake of the well-posedness of (1)±(3).

o₌on: Normal derivative with respect toCn.

hdx;y: Speci®ed head or watertable onCd.

qx;y: Speci®ed ¯ux onCn.

The discrete governing equations of (1)±(3) are the starting point of our method. One can discretize gov-erning equations (1)±(3) using various methods such as ®nite-dierence, ®nite-volume or ®nite-element method according to one's preference. In this paper we use linear triangle ®nite element method to discretize (1)±(3). Let

fxi;yi;16i6ng be the position of the nodes of the triangulation, and /ix;y16i6n be the linear basis function corresponding to the ith discrete nodexi;yi. Let

H: h1;. . .;hn T

5

be the head or watertable vector, where hi denotes the ®nite element approximation tohxi;yi, superscript ``T'' indicates the transpose of a vector or matrix. Then the recharge term Q, the leakage parameters B andhr,

are constants on each triangle elemente, i.e.,

Yij_eYe; Qj_e Qe; Bj_eBe; hrj_eher: 8

By applying the standard Galerkin ®nite element pro-cedure, the discrete governing equations of (1)±(3) can be written in matrix form as follows (see e.g., [19]):

A_H_;_P_H_F_; ₉

whereAis annn matrix with elements

aijsijbij; 16i; j6n 10a

positive weighting constant that is chosen large enough compared with all the other diagonal entries ofA_{for the}

discrete Dirichlet boundary conditions to be enforced at least as accurately as other discrete governing equations.

eis the triangle element,T_i_{is the set of all the triangle} elements in the neighborhood of the ith node

xi;yi16i6n), i.e., T_i_f_e_j_xi_;_yi_{is a vertex of} _e_g_.

F is an n-dimensional vector determined by forcing terms such as Q,B,hr and the boundary conditions of

(1)±(3), whose components can be written as

fi wdhdxi;yi; i2 gate gradient method or the Cholesky decomposition method is used to solve the discrete aquifer model (9), otherwise s hÿhb=M, (9) is a non-linear problem and it is solved by iterative method.

As in [2], the observation head residual vectorRmand

parameter prior information residual vector Rp can be

de®ned by

RmH HsÿHm; 13

RpP PÿP; 14

whereHs andHm arenm-dimensional vectors whose

re-spective components denote the simulated and measured head values at the observation wells;nmis the number of

(5)

RgH;P UA_H_;_P_H_ÿ_F_; ₁₅

whereU_{is an}_n_n_{given non-singular ``scaling'' matrix}

used to scale the elements of Rg. Later we will discuss

how to choose it. Then the traditional least-squares method for aquifer parameter identi®cation can be written as [4±6,8]

matrix determined by the measurement errors of the head, W_p _{is a given} _n_p_n_p _{symmetric semi-positive}

de®nite matrix determined by the errors of the prior information onP,kp is a non-negative weighting factor

used to measure the exactness and reliability of the prior information on the unknown parameters compared to those of the information on the models and head measurements.

The least-squares penalty method can be written as

min

is the penalty term to enforce the discrete governing equations,W_g_{is a symmetric positive de®nite weighting}

matrix, and kg is a weighting factor to control the

en-forcement of the discrete governing equations. In this paper we will setW_gto beI_n, i.e., thenth-order identity matrix.

The least-squares penalty method objective function in [2] is the special case of (19) when UI_n. Such a choice ofUwill cause non-uniformity of the components ofRgbecause the matrixAin (15) has irregular weighting

eects due to the great value range in the elements ofA_.

In order to analyse the weighting eects ofA_whenU

andW_g_equalI_n_{, we de®ne 0}_<_r₁6r26 6rnas the eigenvalues of the matrixATA _and_Vi ₍₁6i6n) as the corresponding unit orthogonal eigenvectors. Then

V_:_V₁_;_{. . .}_;_Vn ₂₁

is an orthogonal matrix. De®neRh :HÿAÿ1F as the

computational residual vector between the heads of least-squares method and those of direct method. Then from (15), (20) and (21) we have

Eq. (22) implies that the weighting factor of the ith component of VT_R_h is k2gri, 16i6n. Due to the

non-uniformity of the spatial discretization and the large range of the transimissivityTvalues, the value range in the elements of A_{is great. This will result in large}

con-dition number ofATA, namely the ratiorn=r1[20]. For

instance, in the simple synthetic examples in [2], our calculations show that the largest and smallest diagonal elements ofATAare respectively 4:5105_{and 37.5 with}

the latter corresponding to the node at the ``northeast corner'' of the domain in the example (Fig. 1 in [2] or this paper). It is well known thatr1is less than or equal

to the smallest diagonal element of ATA_and_r_n_greater

than or equal to its greatest diagonal element. Hence the ratio rn=r1P4:5105=37:51:2104. This explains

the interesting phenomenon in the numerical examples in [2] that the node with the largest residual is always at the northeast corner of the square domain, because its weighting factor corresponds to the smallest eigenvalue of ATA_.

Hence we reach the following conclusion: only by minimizing (19) with a scaling matrix U that can ap-propriately rectify the non-uniform or irregular weighting eects of ATA on the components of the re-sidual vector Rg, one can get a scaled ``stable'' residual

vector Rg independent of the wide value range of the

entries of Aand make it possible to exploit the deeper potential information contained in Rg. Moreover, the

scaling matrix will improve the illcondition caused by the wide value range of the entries ofA_.

Therefore, a suitable scaling matrix U _{is necessary,}

especially for realistic problems with a wide range of entries in A_{. In this paper, three simple forms of the}

scaling matrix U _{are given. Their eects will be shown}

and compared in the next section. These forms are: (a) the identity matrix, i.e.,

UI_n_; 23

where there is no ``scaling'' or improvement as in [2]; (b) the in®nite row norm form

UU₁_:Diag aÿ111;. . .;aÿ1nn

; 24

where Diag. . .indicates a diagonal matrix, andaiiis theith diagonal element ofA_;

(c) the Euclidean row norm (2-norm) form

UU_E_:Diag sÿ1

Although the above forms ofU _{given by (24) and (25)}

(6)

scaling the irregular weighting eects of A _{with the}

simplest calculation without changing the original pur-pose of the least-squares penalty method. This means that the penalty termJgis corrected adaptively at every

iteration according to the change of the entries of A

mainly caused by changes in the parameter vectorP.

2.2. Algorithm

Let

v v1;. . .;vn;. . .;vnnp T

: HT;PTT: 27

Then the objective function Jgmp de®ned by (19) is a

non-linear function of the variable vector v. The minimization must be carried out iteratively. Let iit be

the iteration index andv0_{be a given initial estimate of}_v_.

At every iteration we wish to determine a correction_Dv

of the current iterative value ofv, i.e.viit_{, such that}

viit1_viit_D_v_; _i

it0;1;2;. . . 28

results in a reduction of the objective functionJgmp. The

Gauss±Newton method for obtaining Dv leads to the following linear system of equations [11,12,21]

JTWJ_Dv ÿJTW_R_; 29

where

WDiagk2gWg;Wm;kpWp 30

andRis the entire residual vector composed of Rg, Rm

andRp, i.e.,

J _{is the Jacobian or sensitivity matrix when}_v_viit_with

elements

cause the node numbernis usually very large. Hence it will be time-consuming and require substantial com-puter-storage to solve (29) directly. Note that (29) is equivalent to the following linear least-squares system

min

square root or Cholesky decomposition of the positive de®nite matrixW_{, i.e.,}

WTW W_: 34

Instead of solving (29) directly, we consider the linear least-squares system (33) which can be solved using the iterative LSQR method [17,18] or the direct QR fac-torization method [10,12]. In our numerical tests both methods are tried. For relatively small n (e.g., 6100)

both work well. When n is larger, the direct QR fac-torization method needs more storage and its speed is comparable to the iterative method. Hence the iterative LSQR method is prefered, because it can use a sparse matrix data structure to store the non-zero elements of

J. Consequently, the node numbern can be as large as that of the traditional method.

In order to solve (33) eciently and accurately, a suitable preconditioning technique is needed. LetCbe a suitable nnp nnp non-singular matrix which can improve the condition number of the matrixW JC _. Then instead of solving (33) directly, we ®rst solve the following system about an nnp-dimensional un-known vector n

The simplest form ofCis a diagonal matrix. We wish to improve the condition number of W JC by choosing a diagonal matrix C_:D_c2 such that all the diagonal el-ements of D_c2JTWJD_c2 is equal to 1. This leads to the tests this simple procedure accelerates the convergence of the iterative LSQR algorithm signi®cantly and it also widens the range of initial values of the unknown parameter P within which our algorithm can converge (see Section 4, Table 1).

The main work in implementing the above algorithm is to calculateJ_{. By calculating directly using (4a), (4b),}

(5)±(9), (10a)±(10c), (11)±(15), (27), (31) and (32), setting the matrixU_{according to the entries in}A_{of the current}

iteration and regarding U _{as a constant matrix}

(7)

wheres0

j os=ohxj;yj;hj, andNi is the serial num-ber set of all the nodes in the neighborhood of the ith nodexi;yi. For linear aquifer modelsZ₀because of s1. M_w is an nmn matrix whose ith row is an n

-dimensional vector with itsnith component being 1 and the others 0. Here ni16i6nm indicates the serial number of the node located at theith head observation well.I_n_p is thenpth-order identity matrix.Pis annnp

matrix with elements

pij X

k2N_i

o_aik

o_Yjhk

0 if i2D ln 10P

k2N_i

hk P e2Ti\T_j

R e10

Ye

ejsr/_i r/_k dxdy else; (

41

where

ej:oY

e

o_Yj

1 wheneXj; 0 whene6Xj;

16j6np 42

and Xj is de®ned by (4a) and (4b). Here it is assumed that each triangle element eis contained by one of the subdomainsX1;. . .;Xnp.

Based on the above discussion, the penalty method algorithm can be described as follows:

1. Initialization. Choose a suitableP0as the initial value

ofP. The initial valueH0ofHis obtained by solving

(9) when PP0. Set v0 H0T;P0T T

, J00. Choose

two suitably small numbers dv>0 and dJ >0 as the convergence criteria of the Gauss±Newton itera-tion. Choose two suitable np-dimensional vectors

Pÿ _and _P _{as the lower and upper bounds of the}

parameter vectorP.

2. Foriit0;1;2;. . .repeat steps 3±5.

3. Implement Gauss±Newton method.

(a)Calculate Jacobian matrixJ_{. Calculate}_Jij_j vvi_it 16i6nnmnp;16j6nnp using (39)±

(42), while the matrix U _{in (39) is determined}

by the current entries inA. (In numerical exam-ples three forms of U de®ned by (23)±(26) are used and their utilities are compared.)

Table 1

Comparison of the eects of dierent residual-scaling matrixUand the preconditionerCon the computational costa Method Uniform initial trasmissivityT(m2_/day)

1.0 10.0 100.0 200.0

(1)UI_n_; CI_n_n

p

CPU time (s) 1 1 134.4 177.0

GN Lsqr Slsqr Failed Failed 9(174)3342)16360 11(151)3079)21557

(2)UIn; CDc2

CPU time (s) 1 71.1 51.5 55.0

GN Lsqr Slsqr Failed 14(475)710)8659 11(374)719)6264 12(180)714)6695

(3)UU₁; CInnp

CPU time (s) 1 51.4 33.0 47.0

(4)UUE; CInnp

CPU time (s) 1 51.1 32.8 47.0

(5)UU₁_; CD_c2

CPU time (s) 32.6 25.8 16.3 23.5

GN Lsqr Slsqr 18(199)279)3974 15(205)231)3142 10(187)206)1980 14(197)207)2846

(6)UU_E_; CD_c2

CPU time (s) 31.5 25.5 17.9 23.0

GN Lsqr Slsqr 18(193)265)3840 15(202)228)3104 11(183)203)2175 14(195)203)2798

(7) B & Y:

CPU time (s) 18.6 16.7 19.5 23.0

BY 12 11 13 15

(8)C&N:CN 71 54 57 90

a

GN: total Gauss±Newton iterations.Lsqr: LSQR iterations within the Gauss±Newton iteration. It diers for dierent Gauss±Newton steps, hence only its range is listed in parentheses. Every iteration of LSQR includes 3nnpnm 5nnpmultiplications and two products between matrices and vectors de®ned byW JM nandW JM T

(8)

(b)CalculateDv. CalculateDvby solving the lin-ear least-squares system (35) and (36) using itera-tive LSQR algorithm [17,18].

(c) Updatev. Calculateviit1 _{using (28).}

4. Verify the convergence. Calculate the value of Jiit1:Jgmp using (19). If

jJiit1ÿJiitj<dJ or jDvj1<dv 43

holds, stop, wherejDvj₁indicates the in®nity norm of

Dv. else set iit:iit1.

5. Implement the upper and lower bound constraints to P. LetPiit1_{be the last corresponding}_n

pcomponents of

viit1_{. Judge if} _Pÿ₆_Piit1₆_P _{holds, where the}

in-equality refers to the corresponding components of the three vectors. If there are components of Piit1

which are less than their lower bounds or greater than their upper bounds, then set them to equal their lower or upper bound values, respectively. Then set the last correspondingnp components ofviit1 to equal those

ofPiit1 _{and go to step 3.}

3. Numerical examples

3.1. Example of linear aquifer model

In order to demonstrate the utility of our algorithm, the steady-state test problem used by Carrera and Neuman in [6] and Bentley in [2] will be solved. It can be described by specifying the factors in the aquifer model (1)±(3) as follows (see [2,6]):

X. A square domain of 6000 m6000 m (see Fig. 1).

Cd and hd.Cd is the southern side of the square

do-main, hd100 m.

Cnand q. The other three sides of the domain areCn.

On the western side the ¯uxq0:25 m2_{/day, on the}

northern and eastern sides there is no ¯ow, i.e., q0:

TransmissivityTx;y. The domainXis divided into np9 constant-transmissivity zones of

2000 m2000 m with the corresponding transmis-sivity values Ti16i69 ranging from 5 to 150 m2_{/day (see Fig. 1).}

sx;y;h,Bx;yandhr. The aquifer is assumed to be

con®ned, i.e., sx;y;h 1. The stratum overlying and underlying the aquifer are supposed to be im-pervious, i.e., Bx;y 0. HenceBhr0.

Qx;y. In the upper half part of the northern three zones Q0:27410ÿ3 _{m/day, in the lower part of}

them Q0:13710ÿ3 _{m/day, in other six zones}

Q0.

Head measurements. They are available at the 18 ob-servation wells shown as points in Fig. 1.

In computation, the ¯ow region is discretized into 1212 squares, each of which is divided into two

tri-angle elements. All the head observation wells coincide with the discrete nodes. In order to obtain head mea-surements at observation wells, as in [2,6], the ``true'' head values at discrete nodes, which range from 100.0 m on the southern Direchlet boundary to 205.88 m at the north-east corner, were numerically calculated using the true transmissivity values shown in Fig. 1. Then head values at the 18 observation wells were corrupted by uncorrelated Gaussian noise of variance 1. Conse-quently the head measurement weighting matrix W_m in (17) is set to be thenmth-order identity matrix as in [2,6].

To test the utility of our algorithm, the regulation termJpcontaining prior information onPis not used in

all our calculation, i.e.,kp0. ButP is restricted from

below and above by choosing Pÿ _{with uniform}

com-ponents log 0:01 ÿ2 andP_{with uniform components}

log 103_{3. The Gauss±Newton iteration stop criteria}

are chosen to be dJ 10ÿ7anddv10ÿ4.

The numerical tests are composed of two parts. The ®rst part is designed to investigate the eects of dierent forms of the residual scaling matrixU_{and the}

preconditionerC _{on the calculation speed and}

numeri-cal results. For this purpose, three dierent forms ofU, i.e., I_n, U₁ and U_E de®ned by (23)±(26), two dierent forms ofC, i.e.,I_nnp andD_c2 de®ned by (37) and (38), and four dierent but uniform initial values of Ti16i6np9 as in [6], namely, 1, 10, 100 and

200 m2_{/day, are used combinatorially. The weighting}

factor kg is respectively ®xed to be 1:0 when UIn

considering the great equivalent weighting eects ofATA

and to be 10.0 for the other two forms of U_{de®ned by}

(24)±(26) because of their counteracting the weighting eects of ATA_. _Hence _there _are _altogether

42324 dierent cases.

(9)

In Table 1 the relevant data on the computational costs corresponding to the 24 cases are summarized.

The ®rst eight rows corresponding to methods 1±4 show that if either the discrete governing equation re-siduals are not scaled (i.e.,UI_n), or no precondition technique is adopted (i.e., CI_nn

p), our algorithm

performs poorly, e.g., the computation cost is higher than that of the traditional method, and what is worse is the Gauss±Newton iterations fail to converge for some initial values of the unknown parameters. But the situ-ation is always improved with the gradual improvement ofU _andC_.

The successive four rows corresponding to methods 5 and 6 demonstrate the improvements due to setting

UU₁_orU_E _{(Eqs. (24)±(26)) and}CD_c2 _{(Eqs. (37)}

and (38)). The range of initial values ofTfor which the Gauss±Newton iteration converges has increased, there are no longer ``Failed'' calculations for all the four dif-ferent initial values ofT; and the computational eort is also signi®cantly reduced. For example, when the initial values of Pare uniform log 100 m2_/day, _S

lsqr, the

num-ber of LSQR iterations is reduced from 16 360 for

UI_n and CI_nnp to 1980 for UU₁ and

CD_c2. And the CPU time is reduced proportionally. In the next two rows that begin with ``B & Y'' we show the computational eort of the method described by Becker and Yeh [3,21], which uses a Gauss±Newton method to minimize the objective functionJtde®ned by

(16) means of by the parameter perturbation and nu-merical dierence quotient method. Comparison of the CPU times and the number of iterations required by the ``B & Y'' method (i.e.,BY) to the results from the im-proved scaling matrixes demonstrates that the compu-tational cost of our algorithm is reduced to a level similar to that of the traditional method. This trend shows potential of our algorithm to solve inverse problems more quickly than the traditional method on the basis of ®nding more suitable forms ofU andC.

In the last row,CN, the number of iterations required by the ``conjugate gradient algorithm coupled with Newton's method'' used by Carrera and Neuman in [6] to minimize the traditional objective functionJtde®ned

by (16), is quoted. One can see thatGN, the number of Gauss±Newton iterations required by our algorithm, is less thanCN. Unfortunately, the computational costs of our algorithm and that used in [6] in every iteration minimizing the objective functions are dicult to com-pare due to the lack of relevant data.

The ®ve Failed cases in Table 1, which mean that the iterations fail to converge, can be improved so that the iterations converge if the Marquardt's method [7,12] is incorporated in our algorithm. But it takes more itera-tions and CPU times than those of the other unfailed cases listed in Table 1 where the Marquardt's method is not used. The detail will not be stated here due to space limitation.

In Table 2 the numerical results corresponding to the 24 cases mentioned above and to the traditional method by Becker and Yeh [3,21] are summarized. Theoretically, (i.e., if there are no round o errors,) dierent forms of non-singular matrix C do not in¯uence the calculated results. The numerical results support this fact: the op-timization results neither depend on the initial values of

T, nor on the change of C up to ®ve signi®cant ®gures. Hence, only results corresponding to the three dierent forms of U are listed in Table 2, and they are valid for all the combinations of dierent initial values of Tand forms of C _{in Table 1 which are not Failed. The last}

column is the results calculated by using the traditional method of Becker and Yeh.

The ®rst four rows in Table 2 give the values ofJt, the

traditional objective function that appeared in (19), of Jg, the weighted square sum of the discrete governing

equation residuals de®ned by (20), of Jgmp, the penalty

method objective function de®ned in (19) and of the ratio Jg=Jgmp successively. It is well-known that greater

the weighting factors of residuals in a least-squares system, smaller these residuals will be. Hence the ratio Jg=Jgmp can be regarded as a measure of the weighting

eects of kgU on the discrete governing equation

resid-uals. According to this measure the weighting eects of kgU10:0UE are the weakest, and those of

kgU1:0In the strongest.

The data listed under ``Identi®ed T '' and ``The Statistic Data of logT'' demonstrate that the numerical results ofTdier very slightly when they are calculated by using our algorithm with the three dierent choices of

kgU and when they are obtained by the traditional

method of Becker and Yeh. For example, rDP, the standard deviation of the residuals between the com-puted and true log-transmissivity, ranges only from 2:7810ÿ3 _{to 3}_:₃₃₁₀ÿ3_{, and SSE}

DP, the standard square error between the computed and true log-trans-missivity, from 2:8110ÿ3_{to 3}_:₃₄₁₀ÿ3_.

One may conjecture that the discrete governing equations will be less accurately satis®ed at the nodes coincident with the head observation wells than at other nodes due to the measurement error at the head obser-vation wells. This conjecture can be seen and understood clearly if ``The statistic data of Rgm and Rgr'' are

com-pared row by row, where Rgm is an nm-dimensional

vector composed of the components ofRgwhose indices

equal those of the nodes coincident with the head ob-servation wells andRgris annÿnm-dimensional vector

composed of the rest of the components of Rg except

those of Rgm. For example, SSERgm, the standard

square error ofRgm, is much greater than SSERgrfor

all the three choices of kgU. The ratio SSERgm/

SSERgr ranges from 2.16 to 9.69. The other pair of statisticsjRgmj₁andjRgrj₁, the respective in®nity norms ofRgmandRgr, also show the similar phenomenon with

(10)

is to say, statistically, the residuals of the discrete gov-erning equations discretized at the head-observation-well nodes are much greater than those of the discrete governing equations at other nodes. Therefore, the above-mentioned conjecture is true in the sense of statistics.

If ``The statistic data of Rgm andRgr'' are compared

column by column, the following interesting facts can be found: among the three group values listed in the three columns, the values ofk2_gSSERgmandk2_gSSERgrin the ®rst column corresponding tokg U1:0In are always

the smallest, but the values of jRgmj₁ andjRgrj₁ in the ®rst column are always the biggest. These results imply that the components of the residual vector Rg when UI_n are irregularly distributed and they have been improved whenUis set to be U₁ orU_E.

The second part of our numerical tests are designed to investigate the eects of the weighting factorkgon the

numerical results. To this end the initial values ofTare ®xed to be uniform 100 m2_/day, _U_U

E andCDc2,

then our algorithm is used to minimize the penalty method objective functionJgmp corresponding to

dier-entk2_g, which is respectively set to be 10:0; 102_; ₁₀4_; ₁₀6

and 108_{. The results are listed in Table 3.}

First let us look at the data listed above ``The com-putation cost'' in Table 3. If they are compared column by column, the following facts can be seen:

1. Ask2_gincreases from 10.0 to 106_{, the values of}_J t and

Jgmp tend to the same constant 16.2 increasingly, and

the values ofJgdecreases with an approximate rate of

kÿ2_g . So the ratioJg=Jgmpdecreases also with the same

approximate ratekÿ2_g . In fact, it has been theoretically proven that Jt, Jgmp and Jg have the asymptotic

properties

lim

kg!1Jtkg!1lim JgmpC1; kg!1lim k 2

gJgC2;

where C1is the value ofJt corresponding to the

tra-ditional-method solution to problem (16),C2is a

non-negative constant that vanishes if and only if C1

vanishes [13]. The numerical results in Table 3 show these properties clearly when k2g6106 with the

con-stants C1 and C2 approximating 16.2 and 94:1,

re-spectively. Unfortunately, by k2g108, they begin to Table 2

Comparison of the numerical results of our algorithm and of the traditional methoda

U I_n U₁ U_E Traditional

kg 1.0 10.0 10.0 method

Jt 16.0 14.8 14.5 16.2

Jg 0.115 0.674 0.839 0.0

Jgmp 16.1 15.5 15.3

Jg=Jgmp 0.715% 4.35% 5.47% 0.0%

True T Identi®edT (m2_/day)

T1150 158.71 159.10 158.51 160.66

T2150 137.20 134.04 133.32 136.71

T350 51.608 51.667 51.652 51.679

T4150 123.02 120.77 120.18 123.25

T550 49.756 49.973 50.016 49.758

T615 15.516 15.482 15.474 15.511

T750 66.537 67.600 67.651 67.256

T815 15.028 14.998 14.991 15.025

T95 5.0254 5.0329 5.0329 5.0310

The statistic data oflogT

EDP 5.8910ÿ3 _4.90₁₀ÿ3 _4.24₁₀ÿ3 _7.05₁₀ÿ3

rDP 2.7810ÿ3 _3.25₁₀ÿ3 _3.33₁₀ÿ3 _2.96₁₀ÿ3 SSEDP 2.8110ÿ3 _3.27₁₀ÿ3 _3.34₁₀ÿ3 _3.01₁₀ÿ3

The statistic data ofRgmand Rgr k2

gSSERgm 1.3110ÿ3 2.0110ÿ2 2.5010ÿ2 0.0

k2

gSSERgr 6.0510ÿ4 2.0810ÿ3 2.5810ÿ3 0.0

SSERgm=SSERgr 2.16 9.66 9.69

jRgmj1 0.140 0.0396 0.0373 0.0

jRgrj1 0.0458 0.0170 0.0190 0.0

jRgmj1=jRgmj1 3.06 2.33 1.96

a_C_{and the initial values of}_T_{can be any one of the combinations in Table 1.}

DPis the dierence vector between the computed and true values of the parameter vectorPde®ned by (11).Rgmis annm-dimensional vector composed of the components of the discrete governing equation residual vector

Rg (see (15)) whose indices equal those of the discrete nodes coincident with the head observation wells.Rgr is annÿnm-dimensional vector composed of the rest of the components ofRgexcept those ofRgm. For an arbitrarym-dimensional vector: e1;. . .;em

T

, which can be set to beDP

or Rgm or Rgr, respectively, the vector functionsE, r, SSE andj j1 are de®ned as follows: E: 1=m Pm

i1ei indicates the mean; r: 1=mPm

i1eiÿE 2

is the standard deviation; SSE: 1=mPm

(11)

lose part of their ®delity due to the serious deterio-ration of the condition number of the weighting matrixW de®ned by (30).

2. When k2_g10:0, Jg takes comparatively large value

4.06 and the ratio Jg=Jgmp reaches 36.9%, implying

that much of the head measurement noise is ``ab-sorbed'' by the discrete governing equation residuals. This leads to the large errors in the estimated values of Ti16i69 (see the second column listed under

k2_g10:0). When k2_gP102_{, accurate and stable}

esti-mates ofTi16i69are obtained due to the sucient enforcement of the discrete governing equations, even in the case ofk2_g108_{. They also dier very slightly}

from each other. Moreover, if one compares our re-sults ofTi16i69when 104₆_k2

g610

8_{with the}

tra-ditional method results ofTilisted in the last column of Table 2, one can easily see that eachTi16i69

always has the same ®rst three signi®cant ®gures with only one exception ofT767:5 whenk2g108.

3. The facts stated in the above paragraph are also sup-ported by the data listed under ``The statistic data of logT''.

In the last two rows the ``CPU Time'' and Slsqr, the

sum of LSQR iterative numbers in all the Gauss±New-ton iterations, are listed for dierent values ofk2_g. One may see that although they dier from each other, the average CPU time is only 18.9 s, which is a little less than 19.5 s, the CPU time of the traditional method corresponding to the method 7 B & Y and initial T 100 m2_{/day in Table 1. Particularly, when}_k2

g104,

the CPU time of our algorithm is only 11.1 s, 56.9% of that of the traditional method.

In short, there is a suciently wide workable range of

kg values within which the penalty method algorithm

performs well enough to give accurate estimations of the unknown parameterTwith low computational cost just like the traditional method.

3.2. Example of non-linear aquifer model described by Boussinesq equation

In order to demonstrate the advantage of our algo-rithm that it can solve inverse problem of non-linear aquifer models naturally and quickly, an uncon®ned quasilinear aquifer model described by the Boussinesq equation [1,19] is arti®cially designed by specifying the factors in the aquifer model (1)±(3) as follows:

1. The aquifer-choice function sx;y;h 1=M hÿhbx;y, where hb: ÿ1=400xy, M :

160:0 m.

2. The leakage parameters Bx;y: ÿ5:010ÿ7

dayÿ1_; _h

r0 m.

3. All the other factors, such asX,Cd andhd,Cn andq,

transmissivity Tx;y determined by the average thickness of the uncon®ned aquifer, rechargeQx;y

and watertable measurement locations are the same as those described in the example of Section 3.1 and Fig. 1.

As before, the ¯ow region X is discretized into 1212 squares each of which is divided into two

Table 3

Comparison of the eects of dierent weighting factorkgon the numerical resultsa k2

g 10:0 10

2 ₁₀4 ₁₀6 ₁₀8

Jt 6.95 14.5 16.2 16.2 16.2

Jg 4.06 8.3810ÿ1 9.3910ÿ3 9.4110ÿ5 6.6410ÿ1

Jgmp 11.01 15.3 16.2 16.2 16.9

Jg=Jgmp 36.9% 5.47% 0.058% 0.000581% 3.94%

T1150 130.14 158.51 160.68 160.70 160.76

T2150 104.24 133.32 136.69 136.72 136.71

T350 50.335 51.652 51.675 51.675 51.677

T4150 94.447 120.18 123.12 123.15 122.96

T550 50.690 50.016 49.754 49.752 49.756

T615 15.254 15.474 15.511 15.511 15.511

T750 70.870 67.651 67.326 67.325 67.495

T815 14.779 14.991 15.024 15.024 15.021

T95 5.0383 5.0329 5.0311 5.0311 5.0311

The statistic data oflogT

EDP )2.8510ÿ2 4.2410ÿ3 7.0110ÿ3 7.0310ÿ3 7.0910ÿ3

rDP 9.4410ÿ3 _3.33₁₀ÿ3 _2.95₁₀ÿ3 _2.95₁₀ÿ3 _2.99₁₀ÿ3 SSEDP 1.0310ÿ2 _3.34₁₀ÿ3 _3.00₁₀ÿ3 _3.00₁₀ÿ3 _3.04₁₀ÿ3

The computation cost

CPU time (s) 14.7 17.9 11.1 25.4 25.5

GN Lsqr Slsqr 12(138)156)

1782

11(183)203)

2175

7(175)205)

1354

14(104)325)

3089

20(71)506)

3097 a

(12)

triangle elements. In order to obtain watertable mea-surements at observation wells, the true watertable val-ues at discrete nodes were numerically calculated using the true transmissivity values shown in Fig. 1. Then the watertable values at the 18 observation wells were cor-rupted by uncorrelated Gaussian noise of variance 1.

W_m,kp,Pÿ andP,dJ anddv are also the same as the

preceding example. The numerically calculated true watertable in X ranges from 100.0 m on the southern side to 168.43 m at the northeast corner ofX.

In Table 4 numerical results of T corresponding to two dierent methods are compared. In the ®rst column the true transmissivities of the nine zones are listed in parentheses. The second column is the results of the penalty method algorithm when UU_E_, CD_c2 _and

kg10:0. The third column is the results of the

tra-ditional method described by Yeh and Becker in [3] and [21]. One can see that the two results are almost equal and they are very close to the true values ofT. The data of EDP, rDPand SSEDP show that there are no

signi®cant dierences between the two sets of values of identi®ed T.

In Table 5 the computational costs of the above mentioned two methods are compared. The ®rst two rows are the results of the penalty method algorithm, the next two rows the results of the traditional method by Becker and Yeh.

From Table 5 one can see the following facts: 1. For the non-linear Boussinesq aquifer model and the

four dierent initial uniform values of Ti16i69, bothSlsqr, the sum of LSQR iterative numbers in all

the Gauss±Newton iterations, and the CPU time of our algorithm do not increase signi®cantly compared with those when the aquifer model is linear, i.e., those corresponding to method 6 in Table 1. For one case both even decrease, but the CPU time of the tra-ditional method by Becker and Yeh increases by 9±12. 16 times compared with that when the aquifer model is linear, i.e., that corresponding to ``B & Y'' in Table 1. On the other hand, the comparison of the CPU times of the two methods in Table 5 shows that our algorithm runs 6.5±8.6 times as fast as the traditional method. These facts demonstrate the po-tential of our algorithm to solve the inverse problems of more complicated non-linear aquifer models natu-rally and quickly on the basis of ®nding suitable forms of U _and C_{. The main reason of these facts}

is that the computational procedure of our algorithm is the same regardless of the linearity or non-linearity of the aquifer models with respect to h.

2. Unfortunately, our algorithm Failed when the initial values of T are uniform 1 m2_{/day. How to improve}

this by looking for more suitable forms of U _andC

or other techniques remains interesting open prob-lem. However, the incorporation of the Marquardt's method [7,12] in our algorithm can improve the cal-culation so that the iteration converge. But it takes more iterations and CPU times than those of the other three unfailed cases listed in the ®rst two rows in Table 5 where the Marquardt's method is not used. The detail will not be stated here due to space limitation.

Table 4

Comparison of the numerical results of our penalty method algorithm and of the traditional method by Becker and Yeh for an uncon®ned quasilinear aquifer modela

Methods Penalty method (kgU10UE; CDc2)

B & Y (traditional)

T1150 154.70 157.71

T2150 115.71 120.98

T350 53.099 53.425

T4150 128.80 133.71

T550 54.394 53.967

T615 15.758 15.796

T750 49.628 58.607

T815 14.923 15.119

T95 5.0132 5.0164

Statistic data oflogT

EDP )9.5310ÿ3 4.0710ÿ3

rDP 2.1110ÿ3 _2.08₁₀ÿ3 SSEDP 2.2010ÿ3 _2.10₁₀ÿ3 a_{The initial values of}

Tcan be any one of those in Table 5. The signs

EDP, rDP and SSEDP have the same meanings as those in Table 2.

Table 5

Comparison of the computational cost of our algorithm and of the traditional method of Becker and Yeh for an uncon®ned quasilinear aquifer modela

Method Uniform initial transmissivityT(m2_/day)

1.0 10.0 100.0 200.0

Penalty (kgU10UE;CDc2)

CPU time (s) 1 22.0 31.0 35.0

B & Y (traditional)

CPU time (s) 244.0 190.0 201.0 230.0

BY 16 13 14 16

a

(13)

4. Conclusion

The least-squares penalty method objective function described in [2] is generalized by scaling the discrete governing equation residuals with a scaling matrix U

determined by the current coecient matrix of the dis-crete governing equations adaptively. The utilities of three dierent simple forms of U are shown and compared.

The generalized Gauss±Newton method [11,12,21] is used to minimize the generalized penalty method objective function. Every Gauss±Newton iteration is formulated by a linear least-squares system whose unknowns are the corrections of both the parameters and the groundwater heads on all the discrete nodes. An iterative LSQR algorithm for sparse linear least-squares problems [17,18] is chosen to solve this least-squares system. Due to the sparse iterative data structure of the LSQR algorithm, the size of the discrete governing equations can be as large as that of the traditional method. A simple but ecient preconditioner C is proposed to improve the condition number of the least-squares system.

Numerical results show that even simple and natural diagonal forms ofU and C have signi®cant eects on: (1) widening the range of the initial values of the un-known parameters within which the minimizing itera-tion can converge; (2) reducing the computaitera-tional cost in every Gauss±Newton iteration; (3) improving the ir-regular distribution of the discrete governing equation residuals on the nodes.

The reducing trend of computational cost with the improvement of U _and C _{and the suciently wide}

range of the workable values of kg shown in

nu-merical tests demonstrate adequate potential of our algorithm to perform better and faster than the tra-ditional method on the basis of ®nding suitable forms of U and C. Moreover, because the computation procedure of our algorithm is the same regardless of the linearity or non-linearity of the aquifer models with respect to h, numerical example shows that it can solve the inverse problem of a quasilinear Boussinesq aquifer model almost as fast as that of linear ones. This evidence indicates some promise of our algorithm to solve inverse problems of more complicated non-linear aquifer models naturally and quickly.

All the above mentioned potential of our algorithm are ultimately dependent upon the ®ndings of more suitable forms of the scaling matrix U, of the precon-dition matrix C and of the weighting matrixW_g. How to look for them according to the respective distribu-tions of the discrete governing equation residuals and the discrete aquifer model errors on dierent nodes, to the data structure of A_{, remains interesting open}

problems.

Acknowledgements

Part of this work was done when the author vis-ited the Institute of Mathematics, Computer Science Department of the Federal Armed Forces University in Munich and Chair of Applied Mathematics, Technical University of Munich. The author thanks Prof. Dr. Ulrich Hornung, Prof. Dr. K.-H. Homann and Dr. T. Lohmann for their instruction, Dr. Ste-phan, Dr. Youcef, Miss Christa and Dr. Evelina for their enthusiastic help. The author is also grateful to the public ``Meschach Library'' which is available over Internet/AARnet via netlib, or at the anony-mous ftp site thrain.anu.edu.au in the directory pub/ meschach. This library provides us a complete set of C routines for performing calculations on matrices and vectors. This work was partly supported by the fund of ``The research of the groundwater in An-shan'' awarded by the science and technology com-mittee of Anshan City. Many thanks to Prof. M.B. Allen and other anonymous referees for their instructions.

References

[1] Bear J. Hydraulics of groundwater. New York: McGraw-Hill, 1979.

[2] Bentley LR. Least squares solution and calibration of steady-state groundwater ¯ow systems. Adv Wat Res 1993;14:137±48. [3] Becker L, Yeh WW-G. Identi®cation of parameters in unsteady

open-channel ¯ows. Wat Resour Res 1972;8:956±65.

[4] Carrera J, Neuman SP. Estimation of aquifer parameters under transient and steady-state conditions: 1. Maximum likelihood method incorporating prior information. Wat Resour Res 1986;22:199±210.

[5] Carrera J, Neuman SP. Estimation of aquifer parameters under transient and steady-state conditions: 2. Uniqueness, stability, and solution algorithm. Wat Resour Res 1986;22:211±27.

[6] Carrera J, Neuman SP. Estimation of aquifer parameters under transient and steady-state conditions: 3. Application to synthetic and ®eld data. Wat Resour Res 1986;22:228±42.

[7] Charles LL, Richard JH. Solving least squares problems. Phila-delphia, PA: SIAM, 1995. p. 337.

[8] Chavent G. On the theory and practice of non-linear least-squares. Adv Wat Res 1991;14:55±63.

[9] Cooley RL. Incorporation of prior information on parameters into non-linear regression groundwater ¯ow models, 1. Theory. Wat Resour Res 1982;18:965±76.

[10] Golub GH, van Loan CF. Matrix computations. Baltimore, MD: Johns Hopkins University Press, 1983. p. 476.

[11] Heinkenschloss M, Sachs EW. The role of growth rates for Gauss±Newton methods and parameter identi®cation problems. In: IFAC Symposium on Control of Distributed Parameter Systems, Perpignan, June 1989.

[12] Kool JB, Parker JC. Analysis of the inverse problem for transient unsaturated ¯ow. Wat Resour Res 1988;24:817±30.

(14)

[14] Lu AH, Schmittroth F, Yeh WW-G. Sequential estimation of aquifer parameters. Wat Resour Res 1988;24:670±82.

[15] McLaughlin D, Townley LR. A reassessment of the ground water inverse problem. Wat Resour Res 1996;32:1131±96.

[16] Neumann SP, Yakowitz S. A statistical approach to the inverse problem of aquifer hydrology, 1. Theory. Wat Resour Res 1979;15:845±60.

[17] Paige CC, Saunders MA. LSQR: an algorithm for sparse linear equations and sparse least-squares. ACM Trans Math Software 1982;8:43±71.

[18] Paige CC, Saunders MA. LSQR: sparse linear equations and least-squares problems. ACM Trans Math Software 1982;8:195± 209.

[19] Pinder GF, Gray WG. Finite element simulation in surface and subsurface hydrology. New York: Academic Press, 1977. [20] Wilkinson J. The algebraic eigenvalue problem. London: Oxford

University Press, 1965.