
ON THE SMOOTHING PARAMETER IN CASE OF DATA

FROM MULTIPLE SOURCES

by

Breaz Nicoleta

Abstract. In this paper we focus on data smoothing by spline functions. The smoothing parameter λ involved in this kind of modeling is obtained here by the generalized cross validation (GCV) procedure. For data from two sources with different weights, a GCV formula for the parameter λ is already known in the particular case when the smoothing function is a function on the circle. We extend this formula to a more general setting and, at the same time, to more than two sources.

1. INTRODUCTION

We consider a regression model with n observational data,

y_i = f(x_i) + ε_i,  i = 1,...,n,

where x_i ∈ [0,1] and ε = (ε_1, ε_2,..., ε_n)′ ~ N(0, σ²I) is a Gaussian n-dimensional vector with zero mean and covariance matrix σ²I. About the regression function we know only that f lies in some space W_m defined as

W_m = { f : f, f′,..., f^(m-1) absolutely continuous, f^(m) ∈ L²[0,1] }.

Then we can speak about a spline smoothing problem: we obtain an estimate of f by finding f_λ ∈ W_m to minimize

(1/n) Σ_{i=1}^n ( y_i − f(x_i) )² + λ ∫_0^1 ( f^(m)(u) )² du,  λ > 0. (1)

It is known that the solution of this problem is the natural polynomial spline of degree 2m−1 with knots x_i, i = 1,...,n.

For the beginning we consider the smoothing parameter λ fixed. Then we can search for a solution of (1) in a certain subspace of W_m, spanned by n appropriately chosen basis functions. According to [2], such basis functions are related to B-splines, but here we are not interested in their construction. We just consider f of the form

f = Σ_{k=1}^n c_k B_k. (2)

Now we rewrite the problem in matrix form using the following notations:

y = ( y_1, y_2,..., y_n )′,  c = ( c_1, c_2,..., c_n )′,  B = ( B_ij )_{1≤i≤n, 1≤j≤n},  B_ij = B_j(x_i).

Also, from [2] we know that we can write the seminorm

J_m(f) = ∫_0^1 ( f^(m)(u) )² du

in matrix form as c′Σc for some matrix Σ. The problem is then to find c minimizing (1/n)‖y − Bc‖² + λc′Σc, and its solution is given as c = ( B′B + nλΣ )^{-1} B′y. When the smoothing parameter λ is too small, we obtain a function f that stays close to the data regardless of its smoothness, and when λ is too big, we obtain a function f that is very smooth but not sufficiently close to the data.

Among the methods which provide an optimal λ from the data are the cross validation (CV) and generalized cross validation (GCV) procedures described in [2].

According to the CV method, λ is the minimizer of the expression

CV(λ) = (1/n) Σ_{k=1}^n ( y_k − f_λ^[k](x_k) )²,

with f_λ^[k] the spline estimate using all data but the k-th data point of y. As a generalization, the GCV procedure uses the more general function

GCV(λ) = [ (1/n) ‖( I − A(λ) )y‖² ] / [ (1/n) Tr( I − A(λ) ) ]²,

where A(λ) is the influence (hat) matrix given by the relation

( f_λ(x_1),..., f_λ(x_n) )′ = A(λ) y. (3)

From [2] we know that A(λ) has the form A(λ) = B( B′B + nλΣ )^{-1} B′. Also we recall that an estimate for σ² is given by

σ̂² = ‖( I − A(λ) )y‖² / Tr( I − A(λ) ).
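As a concrete numerical illustration (not part of the paper), the quantities A(λ), GCV(λ) and σ̂² can be computed directly. The sketch below uses a hypothetical cubic truncated-power basis and a simplified diagonal penalty matrix as stand-ins for the B-spline-related basis and the matrix Σ of [2]:

```python
import numpy as np

# Toy data: n noisy observations of f(x) = sin(2*pi*x) on [0, 1].
rng = np.random.default_rng(0)
n = 50
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(n)

# Hypothetical basis: cubic truncated powers (a stand-in for the
# B-spline-related basis of [2]); Sigma penalizes only the truncated-power
# coefficients, a rough surrogate for the seminorm J_m.
knots = np.linspace(0.05, 0.95, 20)
B = np.column_stack([np.ones(n), x, x**2, x**3] +
                    [np.clip(x - t, 0.0, None) ** 3 for t in knots])
Sigma = np.diag([0.0] * 4 + [1.0] * len(knots))

def hat_matrix(lam):
    # A(lambda) = B (B'B + n*lambda*Sigma)^{-1} B'
    return B @ np.linalg.solve(B.T @ B + n * lam * Sigma, B.T)

def gcv(lam):
    # GCV(lambda) = (1/n)||(I - A)y||^2 / ((1/n) Tr(I - A))^2
    R = np.eye(n) - hat_matrix(lam)
    return (np.sum((R @ y) ** 2) / n) / (np.trace(R) / n) ** 2

# Grid search for the GCV-optimal smoothing parameter.
lams = 10.0 ** np.linspace(-8, 0, 81)
lam_opt = min(lams, key=gcv)

# Estimate of sigma^2 at the selected lambda.
R = np.eye(n) - hat_matrix(lam_opt)
sigma2_hat = np.sum((R @ y) ** 2) / np.trace(R)
print(lam_opt, sigma2_hat)
```

The same grid-search pattern applies to any basis/penalty pair; only B and Sigma change.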

In the problem formulated here the data come from a single source. In [1], Feng Gao formulates this problem for two sources and gives a similar GCV method for λ, considering the particular case when f is a function on the circle. In this paper we extend Gao's results to more than two sources and, at the same time, consider a more general domain for the function f.

2. MAIN RESULTS

We consider the case when the data come from l sources with unknown weights, as in

y_{1i} = f(x_{1i}) + ε_{1i},  i = 1,...,N_1,
y_{2i} = f(x_{2i}) + ε_{2i},  i = 1,...,N_2,
...
y_{li} = f(x_{li}) + ε_{li},  i = 1,...,N_l,

with N_1 + N_2 + ... + N_l = n,

with residual Gaussian vectors defined as

ε_1 = ( ε_{11}, ε_{12},..., ε_{1N_1} )′ ~ N( 0, σ_1² I_{N_1} ),
ε_2 = ( ε_{21}, ε_{22},..., ε_{2N_2} )′ ~ N( 0, σ_2² I_{N_2} ),
...
ε_l = ( ε_{l1}, ε_{l2},..., ε_{lN_l} )′ ~ N( 0, σ_l² I_{N_l} ).

We assume that σ_1², σ_2²,..., σ_l² are unknown and ε_1, ε_2,..., ε_l are independent.

The estimate of f is now the minimizer over f ∈ W_m of

(1/n)[ (1/σ_1²) Σ_{i=1}^{N_1} ( y_{1i} − f(x_{1i}) )² + (1/σ_2²) Σ_{i=1}^{N_2} ( y_{2i} − f(x_{2i}) )² + ... + (1/σ_l²) Σ_{i=1}^{N_l} ( y_{li} − f(x_{li}) )² ] + λ J_m(f). (4)

We search for f ∈ W_m of the form

f = Σ_{k=1}^n c_k B_k

and we use the notations

y_1 = ( y_{11}, y_{12},..., y_{1N_1} )′,
y_2 = ( y_{21}, y_{22},..., y_{2N_2} )′,
...
y_l = ( y_{l1}, y_{l2},..., y_{lN_l} )′,
c = ( c_1, c_2,..., c_n )′,

B_1 = ( B¹_ij )_{1≤i≤N_1, 1≤j≤n},  B¹_ij = B_j(x_{1i}),
B_2 = ( B²_ij )_{1≤i≤N_2, 1≤j≤n},  B²_ij = B_j(x_{2i}),
...
B_l = ( Bˡ_ij )_{1≤i≤N_l, 1≤j≤n},  Bˡ_ij = B_j(x_{li}).

Now we have l matrices B_1, B_2,..., B_l. The model becomes

y_1 = B_1c + ε_1,
y_2 = B_2c + ε_2,
...
y_l = B_lc + ε_l,  ε_i ~ N( 0, σ_i² I_{N_i} ),  i = 1,...,l,

and the variational problem is now equivalent to finding c, the minimizer of

(1/(nθ)) [ r_1‖y_1 − B_1c‖² + r_2‖y_2 − B_2c‖² + ... + r_l‖y_l − B_lc‖² + α c′Σc ], (5)

where

θ = ( σ_1σ_2...σ_l )^{2/l},  r_i = θ/σ_i², i = 1,...,l  (so that r_1r_2...r_l = 1),  α = nλθ.

We can prove the following proposition:

Proposition 1. For fixed r = ( r_1, r_2,..., r_l ) and α, the solution of the variational problem (5) is

ĉ_{r,α} = ( r_1B_1′B_1 + r_2B_2′B_2 + ... + r_lB_l′B_l + αΣ )^{-1} ( r_1B_1′y_1 + r_2B_2′y_2 + ... + r_lB_l′y_l ). (6)

Proof. We denote by E the expression to be minimized. The condition ∂E/∂c = 0 is equivalent to

r_1B_1′y_1 − r_1B_1′B_1c + ... + r_lB_l′y_l − r_lB_l′B_lc − αΣc = 0,

and the solution of this matrix equation is

c = ( r_1B_1′B_1 + r_2B_2′B_2 + ... + r_lB_l′B_l + αΣ )^{-1} ( r_1B_1′y_1 + r_2B_2′y_2 + ... + r_lB_l′y_l ).
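A quick numerical sanity check of Proposition 1 (illustrative only; the matrices B_i, data y_i and penalty Σ below are arbitrary stand-ins): the closed form for ĉ coincides with the solution of the equivalent stacked weighted least-squares system.

```python
import numpy as np

# Random stand-in inputs for l = 3 sources.
rng = np.random.default_rng(1)
l, nb = 3, 6                      # l sources, nb basis functions
N = [8, 5, 7]                     # N_1, ..., N_l
Bs = [rng.standard_normal((Ni, nb)) for Ni in N]
ys = [rng.standard_normal(Ni) for Ni in N]
r = np.array([2.0, 0.5, 1.0])     # weights (here r_1 * r_2 * r_3 = 1)
alpha = 0.7
M = rng.standard_normal((nb, nb))
Sigma = M @ M.T                   # symmetric positive definite penalty

# Closed form of Proposition 1:
# c = (sum_i r_i B_i'B_i + alpha*Sigma)^{-1} sum_i r_i B_i'y_i
lhs = sum(ri * Bi.T @ Bi for ri, Bi in zip(r, Bs)) + alpha * Sigma
rhs = sum(ri * Bi.T @ yi for ri, Bi, yi in zip(r, Bs, ys))
c_hat = np.linalg.solve(lhs, rhs)

# Same minimizer via the stacked weighted system with rows of B_i and y_i
# scaled by sqrt(r_i).
y_r = np.concatenate([np.sqrt(ri) * yi for ri, yi in zip(r, ys)])
B_r = np.vstack([np.sqrt(ri) * Bi for ri, Bi in zip(r, Bs)])
c_stacked = np.linalg.solve(B_r.T @ B_r + alpha * Sigma, B_r.T @ y_r)

print(np.allclose(c_hat, c_stacked))  # the two routes agree
```

The agreement holds because B_r′B_r = Σ_i r_iB_i′B_i and B_r′y_r = Σ_i r_iB_i′y_i.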

According to formula (6), it is important to obtain estimates for r = ( r_1, r_2,..., r_l ) and α. Using methods similar to those of [1], we give a GCV function that can be used to estimate r and α at the same time. We introduce the notations

y = ( y_1′, y_2′,..., y_l′ )′,  B = ( B_1′, B_2′,..., B_l′ )′

and the block-diagonal matrix

I(r) = diag( (1/r_1) I_{N_1}, (1/r_2) I_{N_2},..., (1/r_l) I_{N_l} ),

where, by the constraint r_1r_2...r_l = 1, the last block can also be written as r_1r_2...r_{l-1} I_{N_l}.

Also we use the notations

M_r = r_1B_1′B_1 + r_2B_2′B_2 + ... + r_lB_l′B_l + αΣ,
y_r = I^{-1/2}(r)·y,
B_r = I^{-1/2}(r)·B.

We can prove the following proposition:

Proposition 2. The influence matrix A_r(r,α) defined by

ŷ_r = A_r(r,α) y_r (7)

has the form

A_r(r,α) =
[ r_1 B_1M_r^{-1}B_1′          √(r_1r_2) B_1M_r^{-1}B_2′   ...  √(r_1r_l) B_1M_r^{-1}B_l′ ]
[ √(r_2r_1) B_2M_r^{-1}B_1′    r_2 B_2M_r^{-1}B_2′         ...  √(r_2r_l) B_2M_r^{-1}B_l′ ]
[ ...                          ...                          ...  ...                       ]
[ √(r_lr_1) B_lM_r^{-1}B_1′    √(r_lr_2) B_lM_r^{-1}B_2′   ...  r_l B_lM_r^{-1}B_l′       ]

or, compactly, A_r(r,α) = B_r M_r^{-1} B_r′.

Proof. From Proposition 1, ŷ_r = B_r ĉ_{r,α} = B_r M_r^{-1} B_r′ y_r = A_r(r,α) y_r, and further A_r(r,α) = B_r M_r^{-1} B_r′.

In order to get a GCV formula for r and α, we first construct a CV-like formula for our case. We write c_{r,α} for the full-data estimate ĉ_{r,α} of Proposition 1 and denote by c_{r,α}^[k] the spline estimate of c using all but the k-th data point of y_r; also we denote by y_{k,r} the k-th data point of y_r and by B_{k,r} the k-th row of B_r. Note that y_{k,r} = √(r_{s_k}) y_k, where y_k is the k-th data point of y and s_k is the index of the source containing the k-th data point: s_k = 1 if k ∈ {1,...,N_1}, s_k = 2 if k ∈ {N_1+1,...,N_1+N_2}, and so on, up to s_k = l if k ∈ {N_1+...+N_{l-1}+1,...,n}.

Then the cross validation (CV) function for our model would be

CV(r,α) = (1/n) Σ_{k=1}^n ( y_{k,r} − B_{k,r} c_{r,α}^[k] )².

Next we prove the corresponding leaving-out-one lemma for our case, that is:

Lemma 3. Let h_{r,α}[k, z] be the solution to the variational problem (5) with the k-th data point y_{k,r} (the k-th entry of y_r) replaced by z. Then we have

h_{r,α}[ k, B_{k,r} c_{r,α}^[k] ] = c_{r,α}^[k].

Proof. Denote E_k(c, z) = (1/(nθ))[ Σ_{i≠k} ( y_{i,r} − B_{i,r}c )² + ( z − B_{k,r}c )² + α c′Σc ]. For every c ∈ R^n we have

E_k( c_{r,α}^[k], B_{k,r}c_{r,α}^[k] ) = (1/(nθ))[ Σ_{i≠k} ( y_{i,r} − B_{i,r}c_{r,α}^[k] )² + α c_{r,α}^[k]′ Σ c_{r,α}^[k] ]
≤ (1/(nθ))[ Σ_{i≠k} ( y_{i,r} − B_{i,r}c )² + α c′Σc ]
≤ E_k( c, B_{k,r}c_{r,α}^[k] ),

where the first inequality holds because c_{r,α}^[k] minimizes the delete-k problem, and the second because the added term is nonnegative. Hence c_{r,α}^[k] minimizes the problem with y_{k,r} replaced by z = B_{k,r}c_{r,α}^[k], that is, h_{r,α}[k, B_{k,r}c_{r,α}^[k]] = c_{r,α}^[k].

Based on the leaving-out-one lemma we can now prove the following theorem:

Theorem 4. We have the identity

CV(r,α) = (1/n) Σ_{k=1}^n ( [ ( I − A_r(r,α) ) I^{-1/2}(r) y ]_k )² / ( 1 − a_{kk} )²,

where a_{kk} is the kk-th entry of the matrix A_r(r,α) and [·]_k denotes the k-th component.

Proof. We consider the following identity:

y_{k,r} − B_{k,r}c_{r,α} = ( y_{k,r} − B_{k,r}c_{r,α}^[k] ) − ( B_{k,r}c_{r,α} − B_{k,r}c_{r,α}^[k] ). (8)

By Lemma 3, c_{r,α}^[k] = h_{r,α}[k, B_{k,r}c_{r,α}^[k]], while c_{r,α} = h_{r,α}[k, y_{k,r}]. Because B_{k,r}c_{r,α} is linear in each data point (the solution of Proposition 1 is linear in y_r), we can replace the divided difference below by a derivative:

( B_{k,r}c_{r,α} − B_{k,r}c_{r,α}^[k] ) / ( y_{k,r} − B_{k,r}c_{r,α}^[k] )
= ( B_{k,r}h_{r,α}[k, y_{k,r}] − B_{k,r}h_{r,α}[k, B_{k,r}c_{r,α}^[k]] ) / ( y_{k,r} − B_{k,r}c_{r,α}^[k] )
= ∂( B_{k,r}c_{r,α} ) / ∂y_{k,r}. (9)

Moreover, we have

∂( B_{k,r}c_{r,α} ) / ∂y_{k,r} = a_{kk}, (10)

the kk-th entry of the matrix A_r(r,α). So, according to (8), (9) and (10), we can write

y_{k,r} − B_{k,r}c_{r,α}^[k] = ( y_{k,r} − B_{k,r}c_{r,α} ) / ( 1 − a_{kk} ).

Moreover, y_{k,r} − B_{k,r}c_{r,α} = [ ( I − A_r(r,α) ) y_r ]_k = [ ( I − A_r(r,α) ) I^{-1/2}(r) y ]_k, and summing the squared terms gives the identity of the theorem.

Replacing each factor 1 − a_{kk} in CV(r,α) by its average value (1/n) Tr( I − A_r(r,α) ), we obtain a GCV-like function

GCV(r,α) = [ (1/n) ‖( I − A_r(r,α) ) I^{-1/2}(r) y‖² ] / [ (1/n) Tr( I − A_r(r,α) ) ]².

This function is a generalization of the GCV(λ) function from [2] (one source) and of the GCV function from [1] (two sources).
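A minimal computational sketch (not from the paper) of how GCV(r,α) might be minimized in practice for l = 2 sources, parameterizing r = (ρ, 1/ρ) so that the constraint r_1r_2 = 1 holds automatically; the basis, penalty and search grid are illustrative assumptions:

```python
import numpy as np

# Two sources observing the same f with different noise levels.
rng = np.random.default_rng(2)
nb = 8
N = [30, 20]
f = lambda x: np.sin(2 * np.pi * x)
xs = [np.sort(rng.uniform(0, 1, Ni)) for Ni in N]
ys = [f(x) + s * rng.standard_normal(len(x))
      for x, s in zip(xs, [0.1, 0.4])]          # sigma_1 < sigma_2

# Stand-in basis and diagonal penalty (hypothetical, for illustration).
basis = lambda x: np.column_stack([x ** j for j in range(4)] +
                                  [np.clip(x - t, 0, None) ** 3
                                   for t in np.linspace(0.2, 0.8, nb - 4)])
Bs = [basis(x) for x in xs]
Sigma = np.diag([0.0] * 4 + [1.0] * (nb - 4))
n = sum(N)

def gcv(rho, alpha):
    # r = (rho, 1/rho) enforces r_1 * r_2 = 1.
    r = np.array([rho, 1.0 / rho])
    y_r = np.concatenate([np.sqrt(ri) * yi for ri, yi in zip(r, ys)])
    B_r = np.vstack([np.sqrt(ri) * Bi for ri, Bi in zip(r, Bs)])
    # A_r = B_r (B_r'B_r + alpha*Sigma)^{-1} B_r'
    A = B_r @ np.linalg.solve(B_r.T @ B_r + alpha * Sigma, B_r.T)
    resid = y_r - A @ y_r
    return (resid @ resid / n) / (np.trace(np.eye(n) - A) / n) ** 2

# Joint grid search over (rho, alpha).
grid = [(rho, alpha) for rho in 10.0 ** np.linspace(-2, 2, 17)
        for alpha in 10.0 ** np.linspace(-6, 2, 17)]
rho_opt, alpha_opt = min(grid, key=lambda p: gcv(*p))
print(rho_opt, alpha_opt)
```

On data like this, where source 1 is less noisy than source 2, one would expect the selected ρ = r_1 to exceed 1, i.e. the cleaner source gets the larger weight.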

REFERENCES

[1] Gao, Feng, On combining data from multiple sources with unknown relative weights, Technical Report no. 894, University of Wisconsin, Madison, 1993.

[2] Wahba, Grace, Spline Models for Observational Data, CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 59, SIAM, Philadelphia, 1990.
