Partially Observed Functional Data Without Measurement Error

3 Estimation Methods

3.1 Partially Observed Functional Data Without Measurement Error

In the scenario that functional curves are partially observed on the domain without measurement error, to get an estimator ofγ in model (1), we need to get estimators ofU_ij andφ_j pertaining to this case. An estimator ofU_ij is obtained based on the linear functional of the observed partX_iO_i, and an estimator ofφ_j is obtained by giving estimators of mean and covariance function ofX. The steps are given here.

Step 1: Estimate the meanμand the covariance functionc_Xby sample mean and sample covariance.

Step 2: Estimate eigenvalues {λ_j}and eigenfunctions {φ_j}by

Tcˆ_X(s, t )φˆ_j(s) ds= ˆλ_jφˆ_j(t ).

Step 3: Estimate principal component scoresU_ij =U_{ij O}_i+U_{ij M}_i withUˆ_{ij O}_i = XiO_i− ˆμO_i,φˆj O_i, and estimateUij M_i by modeling it as linear functionals of X_iO_i given asUˆ_{ij M}_i = ˆξ_{ij M}_i, X_iO_i− ˆμ_O_i.

Step 4: Estimate γ based on formulas (2) and (3) for X_iO_i observed without measurement error.

We first address the problem of getting estimators of μ and c_X, denoted as ˆ

μ^NME and cˆ^NME_X , respectively, followed by establishing estimators of U_ij and eigenfunctions φ_j which are denoted as Uˆ_ij^NME and φˆ_j^NME. For simplicity of presentation, we suppress the notation on “NME” in this subsection unless otherwise stated.

Let O_i(t ) = IOi(t ) with indicator function IOi(t ) being 1 if t ∈ O_i, and 0 otherwise, and letW_i(s, t ) =O_i(s)O_i(t ). The estimators of the mean functionμ and the covariance functionc_XofXobtained from the observed pointss, tofX_i, are given by,

μ(t )= 1

#_n

i=1O_i(t ) n i=1

O_i(t )X_i(t ), (4)

c_X(s, t )= 1

#_n

i=1W_i(s, t ) n i=1

W_i(s, t )(X_i(s)− ˆμ(s))(X_i(t )− ˆμ(t )). (5)

Therefore, we get the estimators {ˆλ_j}, { ˆφ_j} related to {λ_j} and {φ_j} from cˆ_X associated with the covariance operatorCˆX.

We could not get estimators Uˆij of FPC scores {Uij}of Xi directly from its definition ifOi =T. To bridge the gap,Uij is decomposed into two parts:

Uij = XiO_i−μO_i, φj O_i + XiM_i−μM_i, φj M_i =Uij O_i+Uij M_i, (6) whereμO_i andφj O_i denote the restriction ofμand the eigenfunction φj onOi, respectively, and the definitions ofμ_M_i,φ_{j M}_i are similar. The estimatorUˆ_{ij O}_i of U_{ij O}_i can be estimated directly from the observed partX_iO_i and the estimatorφˆ_j, given asUˆij O_i = XiO_i − ˆμiO_i,φˆj O_i. For the termUij M_i, we consider using the linear functional formξij M_i, XiO_i−μO_iof the observed partXiO_i to estimate it which is also considered in Kraus (2015), that is,

ξˆ_{ij M}_i =argmin

ξ_{ij Mi}∈L²

n⁻¹ n

i=1

(Uˆ_{ij M}_i− ξ_{ij M}_i, X_iO_i − ˆμ_iO_i)²

with Uˆij M_i = XiM_i − ˆμM_i,φˆj M_i. The estimator ξˆij M_i has the explicit form:

ξˆ_{ij M}_i = ˆC_O⁻¹

iO_iCˆ_O_i_M_iφˆ_{j M}_i, where Cˆ_O_i_O_i, Cˆ_O_i_M_i are the empirical covariance operator forC_O_i_O_i,C_O_i_M_i with the kernel being the covariance functioncˆ_XofX_i restricted toO_i ×O_i andO_i ×M_i, respectively. To obtain a stable solution, we adopt ridge regularization, given by

ξˆ_{ij M}^(ρ)

i =(Cˆ^(ρ)_O

iO_i)⁻¹CˆO_iM_iφˆj M_i, Uˆ_{ij M}^(ρ)

i = ˆξ_{ij M}^(ρ)

i, X_iO_i− ˆμ_iO_i, i=1,· · ·, n, j =1,· · · , m, (7) whereCˆ_O^(ρ)

iO_i = ˆC_O_i_O_i+ρFOi,FOi is an identity operator defined onL²(O_i),ρis a ridge parameter; see Kraus (2015) for further details. LetUˆ_ij^NME= ˆUij O_i+ ˆU_{ij M}^(ρ)

i. The estimatorγˆ^NMEofγusing all of the information of the dataset is then obtained through replacingUˆ_ij in (2) withUˆ_ij^NME,

γ^NME(t )= m j=1

γjφˆj(t ). (8)

To facilitate our theoretical analysis, we first impose some assumptions on observation points for partially observed functional curves, indicating the observation points asymptotically provide enough information in individual or pairwise crossover.

(A1) There existsδ1>0 s.t. sup

t∈[0,1]P{n⁻¹#n

i=1IO_i(t )≤δ1} =O(n⁻²).

(A2) There existsδ₂>0 s.t. sup

s,t∈[0,1]²P{n⁻¹#_n

i=1W_i(s, t )≤δ₂} =O(n⁻²).

Moreover, we also introduce some regularity conditions necessary to derive theoretical properties for the estimateγˆ^NME.

(A3) E||X−μ||⁴<∞. (A4) nm⁻¹→ ∞,n/(#_m

j=1δ_j⁻²)→ ∞withδ_j =minj≥1{λ_j−λ_j₊₁, λ_j₋₁−λ_j} andnλ²_m→ ∞asm→ ∞.

(A5) The ridge parameterρsatisfiesρ→0,nρ³→0,nm⁻¹ρ²→ ∞.

(A6) #_∞

k=1[E[Y U_k]]²/λ²_k <∞.

(A7) #_∞

j=1

#_∞

k=1 r²

Mi Oi jk

λ²

Oi Oi k

<∞,withr_M_i_O_i_{j k} =cov(X_M_i−μ_M_i, φ_M_i_M_i_j,X_M_i− μ_M_i, φ_O_i_O_i_k).

Assumption (A3) is a common condition in the analysis of functional model by using the method of FPCA to guarantee the random functions have finite fourth moment (see Cardot et al. 1999). Note that if the eigenvalues {λ_j} are exponentially or geometrically decreasing, the assumption (A4) holds. The same kind of conditions are also introduced in Cardot et al. (1999). Assumption (A5) is used to control the size of ridge effect. To define the convergence of the right hand of the formulaγ (s)=#_∞

k=1(E[Y U_k]/λ_k)φ_k(s), in theL²sense, assumption (A6) is required that is similar to the condition (A1) in Yao et al. (2005b). Assumption (A7) is used to make the solutionξˆ_{ij M}_i valid which is commonplace in the theory of inverse problems as Picard condition (see Hansen1990).

Letθ_n = #_∞

k=m[E[Y U_k]]²/λ²_k. Then assumption (A6) indicates thatθ_n → 0.

Denote υ = #m

j=1V_ij withV_ij = φ_{j M}_i, (C_M_i_M_i −C_M_i_O_iC_O⁻¹

iO_iC_O_i_M_i)φ_{j M}_i. Based on the above assumptions, Theorem 1 gives the converge rate for the estimatorγˆ^NMEin theL²sense.

Theorem 1 Suppose that (A1)–(A7) are satisfied. Then

ˆγ^{N ME}−γ ²=Op(n⁻¹mρ⁻²+ιn+θn+υ), withιn=n⁻¹#m

j=1δ⁻_j².

Theorem1indicates that the approximation error rate ofγˆ^NMEforγis controlled by four terms. The first term depends on sample sizen, tuning parameterm, ridge parameterρ, which is of the higher order than the one given in Hall and Horowitz (2007) that is mainly due to functional curves observed on the part of the domain.

The second term is related to the spacings between adjacent eigenvalues, and its effect on convergence rate ofγ is also emphasized in Hall and Horowitz (2007).

The third term is related to the convergence ofγ inL²sense, which is also shown in Yao et al. (2005b) to get approximation error rate for functional coefficient. The fourth term is introduced by approximatingU_{ij M}_i withU˜_{ij M}_i.

Note that in practice, the ridge parameterρincluded in the regularized estimation of thejth score of theith functional observation is chosen by generalized cross- validation based on the set of samples observed on the entire domain (see Kraus 2015).

3.2 Partially Observed Functional Data with Measurement

Dalam dokumen (ICSA Book Series in Statistics) Wenqing He, Liqun Wang, Jiahua Chen, Chunfang Devon Lin - Advances and Innovations in Statistics and Data Science-Springer (2022) (Halaman 153-156)