arXiv:1404.7080v1 [math.ST] 28 Apr 2014
A test for the equality of covariance operators
Graciela Boente, Daniela Rodriguez and Mariela Sued
Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires and CONICET, Argentina
e–mail: gboente@dm.uba.ar drodrig@dm.uba.ar msued@dm.uba.ar
Abstract
In many situations, when dealing with several populations, equality of the covariance operators is assumed. An important issue is to study if this assumption holds before making other inferences. In this paper, we develop a test for comparing covariance operators of several functional data samples. The proposed test is based on the squared norm of the difference between the estimated covariance operators of each population. We derive the asymptotic distribution of the test statistic under the null hypothesis and for the situation of two samples, under a set of contiguous alternatives related to the functional common principal component model. Since the null asymptotic distribution depends on parameters of the underlying distribution, we also propose a bootstrap test.
1 Introduction
In many applications, we study phenomena that are continuous in time or space and can be considered as smooth curves or functions. On the other hand, when working with more than one population, as in the finite dimensional case, the equality of the covariance operators associated with each population is often assumed. In the case of finite-dimensional data, tests for equality of covariance matrices have been extensively studied, see for example Seber (1984) and Gupta and Xu (2006). This problem has been considered even for high dimensional data, i.e., when the sample size is smaller than the number of variables under study; we refer among others to Ledoit and Wolf (2002) and Schott (2007).
Recently, Panaretos et al. (2010) considered the problem of testing whether two samples of continuous zero mean i.i.d. Gaussian processes share the same covariance structure.
In this paper, we go one step further and consider the functional setting. Our goal is to provide a test statistic to test the hypothesis that the covariance operators of several independent samples are equal in a fully functional setting. To fix ideas, we will first describe the two sample situation. Let us assume that we have two independent populations with covariance operators $\Gamma_1$ and $\Gamma_2$. Denote by $\widehat{\Gamma}_1$ and $\widehat{\Gamma}_2$ consistent estimators of $\Gamma_1$ and $\Gamma_2$, respectively, such as the sample covariance estimators studied in Dauxois et al. (1982). It is clear that under the standard null hypothesis $\Gamma_1 = \Gamma_2$, the difference between the covariance operator estimators should be small. For that reason, a test statistic based on the norm of $\widehat{\Gamma}_1-\widehat{\Gamma}_2$ may be helpful to study the hypothesis of equality.
The paper is organized as follows. Section 2 introduces the notation and reviews some basic concepts which are used in later sections. Section 3 introduces the test statistic for the two sample problem. Its asymptotic distribution under the null hypothesis is established in Section 3.1 while a bootstrap test is described in Section 3.2. An important issue is to describe the set of alternatives that the proposed statistic is able to detect. For that purpose, the asymptotic distribution under a set of contiguous alternatives based on the functional common principal component model is studied in Section 3.3. Finally, an extension to several populations is provided in Section 4. Proofs are relegated to the Appendix.
2 Preliminaries and notation
Let us consider independent random elements $X_1,\dots,X_k$ in a separable Hilbert space $\mathcal{H}$ (often $L^2(\mathcal{I})$) with inner product $\langle\cdot,\cdot\rangle$ and norm $\|u\|=\langle u,u\rangle^{1/2}$, and assume that $E\|X_i\|^2<\infty$. Denote by $\mu_i\in\mathcal{H}$ the mean of $X_i$, $\mu_i=E(X_i)$, and by $\Gamma_i:\mathcal{H}\to\mathcal{H}$ the covariance operator of $X_i$. Let $\otimes$ stand for the tensor product on $\mathcal{H}$, e.g., for $u,v\in\mathcal{H}$, the operator $u\otimes v:\mathcal{H}\to\mathcal{H}$ is defined as $(u\otimes v)w=\langle v,w\rangle u$. With this notation, the covariance operator $\Gamma_i$ can be written as $\Gamma_i=E\{(X_i-\mu_i)\otimes(X_i-\mu_i)\}$. The operator $\Gamma_i$ is linear, self-adjoint and continuous.
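In particular, if $\mathcal{H}=L^2(\mathcal{I})$ and $\langle u,v\rangle=\int_{\mathcal{I}}u(s)v(s)\,ds$, the covariance operator is defined through the covariance function of $X_i$, $\gamma_i(s,t)=\mathrm{cov}(X_i(s),X_i(t))$, $s,t\in\mathcal{I}$, as $(\Gamma_i u)(t)=\int_{\mathcal{I}}\gamma_i(s,t)u(s)\,ds$. It is usually assumed that $\|\gamma_i\|^2=\int_{\mathcal{I}}\int_{\mathcal{I}}\gamma_i^2(t,s)\,dt\,ds<\infty$; hence, $\Gamma_i$ is a Hilbert–Schmidt operator. Being self-adjoint and Hilbert–Schmidt, $\Gamma_i$ has a countable number of eigenvalues, all of them being real.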
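Let $\mathcal{F}$ denote the Hilbert space of Hilbert–Schmidt operators with inner product defined by $\langle H_1,H_2\rangle_{\mathcal{F}}=\mathrm{trace}(H_1H_2)=\sum_{\ell=1}^{\infty}\langle H_1u_\ell,H_2u_\ell\rangle$ and norm $\|H\|_{\mathcal{F}}=\langle H,H\rangle_{\mathcal{F}}^{1/2}=\{\sum_{\ell=1}^{\infty}\|Hu_\ell\|^2\}^{1/2}$, where $\{u_\ell:\ell\ge 1\}$ is any orthonormal basis of $\mathcal{H}$, while $H_1$, $H_2$ and $H$ are Hilbert–Schmidt operators, i.e., such that $\|H\|_{\mathcal{F}}<\infty$. Choosing an orthonormal basis $\{\phi_{i,\ell}:\ell\ge 1\}$ of eigenfunctions of $\Gamma_i$ related to the eigenvalues $\{\lambda_{i,\ell}:\ell\ge 1\}$ such that $\lambda_{i,\ell}\ge\lambda_{i,\ell+1}$, we get $\|\Gamma_i\|_{\mathcal{F}}^2=\sum_{\ell=1}^{\infty}\lambda_{i,\ell}^2$. In particular, if $\mathcal{H}=L^2(\mathcal{I})$, we have $\|\Gamma_i\|_{\mathcal{F}}=\|\gamma_i\|$.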
Our goal is to test whether the covariance operators $\Gamma_i$ of the populations are equal or not. For that purpose, let us consider independent samples of each population, that is, let us assume that we have independent observations $X_{i,1},\dots,X_{i,n_i}$, $1\le i\le k$, with $X_{i,j}\sim X_i$. A natural way to estimate the covariance operators $\Gamma_i$, for $1\le i\le k$, is through their empirical versions. The sample covariance operator $\widehat{\Gamma}_i$ is defined as
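$$\widehat{\Gamma}_i=\frac{1}{n_i}\sum_{j=1}^{n_i}\left(X_{i,j}-\overline{X}_i\right)\otimes\left(X_{i,j}-\overline{X}_i\right),$$
where $\overline{X}_i=(1/n_i)\sum_{j=1}^{n_i}X_{i,j}$. Dauxois et al. (1982) obtained the asymptotic behaviour of $\widehat{\Gamma}_i$. In particular, they have shown that, when $E(\|X_{i,1}\|^4)<\infty$, $\sqrt{n_i}\,(\widehat{\Gamma}_i-\Gamma_i)$ converges in distribution to a zero mean Gaussian random element of $\mathcal{F}$, $U_i$, with covariance operator $\Upsilon_i$ given by
$$\Upsilon_i=\sum_{m,r,o,p}s_{im}s_{ir}s_{io}s_{ip}\,E[f_{im}f_{ir}f_{io}f_{ip}]\;\phi_{i,m}\otimes\phi_{i,r}\,\tilde{\otimes}\,\phi_{i,o}\otimes\phi_{i,p}-\sum_{m,r}\lambda_{i,m}\lambda_{i,r}\;\phi_{i,m}\otimes\phi_{i,m}\,\tilde{\otimes}\,\phi_{i,r}\otimes\phi_{i,r}\qquad(1)$$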
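where $\tilde{\otimes}$ stands for the tensor product in $\mathcal{F}$ and, as mentioned above, $\{\phi_{i,\ell}:\ell\ge 1\}$ is an orthonormal basis of eigenfunctions of $\Gamma_i$ with associated eigenvalues $\{\lambda_{i,\ell}:\ell\ge 1\}$ such that $\lambda_{i,\ell}\ge\lambda_{i,\ell+1}$. The coefficients $s_{im}$ are such that $s_{im}^2=\lambda_{i,m}$, while $f_{im}$ are the standardized coordinates of $X_i-\mu_i$ on the basis $\{\phi_{i,\ell}:\ell\ge 1\}$, that is, $f_{im}=\langle X_i-\mu_i,\phi_{i,m}\rangle/\lambda_{i,m}^{1/2}$. Note that $E(f_{im})=0$. Using that $\mathrm{cov}(\langle u,X_i-\mu_i\rangle,\langle v,X_i-\mu_i\rangle)=\langle u,\Gamma_iv\rangle$, we get that $E(f_{im}^2)=1$ and $E(f_{im}f_{is})=0$ for $m\ne s$. In particular, the Karhunen–Loève expansion leads to
$$X_i=\mu_i+\sum_{\ell=1}^{\infty}\lambda_{i,\ell}^{1/2}\,f_{i\ell}\,\phi_{i,\ell}.\qquad(2)$$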
It is worth noticing that $E\|U_i\|_{\mathcal{F}}^2<\infty$ so the sum of the eigenvalues of $\Upsilon_i$ is finite, implying that $\Upsilon_i$ is a linear operator over $\mathcal{F}$ which is Hilbert–Schmidt. Thus, any linear combination of the operators $\Upsilon_i$, $\Upsilon=\sum_{i=1}^{k}a_i\Upsilon_i$, with $a_i\ge 0$, will be a Hilbert–Schmidt operator. Therefore, if $\{\theta_\ell\}_{\ell\ge 1}$ stand for the eigenvalues of $\Upsilon$ ordered in decreasing order, $\theta_\ell\ge 0$ and $\sum_{\ell\ge 1}\theta_\ell<\infty$. This property will be used later in Theorem 3.1.
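To make the objects of this section concrete, the following minimal sketch discretizes curves on an equispaced grid of $[0,1]$ and approximates the sample covariance operator, its eigenvalues and its Hilbert–Schmidt norm. The grid, the Riemann-sum quadrature, the toy data and all function names are assumptions made for illustration; they are not part of the paper.

```python
import numpy as np

def sample_covariance_kernel(X):
    """Empirical covariance kernel on the grid; X has shape (n_curves, n_gridpoints)."""
    Xc = X - X.mean(axis=0)            # subtract the sample mean curve
    return Xc.T @ Xc / X.shape[0]      # divisor n_i, as in the sample covariance operator

def hs_norm(kernel, dt):
    """Riemann-sum approximation of the L2 norm of the kernel (Hilbert-Schmidt norm)."""
    return np.sqrt(np.sum(kernel ** 2)) * dt

def operator_eigenvalues(kernel, dt):
    """Eigenvalues of the discretized integral operator, in decreasing order."""
    return np.linalg.eigvalsh(kernel * dt)[::-1]

# Toy data (hypothetical): random combinations of two sine curves on [0, 1].
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)
dt = grid[1] - grid[0]
scores = rng.standard_normal((200, 2)) * np.array([2.0, 1.0])
basis = np.vstack([np.sin(np.pi * grid), np.sin(2 * np.pi * grid)])
X = scores @ basis

gamma_hat = sample_covariance_kernel(X)
lam_hat = operator_eigenvalues(gamma_hat, dt)
# The squared Hilbert-Schmidt norm should agree with the sum of squared eigenvalues.
print(hs_norm(gamma_hat, dt) ** 2, np.sum(lam_hat ** 2))
```

In this discretization the identity $\|\Gamma_i\|_{\mathcal{F}}^2=\sum_\ell\lambda_{i,\ell}^2$ holds exactly for the matrix `gamma_hat * dt`, which gives a quick sanity check of the implementation.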
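When $\mathcal{H}=L^2(\mathcal{I})$, smooth estimators, $\widehat{\Gamma}_{i,h}$, of the covariance operators were studied in Boente and Fraiman (2000). The smoothed operator is the operator induced by the smooth covariance function
$$\widehat{\gamma}_{i,h}(t,s)=\frac{1}{n_i}\sum_{j=1}^{n_i}\left(X_{i,j,h}(t)-\overline{X}_{i,h}(t)\right)\left(X_{i,j,h}(s)-\overline{X}_{i,h}(s)\right),$$
where $X_{i,j,h}(t)=\int_{\mathcal{I}}K_h(t-x)X_{i,j}(x)\,dx$ are the smoothed trajectories, $K_h(\cdot)=h^{-1}K(\cdot/h)$ is a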
3 Test statistics for two–sample problem
We first consider the problem of testing the hypothesis
$$H_0:\Gamma_1=\Gamma_2\quad\text{against}\quad H_1:\Gamma_1\ne\Gamma_2.\qquad(3)$$
A natural approach is to consider $\widehat{\Gamma}_i$ as the empirical covariance operators of each population and construct a statistic $T_n$ based on the difference between the covariance operator estimators, i.e., to define $T_n=n\|\widehat{\Gamma}_1-\widehat{\Gamma}_2\|_{\mathcal{F}}^2$, where $n=n_1+n_2$.
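The statistic $T_n$ can be sketched numerically as follows, assuming curves observed on a common equispaced grid and a Riemann-sum approximation of the Hilbert–Schmidt norm. The function names and the simulated samples are hypothetical and only illustrate the definition.

```python
import numpy as np

def cov_kernel(X):
    """Sample covariance kernel of curves stored as rows of X (divisor n_i)."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / X.shape[0]

def two_sample_statistic(X1, X2, dt):
    """T_n = n * ||Gamma1_hat - Gamma2_hat||_F^2 with n = n1 + n2 (Riemann-sum norm)."""
    n = X1.shape[0] + X2.shape[0]
    diff = cov_kernel(X1) - cov_kernel(X2)
    return n * np.sum(diff ** 2) * dt ** 2

# Hypothetical samples built from two basis curves on [0, 1].
rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 40)
dt = grid[1] - grid[0]
basis = np.vstack([np.sin(np.pi * grid), np.cos(np.pi * grid)])
X1 = rng.standard_normal((80, 2)) @ basis
X2 = rng.standard_normal((120, 2)) @ basis            # same covariance as X1
X3 = 2.0 * (rng.standard_normal((120, 2)) @ basis)    # covariance multiplied by 4

t_null = two_sample_statistic(X1, X2, dt)
t_alt = two_sample_statistic(X1, X3, dt)
print(t_null, t_alt)  # t_alt is expected to be much larger than t_null
```

Large values of the statistic point against $H_0$; how large is "large" depends on the null distribution studied in Section 3.1.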
3.1 The null asymptotic distribution of the test statistic
The following result allows us to study the asymptotic behaviour of $T_n=n\|\widehat{\Gamma}_1-\widehat{\Gamma}_2\|_{\mathcal{F}}^2$ when $\Gamma_1=\Gamma_2$ and thus, to construct a test for the hypothesis (3) of equality of covariance operators.
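Theorem 3.1. Let $X_{i,1},\dots,X_{i,n_i}$, for $i=1,2$, be independent observations from two independent samples in $\mathcal{H}$ with mean $\mu_i$ and covariance operator $\Gamma_i$. Let $n=n_1+n_2$ and assume also that $n_i/n\to\tau_i$ with $\tau_i\in(0,1)$. Let $\widetilde{\Gamma}_i$, $i=1,2$, be independent estimators of the $i$-th population covariance operator such that $\sqrt{n_i}\,(\widetilde{\Gamma}_i-\Gamma_i)\stackrel{D}{\longrightarrow}U_i$, with $U_i$ a zero mean Gaussian random element with covariance operator $\Upsilon_i$. Denote by $\{\theta_\ell\}_{\ell\ge 1}$ the eigenvalues of the operator $\Upsilon=\tau_1^{-1}\Upsilon_1+\tau_2^{-1}\Upsilon_2$ with $\sum_{\ell\ge 1}\theta_\ell<\infty$. Then,
$$n\,\|(\widetilde{\Gamma}_1-\Gamma_1)-(\widetilde{\Gamma}_2-\Gamma_2)\|_{\mathcal{F}}^2\stackrel{D}{\longrightarrow}\sum_{\ell\ge 1}\theta_\ell Z_\ell^2,\qquad(4)$$
where $Z_\ell$ are i.i.d. standard normal random variables. In particular, if $\Gamma_1=\Gamma_2$ we have that $n\|\widetilde{\Gamma}_1-\widetilde{\Gamma}_2\|_{\mathcal{F}}^2\stackrel{D}{\longrightarrow}\sum_{\ell\ge 1}\theta_\ell Z_\ell^2$.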
Remark 3.1.
a) The results in Theorem 3.1 apply, in particular, when considering the sample covariance operator, i.e., when $\widetilde{\Gamma}_i=\widehat{\Gamma}_i$. Indeed, when $E(\|X_{i,1}\|^4)<\infty$, $\sqrt{n_i}\,(\widehat{\Gamma}_i-\Gamma_i)$ converges in distribution to a zero mean Gaussian random element $U_i$ of $\mathcal{F}$ with covariance operator $\Upsilon_i$ given by (1). As mentioned in Section 2, the fact that $E(\|X_{i,1}\|^4)<\infty$ entails that $\sum_{\ell\ge 1}\theta_\ell<\infty$.
b) It is worth noting that if $q_n$ is a sequence of integers such that $q_n\to\infty$, the fact that $\sum_{\ell\ge 1}\theta_\ell<\infty$ implies that the sequence $U_n=\sum_{\ell=1}^{q_n}\theta_\ell Z_\ell^2$ is Cauchy in $L^2$ and therefore, the limit $U=\sum_{\ell\ge 1}\theta_\ell Z_\ell^2$ is well defined. In fact, arguments analogous to those considered in Neuhaus (1980) show that the series converges almost surely. Moreover, since $Z_1^2\sim\chi_1^2$, $U$ has a continuous distribution function $F_U$ and so $F_{U_n}$, the distribution functions of $U_n$, converge to $F_U$ uniformly, as shown in
Remark 3.2. Theorem 3.1 implies that, under the null hypothesis $H_0:\Gamma_1=\Gamma_2$, we have that $T_n=n\|\widehat{\Gamma}_1-\widehat{\Gamma}_2\|_{\mathcal{F}}^2\stackrel{D}{\longrightarrow}U=\sum_{\ell\ge 1}\theta_\ell Z_\ell^2$; hence, an asymptotic test rejecting for large values of $T_n$ allows to test $H_0$. To obtain the critical value, the distribution of $U$ and thus, the eigenvalues of $\tau_1^{-1}\Upsilon_1+\tau_2^{-1}\Upsilon_2$ need to be estimated. As mentioned in Remark 3.1, the distribution function of $U$ can be uniformly approximated by that of $U_n$ and so, the critical values can be approximated by the $(1-\alpha)$-percentile of $U_n$. Gupta and Xu (2006) provide an approximation for the distribution function of any finite mixture of independent $\chi_1^2$ random variables that can be used in the computation of the $(1-\alpha)$-percentile of $\sum_{\ell=1}^{q_n}\widehat{\theta}_\ell Z_\ell^2$, where $\widehat{\theta}_\ell$ are estimators of $\theta_\ell$. It is also worth noticing that under $H_0:\Gamma_1=\Gamma_2$, we have that for $i=1,2$, $\Upsilon_i$ given in (1) reduces to
$$\Upsilon_i=\sum_{m,r,o,p}s_ms_rs_os_p\,E[f_{im}f_{ir}f_{io}f_{ip}]\;\phi_m\otimes\phi_r\,\tilde{\otimes}\,\phi_o\otimes\phi_p-\sum_{m,r}\lambda_m\lambda_r\;\phi_m\otimes\phi_m\,\tilde{\otimes}\,\phi_r\otimes\phi_r,$$
where for the sake of simplicity we have eliminated the subscript 1 and simply denote $s_m=\lambda_m^{1/2}$, with $\lambda_m$ the $m$-th largest eigenvalue of $\Gamma_1$ and $\phi_m$ its corresponding eigenfunction. In particular, if all the populations have the same underlying distribution except for the mean and covariance operator, as it happens when comparing the covariance operators of Gaussian processes, the random variable $f_{2m}$ has the same distribution as $f_{1m}$ and so, $\Upsilon_1=\Upsilon_2$.
The previous comments motivate the use of bootstrap methods, due to the fact that the asymptotic distribution obtained in (4) depends on the unknown eigenvalues $\theta_\ell$. It is clear that when the underlying distribution of the process $X_i$ is assumed to be known, for instance, if both samples correspond to Gaussian processes differing only in their mean and covariance operators, a parametric bootstrap can be implemented. Indeed, denote by $G_{i,\mu_i,\Gamma_i}$ the distribution of $X_i$, where the parameters $\mu_i$ and $\Gamma_i$ are made explicit for later convenience. For each $1\le i\le k$, generate bootstrap samples $X_{i,j}^{\star}$, $1\le j\le n_i$, with distribution $G_{i,0,\widehat{\Gamma}_i}$. Note that the samples can be generated with mean 0 since our focus is on covariance operators. Besides, the sample covariance operator $\widehat{\Gamma}_i$ is a finite range operator, hence the Karhunen–Loève expansion (2) allows to generate $X_{i,j}^{\star}$ knowing the distribution of the random variables $f_{i\ell}$, the eigenfunctions $\widehat{\phi}_{i,\ell}$ of $\widehat{\Gamma}_i$ and its related eigenvalues $\widehat{\lambda}_{i,\ell}$, $1\le\ell\le n_i$, that is, the estimators of the first principal components of the process. Define $\widehat{\Gamma}_i^{\star}$ as the sample covariance operator of $X_{i,j}^{\star}$, $1\le j\le n_i$, and further, let $T_n^{\star}=n\|\widehat{\Gamma}_1^{\star}-\widehat{\Gamma}_2^{\star}\|_{\mathcal{F}}^2$. By replicating $N_{\mathrm{boot}}$ times, we obtain $N_{\mathrm{boot}}$ values of $T_n^{\star}$ that allow us to easily construct a bootstrap test.
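A minimal sketch of the resampling step just described, for the Gaussian case, is given below. It follows the description literally: each bootstrap sample is drawn from a zero mean Gaussian law whose covariance is the sample covariance of its own population, using the Karhunen–Loève expansion (2) with $N(0,1)$ scores. The grid discretization, the function names and the toy data are assumptions for illustration only.

```python
import numpy as np

def cov_kernel(X):
    """Sample covariance kernel of curves stored as rows of X (divisor n_i)."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / X.shape[0]

def hs_statistic(X1, X2, dt):
    """n * ||Gamma1_hat - Gamma2_hat||_F^2 with a Riemann-sum norm."""
    n = X1.shape[0] + X2.shape[0]
    return n * np.sum((cov_kernel(X1) - cov_kernel(X2)) ** 2) * dt ** 2

def gaussian_kl_sample(X, dt, rng):
    """Zero mean Gaussian curves whose covariance kernel is the sample kernel of X."""
    n = X.shape[0]
    lam, vec = np.linalg.eigh(cov_kernel(X) * dt)   # eigenpairs of the discretized operator
    lam = np.clip(lam, 0.0, None)                   # remove tiny negative round-off
    phi = vec / np.sqrt(dt)                         # eigenfunctions with unit L2 norm
    f = rng.standard_normal((n, lam.size))          # Gaussian scores f_{il} ~ N(0, 1)
    return (f * np.sqrt(lam)) @ phi.T               # sum_l lam_l^{1/2} f_l phi_l(t)

def bootstrap_statistics(X1, X2, dt, n_boot=100, seed=0):
    """N_boot replicates of T_n^* computed from Gaussian bootstrap samples."""
    rng = np.random.default_rng(seed)
    return np.array([
        hs_statistic(gaussian_kl_sample(X1, dt, rng), gaussian_kl_sample(X2, dt, rng), dt)
        for _ in range(n_boot)
    ])

# Hypothetical data on a grid of [0, 1].
rng0 = np.random.default_rng(2)
grid = np.linspace(0.0, 1.0, 30)
dt = grid[1] - grid[0]
basis = np.vstack([np.sin(np.pi * grid), np.cos(np.pi * grid)])
X1 = rng0.standard_normal((60, 2)) @ basis
X2 = rng0.standard_normal((70, 2)) @ basis
t_star = bootstrap_statistics(X1, X2, dt, n_boot=50)
```

The array `t_star` holds the $N_{\mathrm{boot}}$ replicated values; how they are turned into a test is left as in the text.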
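The drawback of the procedure described above is that it assumes that the underlying distribution is known; hence, it cannot be applied in many situations. For that reason, we will consider a bootstrap calibration for the distribution of the test that can be described as follows.
Step 1 Given a sample $X_{i,1},\dots,X_{i,n_i}$, let $\widehat{\Upsilon}_i$ be consistent estimators of $\Upsilon_i$ for $i=1,2$. Define $\widehat{\Upsilon}=\widehat{\tau}_1^{-1}\widehat{\Upsilon}_1+\widehat{\tau}_2^{-1}\widehat{\Upsilon}_2$ with $\widehat{\tau}_i=n_i/(n_1+n_2)$.
Step 2 For $1\le\ell\le q_n$, denote by $\widehat{\theta}_\ell$ the positive eigenvalues of $\widehat{\Upsilon}$.
Step 3 Generate $Z_1^*,\dots,Z_{q_n}^*$ i.i.d. such that $Z_i^*\sim N(0,1)$ and let $U_n^*=\sum_{j=1}^{q_n}\widehat{\theta}_j{Z_j^*}^2$.
Step 4 Repeat Step 3 $N_{\mathrm{boot}}$ times, to get $N_{\mathrm{boot}}$ values $U_{nr}^*$ for $1\le r\le N_{\mathrm{boot}}$.
The $(1-\alpha)$-quantile of the asymptotic distribution of $T_n$ can be approximated by the $(1-\alpha)$-quantile of the empirical distribution of $U_{nr}^*$ for $1\le r\le N_{\mathrm{boot}}$. The $p$-value can be estimated by $\widehat{p}=s/N_{\mathrm{boot}}$, where $s$ is the number of $U_{nr}^*$ which are larger than or equal to the observed value of $T_n$.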
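The simulation part of the calibration (Steps 3 and 4, the quantile and the $p$-value) can be sketched as follows. The estimated eigenvalues `theta_hat` are taken as given, since they depend on how $\Upsilon_i$ is estimated; the function name, the default values and the numbers in the usage line are hypothetical.

```python
import math
import random

def bootstrap_calibration(theta_hat, t_obs, n_boot=2000, alpha=0.05, seed=0):
    """Simulate U*_n = sum_j theta_hat_j * Z_j^2 and return (critical value, p-value)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_boot):
        # Step 3: one replicate of the weighted sum of independent chi-square(1) variables
        draws.append(sum(th * rng.gauss(0.0, 1.0) ** 2 for th in theta_hat))
    draws.sort()
    # empirical (1 - alpha)-quantile of the N_boot replicates (Step 4)
    k = min(n_boot - 1, math.ceil((1 - alpha) * n_boot) - 1)
    critical = draws[k]
    # p-value: proportion of replicates larger than or equal to the observed T_n
    p_value = sum(u >= t_obs for u in draws) / n_boot
    return critical, p_value

# Usage with made-up eigenvalue estimates and a made-up observed statistic.
critical, p_value = bootstrap_calibration([2.0, 1.0, 0.5], t_obs=6.0)
print(critical, p_value)
```

The null hypothesis is rejected at level $\alpha$ when the observed $T_n$ exceeds `critical`, or equivalently when `p_value` is small.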
Remark 3.3. Note that this procedure depends only on the asymptotic distribution of $\widehat{\Gamma}_i$. For the sample covariance estimator, the covariance operator $\Upsilon_i$ is given by (1). Hence, for Gaussian samples, using that $f_{ij}$ are independent and $f_{ij}\sim N(0,1)$, $\Upsilon_i$ can be estimated using, as consistent estimators of the eigenvalues and eigenfunctions of $\Gamma_i$, the eigenvalues and eigenfunctions of the sample covariance. For non-Gaussian samples, $\Upsilon_i$ can be estimated noticing that
$$s_{im}s_{ir}s_{io}s_{ip}\,E(f_{im}f_{ir}f_{io}f_{ip})=E\left(\langle X_{i,1},\phi_{i,m}\rangle\langle X_{i,1},\phi_{i,r}\rangle\langle X_{i,1},\phi_{i,o}\rangle\langle X_{i,1},\phi_{i,p}\rangle\right).$$
When considering other asymptotically normal estimators of Γi, such as the smoothed estimators Γsi for L2(I) trajectories, the estimators need to be adapted.
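For Gaussian samples, the eigenvalues needed in the calibration can be written in closed form. The following is a sketch under assumptions that are not spelled out in the text: with independent $N(0,1)$ scores, $E[f_mf_rf_of_p]=\delta_{mr}\delta_{op}+\delta_{mo}\delta_{rp}+\delta_{mp}\delta_{ro}$, so that a direct computation from (1) suggests that $\Upsilon_i$ has eigenvalues $2\lambda_m\lambda_r$ for $m\le r$, with eigen-elements $\phi_m\otimes\phi_m$ and $(\phi_m\otimes\phi_r+\phi_r\otimes\phi_m)/\sqrt{2}$. Under $H_0$ and Gaussianity, $\Upsilon_1=\Upsilon_2$, hence $\Upsilon=(\tau_1^{-1}+\tau_2^{-1})\Upsilon_1$. The function name and the inputs below are hypothetical.

```python
def gaussian_theta(lam, tau1, tau2, q):
    """q largest eigenvalues of tau1^{-1} Ups_1 + tau2^{-1} Ups_2 under H0 and Gaussianity.

    lam: eigenvalues of the common covariance operator (in practice, estimates).
    Assumes the closed form 2 * lam_m * lam_r (m <= r) derived for Gaussian scores.
    """
    c = 1.0 / tau1 + 1.0 / tau2
    theta = [c * 2.0 * lam[m] * lam[r]
             for m in range(len(lam)) for r in range(m, len(lam))]
    theta.sort(reverse=True)
    return theta[:q]

# Usage with made-up covariance eigenvalues and equal sample proportions.
theta_hat = gaussian_theta([2.0, 1.0], tau1=0.5, tau2=0.5, q=3)
print(theta_hat)  # [32.0, 16.0, 8.0]
```

A quick consistency check is that the eigenvalues add up to $(\tau_1^{-1}+\tau_2^{-1})\{(\sum_\ell\lambda_\ell)^2+\sum_\ell\lambda_\ell^2\}$, which equals $(\tau_1^{-1}+\tau_2^{-1})E\|U_1\|_{\mathcal{F}}^2$ in the Gaussian case.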
3.2 Validity of bootstrap procedure
The following theorem entails the validity of the bootstrap calibration method. It states that, under $H_0$, the bootstrap distribution of $U_n^*$ converges to the asymptotic null distribution of $T_n$. This fact ensures that the asymptotic significance level of the test based on the bootstrap critical value is indeed $\alpha$.
Theorem 3.2. Let $q_n$ be such that $q_n/\sqrt{n}\to 0$ and $\widetilde{X}_n=(X_{1,1},\dots,X_{1,n_1},X_{2,1},\dots,X_{2,n_2})$. Denote by $F_{U_n^*|\widetilde{X}_n}(\cdot)=P(U_n^*\le\cdot\,|\,\widetilde{X}_n)$. Then, under the assumptions of Theorem 3.1, if $\sqrt{n}\,\|\widehat{\Upsilon}-\Upsilon\|=O_P(1)$, we have that
$$\rho_k(F_{U_n^*|\widetilde{X}_n},F_U)\stackrel{p}{\longrightarrow}0,\qquad(5)$$
where $F_U$ denotes the distribution function of $U=\sum_{\ell\ge 1}\theta_\ell Z_\ell^2$, with $Z_\ell\sim N(0,1)$ independent of each other, and $\rho_k(F,G)$ stands for the Kolmogorov distance between distribution functions $F$ and $G$.
3.3 Behaviour under contiguous alternatives
In this section, we study the behaviour of the test statistic $T_n$ under a set of contiguous alternatives satisfying a functional common principal component model. In this sense, under those local alternatives, the processes $X_i$, $i=1,2$, can be written as
$$X_1=\mu_1+\sum_{\ell=1}^{\infty}\lambda_\ell^{1/2}f_{1\ell}\,\phi_\ell\quad\text{and}\quad X_2=\mu_2+\sum_{\ell=1}^{\infty}\left(\lambda_{n,\ell}^{(2)}\right)^{1/2}f_{2\ell}\,\phi_\ell\qquad(6)$$
with $\lambda_1\ge\lambda_2\ge\dots\ge 0$, $\lambda_{n,\ell}^{(2)}\to\lambda_\ell$ at a given rate, while $f_{i\ell}$ are random variables such that $E(f_{i\ell})=0$, $E(f_{i\ell}^2)=1$, $E(f_{i\ell}f_{is})=0$ for $\ell\ne s$. For simplicity, we have omitted the subscript 1 in $\lambda_{1,\ell}$. Hence, we are considering as alternatives a functional common principal component model which includes, as a particular case, proportional alternatives of the form $\Gamma_{2,n}=\rho_n\Gamma_1$, with $\rho_n\to 1$. For details on the functional principal component model, see for instance, Benko et al. (2009) and Boente et al. (2010).
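Theorem 3.3. Let $X_{i,1},\dots,X_{i,n_i}$ for $i=1,2$ be independent observations from two independent distributions in $\mathcal{H}$, with mean $\mu_i$ and covariance operator $\Gamma_i$ such that $\Gamma_2=\Gamma_{2,n}=\Gamma_1+n^{-1/2}\Gamma$, with $\Gamma=\sum_{\ell\ge 1}\Delta_\ell\lambda_\ell\,\phi_\ell\otimes\phi_\ell$. Furthermore, assume that $X_{i,j}\sim X_i$ where $X_i$ satisfy (6) with $\lambda_{n,\ell}^{(2)}=\lambda_\ell(1+n^{-1/2}\Delta_\ell)$ and that $E(\|X_{i,1}\|^4)<\infty$ for $i=1,2$. Let $n=n_1+n_2$ and assume also that $n_i/n\to\tau_i$ with $\tau_i\in(0,1)$. Let $\widehat{\Gamma}_i$ be the sample covariance operator of the $i$-th population and denote by
$$\Upsilon_i=\sum_{m,r,o,p}s_ms_rs_os_p\,E[f_{im}f_{ir}f_{io}f_{ip}]\;\phi_m\otimes\phi_r\,\tilde{\otimes}\,\phi_o\otimes\phi_p-\sum_{m,r}\lambda_m\lambda_r\;\phi_m\otimes\phi_m\,\tilde{\otimes}\,\phi_r\otimes\phi_r,$$
where $s_m=\lambda_m^{1/2}$. Then, if $\sum_{\ell=1}^{\infty}\lambda_\ell\Delta_\ell<\infty$, $\sum_{\ell=1}^{\infty}\lambda_\ell\Delta_\ell\sigma_{4,\ell}<\infty$, $\sum_{\ell=1}^{\infty}\lambda_\ell\Delta_\ell^2\sigma_{4,\ell}<\infty$, $\sum_{\ell=1}^{\infty}\lambda_\ell\Delta_\ell^2<\infty$ and $\sum_{\ell=1}^{\infty}\lambda_\ell\sigma_{4,\ell}<\infty$, with $\sigma_{4,\ell}^2=E(f_{2\ell}^4)$, we get that
a) $\sqrt{n_2}\,(\widehat{\Gamma}_2-\Gamma_1)\stackrel{D}{\longrightarrow}U_2+\tau_2^{1/2}\Gamma$, with $U_2$ a zero mean Gaussian random element with covariance operator $\Upsilon_2$.
b) Denote by $\{\theta_\ell\}_{\ell\ge 1}$ the eigenvalues of the operator $\Upsilon=\tau_1^{-1}\Upsilon_1+\tau_2^{-1}\Upsilon_2$. Moreover, let $\upsilon_\ell$ be an orthonormal basis of $\mathcal{F}$ such that $\upsilon_\ell$ is the eigenfunction of $\Upsilon$ related to $\theta_\ell$ and consider the expansion $\Gamma=\sum_{\ell\ge 1}\eta_\ell\upsilon_\ell$, with $\sum_{\ell\ge 1}\eta_\ell^2<\infty$. Then,
$$T_n=n\|\widehat{\Gamma}_1-\widehat{\Gamma}_2\|_{\mathcal{F}}^2\stackrel{D}{\longrightarrow}\sum_{\ell\ge 1}\theta_\ell\left(Z_\ell+\frac{\eta_\ell}{\sqrt{\theta_\ell}}\right)^2,$$
where $Z_\ell$ are independent and $Z_\ell\sim N(0,1)$.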
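The limit in part b) can be used to approximate the asymptotic power against a given local alternative by simulation. The sketch below truncates the series to finitely many terms and uses made-up values of $\theta_\ell$ and $\eta_\ell$; the function names and defaults are hypothetical.

```python
import math
import random

def simulate_limit(theta, eta, n_draws, rng):
    """Draws from sum_l theta_l (Z_l + eta_l / sqrt(theta_l))^2, written as
    sum_l (sqrt(theta_l) Z_l + eta_l)^2 to avoid dividing by theta_l."""
    return [sum((math.sqrt(th) * rng.gauss(0.0, 1.0) + e) ** 2
                for th, e in zip(theta, eta))
            for _ in range(n_draws)]

def asymptotic_power(theta, eta, alpha=0.05, n_draws=4000, seed=0):
    """Monte Carlo approximation of the limiting rejection probability."""
    rng = random.Random(seed)
    # critical value: (1 - alpha)-quantile of the null limit (all eta_l = 0)
    null = sorted(simulate_limit(theta, [0.0] * len(theta), n_draws, rng))
    crit = null[min(n_draws - 1, math.ceil((1 - alpha) * n_draws) - 1)]
    # proportion of draws from the shifted limit exceeding the critical value
    alt = simulate_limit(theta, eta, n_draws, rng)
    return sum(u > crit for u in alt) / n_draws

# Usage with hypothetical eigenvalues and shift coefficients.
theta = [1.0, 0.5, 0.25]
print(asymptotic_power(theta, [0.0, 0.0, 0.0]))   # close to alpha
print(asymptotic_power(theta, [1.5, 0.5, 0.0]))   # larger than alpha
```

When all $\eta_\ell$ vanish the limit reduces to the null distribution of Theorem 3.1, so the approximate power falls back to the nominal level.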
4 Test statistics for k–populations
In this Section, we consider tests for the equality of the covariance operators of $k$ populations. That is, if $\Gamma_i$ denotes the covariance operator of the $i$-th population, we wish to test the null hypothesis
$$H_0:\Gamma_1=\dots=\Gamma_k.\qquad(7)$$
Let $n=n_1+\dots+n_k$ and assume that $n_i/n\to\tau_i$, $0<\tau_i<1$, $\sum_{i=1}^{k}\tau_i=1$. A natural generalization of the proposal given in Section 3 is to consider the following test statistic
$$T_{k,n}=n\sum_{j=2}^{k}\|\widehat{\Gamma}_j-\widehat{\Gamma}_1\|_{\mathcal{F}}^2,\qquad(8)$$
where $\widehat{\Gamma}_i$ stands for the sample covariance operator of the $i$-th population. The following result states the asymptotic distribution of $T_{k,n}$ under the null hypothesis.
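As an illustration of (8), the following sketch computes $T_{k,n}$ for curves observed on a common equispaced grid, using the first sample as the reference population. The discretization, the function names and the simulated data are assumptions made for the example.

```python
import numpy as np

def cov_kernel(X):
    """Sample covariance kernel of curves stored as rows of X (divisor n_i)."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / X.shape[0]

def k_sample_statistic(samples, dt):
    """T_{k,n} = n * sum_{j>=2} ||Gamma_j_hat - Gamma_1_hat||_F^2 (Riemann-sum norm)."""
    n = sum(X.shape[0] for X in samples)
    g1 = cov_kernel(samples[0])
    total = sum(np.sum((cov_kernel(X) - g1) ** 2) for X in samples[1:])
    return n * total * dt ** 2

# Hypothetical data: three samples with the same covariance structure.
rng = np.random.default_rng(4)
grid = np.linspace(0.0, 1.0, 30)
dt = grid[1] - grid[0]
basis = np.vstack([np.sin(np.pi * grid), np.cos(np.pi * grid)])
samples = [rng.standard_normal((m, 2)) @ basis for m in (50, 60, 70)]
t_k = k_sample_statistic(samples, dt)
print(t_k)
```

For $k=2$ the function reduces to the two-sample statistic $T_n$ of Section 3.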
Theorem 4.1. Let $X_{i,1},\dots,X_{i,n_i}$, for $1\le i\le k$, be independent observations from $k$ independent distributions in $\mathcal{H}$, with mean $\mu_i$ and covariance operator $\Gamma_i$ such that $E(\|X_{i,1}\|^4)<\infty$. Let $\widehat{\Gamma}_i$ be the sample covariance operator of the $i$-th population. Assume
Remark 4.1. Note that Theorem 4.1 is a natural extension of its analogue in the finite-dimensional case. To be more precise, let $Z_{ij}\in\mathbb{R}^p$ with $1\le i\le k$ and $1\le j\le n_i$ be independent random vectors and let $\widehat{\Sigma}_i$ be their sample covariance matrix. Then, $\sqrt{n_i}\,V_i=\sqrt{n_i}\,(\widehat{\Sigma}_i-\Sigma_i)$ converges to a multivariate normal distribution with mean zero and covariance matrix $\Upsilon_i$. Let
A=
whereIp stands for the identity matrix of orderp. Then, straightforward calculations allow
Therefore, under the null hypothesis of equality of the covariance matricesΣi, we have that
nPki=2kΣbi−Σb1k2 =k√nAVk2 −→D Pkp
4
ℓ=1θℓZℓ2whereV= (V1, . . . ,Vk)andθ1, θ2, . . . , θkp4
are the eigenvalues of Υ. Note that the matrix Υ is the finite dimensional version of the covariance operatorΥw.
Remark 4.2. The conclusion of Theorem 4.1 still holds if, instead of the sample covariance operator, one considers consistent and asymptotically normally distributed estimators $\widetilde{\Gamma}_i$ of the covariance operator $\Gamma_i$ such that $\sqrt{n_i}\,(\widetilde{\Gamma}_i-\Gamma_i)\stackrel{D}{\longrightarrow}U_i$, where $U_i$ is a zero mean Gaussian random element of $\mathcal{F}$ with Hilbert–Schmidt covariance operator $\widetilde{\Upsilon}_i$. For instance, the scatter estimators proposed by Locantore et al. (1999) and further developed by Gervini (2008) may be considered, if one suspects that outliers may be present in the sample. These estimators weight each observation according to its distance to the center of the sample. To be more precise, let us define the spatial median of the $i$-th population as the value $\eta_i$ such that
$$\eta_i=\mathop{\mathrm{argmin}}_{\theta\in\mathcal{H}}E\left(\|X_i-\theta\|-\|X_i\|\right)\qquad(9)$$
and the spatial covariance operator $\Gamma_i^s$ as
$$\Gamma_i^s=E\left\{(X_i-\eta_i)\otimes(X_i-\eta_i)/\|X_i-\eta_i\|^2\right\},\qquad(10)$$
with $\eta_i$ being the spatial median. It is well known that, when second moments exist, $\Gamma_i^s$ is not equal to the covariance operator of the $i$-th sample, even if they share the same eigenfunctions when $X_i$ has a finite Karhunen–Loève expansion and the components $f_{i\ell}$ in (2) have a symmetric distribution, see Gervini (2008). Indeed, under symmetry of $f_{i\ell}$, $\eta_i=\mu_i$ and we have that $\Gamma_i^s=\sum_{\ell\ge 1}\lambda_{i,\ell}^s\,\phi_{i,\ell}\otimes\phi_{i,\ell}$ with
$$\lambda_{i,\ell}^s=\lambda_{i,\ell}\,E\left(\frac{f_{i\ell}^2}{\sum_{s\ge 1}\lambda_{i,s}f_{is}^2}\right).$$
The point to be noted here is that even if $\Gamma_i^s$ is not proportional to $\Gamma_i$, under the null hypothesis $H_0:\Gamma_1=\dots=\Gamma_k$, we also have that $H_0^s:\Gamma_1^s=\dots=\Gamma_k^s$ is true when the components $f_{i\ell}$ are such that $f_{i\ell}\sim f_{1\ell}$ for $2\le i\le k$, $\ell\ge 1$, which means that all the populations have the same underlying distribution, except for the location parameter and the covariance operator. Thus, one can test $H_0^s$ through a statistic analogous to $T_{k,n}$ defined in (8) but based on estimators of $\Gamma_i^s$.
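Estimators of $\eta_i$ and $\Gamma_i^s$ are defined through their empirical versions as follows. The estimator of the spatial median is the value $\widehat{\eta}_i$ minimizing over $\mu$ the quantity $\sum_{j=1}^{n_i}\|X_{i,j}-\mu\|$, while the spatial covariance operator estimator is defined as
$$\widehat{\Gamma}_i^s=\frac{1}{n_i}\sum_{j=1}^{n_i}\frac{(X_{i,j}-\widehat{\eta}_i)\otimes(X_{i,j}-\widehat{\eta}_i)}{\|X_{i,j}-\widehat{\eta}_i\|^2}.$$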
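The two empirical quantities above can be sketched on a grid as follows. The minimization defining the spatial median is carried out here with Weiszfeld-type iterations, which is a choice made for this example and is not prescribed by the text; the discretization, the function names and the heavy-tailed toy data are likewise assumptions.

```python
import numpy as np

def l2_norm_rows(V, dt):
    """Riemann-sum L2 norm of each row of V."""
    return np.sqrt(np.sum(V ** 2, axis=1) * dt)

def spatial_median(X, dt, n_iter=500, tol=1e-10, eps=1e-12):
    """Weiszfeld iterations for the minimizer over mu of sum_j ||X_j - mu||."""
    eta = X.mean(axis=0)                                   # starting point
    for _ in range(n_iter):
        w = 1.0 / np.maximum(l2_norm_rows(X - eta, dt), eps)
        new_eta = (w[:, None] * X).sum(axis=0) / w.sum()   # weighted mean of the curves
        step = np.sqrt(np.sum((new_eta - eta) ** 2) * dt)
        eta = new_eta
        if step < tol:
            break
    return eta

def spatial_covariance_kernel(X, dt):
    """Kernel of the empirical spatial covariance operator."""
    R = X - spatial_median(X, dt)
    S = R / np.maximum(l2_norm_rows(R, dt), 1e-12)[:, None]  # curves scaled to unit norm
    return S.T @ S / X.shape[0]

# Hypothetical heavy-tailed sample on a grid of [0, 1].
rng = np.random.default_rng(5)
grid = np.linspace(0.0, 1.0, 30)
dt = grid[1] - grid[0]
basis = np.vstack([np.sin(np.pi * grid), np.cos(np.pi * grid)])
X = rng.standard_t(df=2, size=(150, 2)) @ basis
eta_hat = spatial_median(X, dt)
K = spatial_covariance_kernel(X, dt)
print(np.trace(K) * dt)  # the spatial covariance operator has trace 1
```

Because each centered curve is scaled to unit norm, a single extreme observation contributes no more than any other curve, which is the source of the robustness mentioned in Remark 4.2.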
Gervini (2008) derived the consistency of these estimators and the asymptotic normality of $\widehat{\eta}_i$. To our knowledge, the asymptotic distribution of $\widehat{\Gamma}_i^s$ is unknown. However, we conjecture that, when the components $f_{i\ell}$ in (2) have a symmetric distribution, its asymptotic behaviour will be the same as that of
$$\widetilde{\Gamma}_i^s=\frac{1}{n_i}\sum_{j=1}^{n_i}\frac{(X_{i,j}-\mu_i)\otimes(X_{i,j}-\mu_i)}{\|X_{i,j}-\mu_i\|^2},$$
since $\widehat{\eta}_i$ is a root-$n$ consistent estimator of $\eta_i=\mu_i$. The asymptotic distribution of $\widehat{\Gamma}_i^s$ is beyond the scope of this paper, while that of $\widetilde{\Gamma}_i^s$ can be derived from the results in Dauxois et al. (1982), allowing us to apply the results in Theorem 4.1 at least when the center of all the populations is assumed to be known.
Remark 4.3. As in Section 3, a bootstrap procedure can be considered. In order to estimate $\theta_\ell$, we can consider estimators of the operators $\Upsilon_i$ for $1\le i\le k$ and thus estimate $\Upsilon_w$. Therefore, if $\widehat{\theta}_\ell$ are the positive eigenvalues of $\widehat{\Upsilon}_w$, a bootstrap procedure can be defined using Steps 3 and 4 in Section 3.
Acknowledgments
This research was partially supported by Grants X-018 and X-447 from the Universidad de Buenos Aires, PID 216 and PIP 592 from CONICET, and PICT 821 and 883 from ANPCYT, Argentina.
Appendix
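Proof of Theorem 3.1. Since $n_i/n\to\tau_i\in(0,1)$, the independence between the two estimated operators allows us to conclude that
$$\sqrt{n}\left\{(\widetilde{\Gamma}_1-\Gamma_1)-(\widetilde{\Gamma}_2-\Gamma_2)\right\}\stackrel{D}{\longrightarrow}\frac{1}{\sqrt{\tau_1}}U_1-\frac{1}{\sqrt{\tau_2}}U_2\sim U,$$
where $U$ is a Gaussian random element of $\mathcal{F}$ with covariance operator given by $\Upsilon=\tau_1^{-1}\Upsilon_1+\tau_2^{-1}\Upsilon_2$. Then, we easily get
$$n\left\langle(\widetilde{\Gamma}_1-\Gamma_1)-(\widetilde{\Gamma}_2-\Gamma_2),(\widetilde{\Gamma}_1-\Gamma_1)-(\widetilde{\Gamma}_2-\Gamma_2)\right\rangle_{\mathcal{F}}\stackrel{D}{\longrightarrow}\sum_{\ell\ge 1}\theta_\ell Z_\ell^2,$$
where $\{\theta_\ell\}_{\ell\ge 1}$ are the eigenvalues associated to the operator $\Upsilon$.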
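Proof of Theorem 3.2. Let $\widetilde{X}_n=(X_{1,1},\dots,X_{1,n_1},X_{2,1},\dots,X_{2,n_2})$, $\widetilde{Z}_n=(Z_1,\dots,Z_{q_n})$ and $\widetilde{Z}=\{Z_\ell\}_{\ell\ge 1}$ with $Z_i\sim N(0,1)$ independent. Define $\widehat{U}_n(\widetilde{X}_n,\widetilde{Z}_n)=\sum_{\ell=1}^{q_n}\widehat{\theta}_\ell Z_\ell^2$, $U_n(\widetilde{Z}_n)=\sum_{\ell=1}^{q_n}\theta_\ell Z_\ell^2$ and $U(\widetilde{Z})=\sum_{\ell=1}^{\infty}\theta_\ell Z_\ell^2$.
First note that $|\widehat{\theta}_\ell-\theta_\ell|\le\|\widehat{\Upsilon}-\Upsilon\|$ for each $\ell$ (see Kato, 1966), which implies that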
$$\sum_{\ell=1}^{q_n}|\widehat{\theta}_\ell-\theta_\ell|\le q_n\,\|\widehat{\Upsilon}-\Upsilon\|.$$
On the other hand, we have
We also have the following inequalities
P(Ubn≤t|X˜n) = P(Ubn≤t∩ |Un− U|< ǫ|X˜n) +P(Ubn≤t∩ |Un− U|> ǫ|X˜n)
As we mentioned in Remark 3.1, FU is a continuous distribution function on R and so, uniformly continuous, hence limǫ→0 supt∈R ∆ǫ(t) = 0, which implies thatρk(F
Proof of Theorem 3.3. Using the Karhunen–Lo´eve representation, we can write
(1/n2)Pnj=12 (Z0,j⊗Vj+Vj⊗Z0,j). Using thatX2,j−µ2=Z0,j+Vj, we obtain the following
expansion Γe2−Γ1=ΓbZ0+ΓbV +Ae.
The proof will be carried out in several steps, by showing that √n
where U2 is a zero mean Gaussian random element with covariance operator Υ2. Using that the covariance operator ofZ0,j is Γ1, (A.5) follows from Dauxois et al. (1982).
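We will derive (A.2). Note that $X_{2,j}-\mu_2=Z_{0,j}+V_j$ and $\widehat{\Gamma}_2-\widetilde{\Gamma}_2=-(\overline{X}_2-\mu_2)\otimes(\overline{X}_2-\mu_2)$. Then, it is enough to prove that $\sqrt{n_2}\,(\overline{X}_2-\mu_2)=\sqrt{n_2}\,(\overline{Z}_0+\overline{V})=O_P(1)$.
By the central limit theorem in Hilbert spaces, we get that $\sqrt{n_2}\,\overline{Z}_0$ converges in distribution, and so it is tight, i.e., $\sqrt{n_2}\,\overline{Z}_0=O_P(1)$.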
where the last bound follows from the Cauchy–Schwarz inequality and the fact that E(f2
thath(1 + ∆ℓ/√n)1/2−1
Finally, to derive (A.4) note that analogous arguments allow to show that
E(n2kAe −E(Ae)k2) ≤
concluding the proof of (A.4). The proof of a) follows easily combining (A.2) to (A.5).
b) From a), we have that√nΓb2−Γ1
D
−→Γ+(1/√τ2)U2whereU2 a zero mean Gaussian random element with covariance operator Υ2. On the other hand, the results in Dauxois et al. (1982) entail that √n1 element with covariance operator Υ1 and so, √n
b
Γ1−Γ1
D
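To conclude the proof, we have to obtain the distribution of $\|\Gamma+U\|_{\mathcal{F}}^2$. Since $U$ is a zero mean Gaussian random element of $\mathcal{F}$ with covariance operator $\Upsilon$, we have that $U$ can be written as $\sum_{\ell\ge 1}\theta_\ell^{1/2}Z_\ell\upsilon_\ell$, where $Z_\ell$ are i.i.d. random variables such that $Z_\ell\sim N(0,1)$. Hence, $\Gamma+U=\sum_{\ell\ge 1}\left(\eta_\ell+\theta_\ell^{1/2}Z_\ell\right)\upsilon_\ell$ and so,
$$\|\Gamma+U\|_{\mathcal{F}}^2=\sum_{\ell\ge 1}\left(\eta_\ell+\theta_\ell^{1/2}Z_\ell\right)^2=\sum_{\ell\ge 1}\theta_\ell\left(\frac{\eta_\ell}{\theta_\ell^{1/2}}+Z_\ell\right)^2,$$
concluding the proof.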
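Proof of Theorem 4.1. Consider the process $V_{k,n}=\{\sqrt{n}\,(\widehat{\Gamma}_i-\Gamma_i)\}_{1\le i\le k}$. The independence of the samples and among populations, together with the results stated in Dauxois et al. (1982), allow to show that $V_{k,n}$ converges in distribution to a zero mean Gaussian random element $U$ of $\mathcal{F}^k$ with covariance operator $\widetilde{\Upsilon}$. More precisely, we get that
$$\{\sqrt{n}\,(\widehat{\Gamma}_i-\Gamma_i)\}_{1\le i\le k}\stackrel{D}{\longrightarrow}U=(U_1,\dots,U_k),$$
where $U_1,\dots,U_k$ are independent random processes of $\mathcal{F}$ with covariance operators $\tau_i^{-1}\Upsilon_i$, respectively.
Let $A:\mathcal{F}^k\to\mathcal{F}^{k-1}$ be a linear operator given by $A(V_1,\dots,V_k)=(V_2-V_1,\dots,V_k-V_1)$. The continuous mapping theorem guarantees that $A(\sqrt{n}\,(\widehat{\Gamma}_1-\Gamma_1),\dots,\sqrt{n}\,(\widehat{\Gamma}_k-\Gamma_k))\stackrel{D}{\longrightarrow}W$, where $W$ is a zero mean Gaussian random element of $\mathcal{F}^{k-1}$ with covariance operator $\Upsilon_w=A\widetilde{\Upsilon}A^*$, where $A^*$ denotes the adjoint operator of $A$. It is easy to see that the adjoint operator $A^*:\mathcal{F}^{k-1}\to\mathcal{F}^k$ is given by $A^*(u_1,\dots,u_{k-1})=(-\sum_{i=1}^{k-1}u_i,u_1,\dots,u_{k-1})$. Hence, as $U_1,\dots,U_k$ are independent, we conclude that
$$\Upsilon_w(u_1,\dots,u_{k-1})=\left(\frac{1}{\tau_2}\Upsilon_2(u_1),\dots,\frac{1}{\tau_k}\Upsilon_k(u_{k-1})\right)+\frac{1}{\tau_1}\Upsilon_1\left(\sum_{i=1}^{k-1}u_i\right).$$
Finally,
$$T_{k,n}=n\sum_{j=2}^{k}\|\widehat{\Gamma}_j-\widehat{\Gamma}_1\|_{\mathcal{F}}^2\stackrel{D}{\longrightarrow}\sum_{\ell\ge 1}\theta_\ell Z_\ell^2,$$
where $Z_\ell$ are i.i.d. standard normal random variables and $\theta_\ell$ are the eigenvalues of the operator $\Upsilon_w$.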
References
[1] Benko, M., Härdle, W. & Kneip, A. (2009). Common Functional Principal Components. Annals of Statistics, 37, 1-34.
[2] Boente, G. & Fraiman, R. (2000). Kernel-based functional principal components. Statistics and Probability Letters, 48, 335-345.
[4] Cardot, H., Ferraty, F., Mas, A. & Sarda, P. (2003). Testing hypotheses in the functional linear model. Scandinavian Journal of Statistics, 30, 241-255.
[5] Cuevas, A., Febrero, M. & Fraiman, R. (2004). An anova test for functional data. Computational Statistics & Data Analysis, 47, 111-122.
[6] Dauxois, J., Pousse, A. & Romain, Y. (1982). Asymptotic theory for the principal component analysis of a vector random function: Some applications to statistical inference. Journal of Multivariate Analysis, 12, 136-154.
[7] Gabrys, R. & Kokoszka, P. (2007). Portmanteau Test of Independence for Functional Observations. Journal of the American Statistical Association, 102, 1338-1348.
[8] Gabrys, R., Horváth, L. & Kokoszka, P. (2010). Tests for error correlation in the functional linear model. Journal of the American Statistical Association, 105, 1113-1125.
[9] Fan, J. & Lin, S.-K. (1998). Tests of significance when the data are curves. Journal of the American Statistical Association, 93, 1007-1021.
[10] Ferraty, F., Vieu, Ph. & Viguier–Pla, S. (2007). Factor-based comparison of groups of curves. Computational Statistics & Data Analysis, 51, 4903-4910.
[11] Gupta, A. & Xu, J. (2006). On some tests of the covariance matrix under general conditions. Annals of the Institute of Statistical Mathematics, 58, 101-114.
[12] Horváth, L., Husková, M. & Kokoszka, P. (2010). Testing the stability of the functional autoregressive process. Journal of Multivariate Analysis, 101, 352-367.
[13] Kato, T. (1966). Perturbation Theory for Linear Operators. Springer-Verlag, New York.
[14] Ledoit, O. & Wolf, M. (2002). Some hypothesis tests for the covariance matrix when the dimension is large compared to the sample size. Annals of Statistics, 30, 4, 1081-1102.
[15] Neuhaus, G. (1980). A note on computing the distribution of the norm of Hilbert space valued Gaussian random variables. Journal of Multivariate Analysis, 10, 19-25.
[16] Panaretos, V. M., Kraus, D. & Maddocks, J. H. (2010). Second-Order Comparison of Gaussian Random Functions and the Geometry of DNA Minicircles. Journal of the American Statistical Association, 105, 670-682.
[17] Schott, J. (2007). A test for the equality of covariance matrices when the dimension is large relative to the sample sizes. Computational Statistics & Data Analysis, 51, 12, 6535-6542.
[18] Seber, G. (1984). Multivariate Observations. John Wiley and Sons.
[19] Shen, Q. & Faraway, J. (2004). An F-test for Linear models with functional responses. Statistica Sinica, 14, 1239-1257.