3 Modification and Robustness

3.1 Model and Assumptions

Given $X_i$, let $Y_i = X_i - \mu$ with $E(Y_i) = 0$, $\mathrm{Cov}(Y_i) = \Sigma$. Having relaxed normality, our robustness evaluation will be based on the following general multivariate model:

$$Y_i = \Lambda U_i, \tag{4}$$

where $U_i \in \mathbb{R}^p$ with $E(U_i) = 0$, $\mathrm{Cov}(U_i) = I$, and $\Lambda \in \mathbb{R}^{p \times p}$ is a known constant matrix with $\Lambda^T\Lambda = A$, $\Lambda\Lambda^T = \Sigma > 0$, where $A$ is any positive semi-definite matrix. Model (4) contains multivariate normality as a special case and is often used for general multivariate inference; see Ahmad (2017) for details and references.
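As a small illustration of Model (4), the following sketch draws non-normal data and checks that the sample covariance is close to $\Lambda\Lambda^T$. The centred exponential entries (third central moment 2, fourth central moment 9), the Cholesky factor used as $\Lambda$, and the function name `simulate_model4` are assumptions made only for this example; they are not taken from the text.

```python
import numpy as np

def simulate_model4(n, Lam, rng):
    """Draw n observations Y_i = Lam @ U_i with non-normal U_i.

    The entries of U_i are i.i.d. centred Exp(1) variables (mean 0,
    variance 1); this is an illustrative choice, not the text's.
    """
    p = Lam.shape[1]
    U = rng.exponential(scale=1.0, size=(n, p)) - 1.0  # E(U)=0, Var(U)=1
    return U @ Lam.T                                   # row i is (Lam @ U_i)^T

rng = np.random.default_rng(0)
p = 4
B = rng.normal(size=(p, p))
Sigma = B @ B.T + p * np.eye(p)   # a positive definite target covariance
Lam = np.linalg.cholesky(Sigma)   # one valid choice: Lam @ Lam.T equals Sigma
A = Lam.T @ Lam                   # companion matrix A of Model (4)

Y = simulate_model4(200_000, Lam, rng)
S = np.cov(Y, rowvar=False)
# Relative error of the sample covariance; it should be small.
print(np.max(np.abs(S - Sigma)) / np.max(np.abs(Sigma)))
```

Any other $\Lambda$ with $\Lambda\Lambda^T = \Sigma$, for instance the symmetric square root, would serve equally well here.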

The normality-based modification in Ahmad and Ahmed (2020) is essentially based on a single extra assumption, the bound on the scaled eigenvalues stated below, whereas the present case under Model (4) requires Assumptions 1–3 together. Let $\lambda_j$, $j = 1, \ldots, p$, denote the eigenvalues of $\Sigma$, so that $\nu_1, \ldots, \nu_p$, $\nu_j = \lambda_j/p$, denote those of $\Sigma_0 = \Sigma/p$. Further, let $E(U_{ij}^3) = \gamma_1 \in \mathbb{R}$ and $E(U_{ij}^4) = \gamma_2 + 3$, $\gamma_2 \in \mathbb{R}^+$, be the third and fourth moments of the elements of $U_i$, respectively.

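Assumption 1 $\lim_{p\to\infty} \sum_{j=1}^{p} \nu_j = O(1)$.

Assumption 2 Let $\gamma_1, \gamma_2 < \infty$, where $\gamma_1, \gamma_2$ are defined above.

Assumption 3 $\lim_{p\to\infty} \mathrm{tr}(A \circ A)/\mathrm{tr}(A \otimes A) = 0$ with $A$ a positive semi-definite matrix, where $\circ$ and $\otimes$ denote Hadamard and Kronecker products, respectively.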

Assumption 1 is inevitably needed under Model (4), since the computations involve second moments of quadratic forms. Assumption 2 puts a bound on the sum of the scaled eigenvalues $\nu_j$, i.e., on the average of the eigenvalues $\lambda_j$. It is simple, effective, and commonly used in high-dimensional inference; moreover, it has an interesting consequence, $\lim_{p\to\infty} \sum_{j=1}^{p} \nu_j^2 = O(1)$, which will be referred to in the sequel. To see the practical applicability of Assumption 2 and its consequence, let $\Sigma$ be compound symmetric, which belongs to the group of spiked covariance structures, i.e., $\Sigma = (1-\rho)I + \rho J$ with $I$ the identity matrix, $J = \mathbf{1}\mathbf{1}^T$, $\mathbf{1}$ a vector of 1s, and $\rho \in \mathbb{R}$, $-1/(p-1) \le \rho \le 1$.

It can be easily shown that $\mathrm{tr}(\Sigma^i) = O(p^i)$, $i = 1, 2$, which satisfies the assumption and its consequence. Indeed, $\mathrm{tr}(\Sigma) = p$ and $\mathrm{tr}(\Sigma^2) = p[1 + (p-1)\rho^2]$, so that $\sum_j \nu_j = 1$ and $\sum_j \nu_j^2 = [1 + (p-1)\rho^2]/p \le 1$. Finally, Assumption 3 is mild, because the numerator is of much smaller order in $p$ than the denominator.
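The compound symmetric case can be checked numerically. The sketch below, assuming $A = \Sigma$ and using the hypothetical helper names `compound_symmetric` and `assumption_summaries`, computes the sum of the scaled eigenvalues, the sum of their squares, and the ratio of the Hadamard to the Kronecker trace for growing $p$. It relies on the identities $\mathrm{tr}(A \circ A) = \sum_i a_{ii}^2$ and $\mathrm{tr}(A \otimes A) = [\mathrm{tr}(A)]^2$.

```python
import numpy as np

def compound_symmetric(p, rho):
    """Sigma = (1 - rho) I + rho J, with J the p x p matrix of ones."""
    return (1.0 - rho) * np.eye(p) + rho * np.ones((p, p))

def assumption_summaries(Sigma):
    """Return (sum nu_j, sum nu_j^2, tr(A o A) / tr(A (x) A)) with A = Sigma."""
    p = Sigma.shape[0]
    nu = np.linalg.eigvalsh(Sigma) / p            # scaled eigenvalues nu_j
    hadamard_trace = np.sum(np.diag(Sigma) ** 2)  # tr(A o A) = sum of a_ii^2
    kronecker_trace = np.trace(Sigma) ** 2        # tr(A (x) A) = [tr(A)]^2
    return nu.sum(), np.sum(nu ** 2), hadamard_trace / kronecker_trace

rho = 0.5
for p in (10, 100, 1000):
    s1, s2, ratio = assumption_summaries(compound_symmetric(p, rho))
    print(p, s1, s2, ratio)
```

For this structure the first summary stays at 1, the second equals $[1 + (p-1)\rho^2]/p$ and therefore remains bounded, and the ratio equals $1/p$, which tends to 0 as required.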

Note that, in the computations below, $\Lambda$ in Model (4) will essentially appear as $\Sigma^{1/2}$, so that $A$ will be representing $\Sigma$. For example, the traces in Assumption 3 can be considered for $A$ as well as for $\Sigma$. In this context, with normality relaxed, Assumption 3 controls the behavior of the moments of estimators that compose the modified statistic; see Theorem 1 below.

3.2 Statistic, Its Limit, and Robustness

For brevity, we focus only on the charts for the individual observations case. The case of subgroup-means follows similarly. The statistic for the individual observations case, $T_i^2$, is given in Eq. (2), which, under the normality assumption and for fixed $p$, follows a beta distribution which provides the upper limit given after Eq. (2).

Alternatively, as an approximation for fixed $p$ and $n \to \infty$, a $\chi_p^2$ limit can also be used. These limits, however, are not applicable for a high-dimensional setup, i.e., when $p$ is large, and particularly for $p \gg n$, mainly due to the singularity of the estimated covariance matrix in $T_i^2$. The modified form of $T_i^2$ in Ahmad and Ahmed (2020), valid for the $p \gg n$ case, is defined as

$$W_i = \frac{n}{n-1}\,\frac{\|d_i\|^2}{\mathrm{tr}(\hat\Sigma_0)}, \tag{5}$$

where $\hat\Sigma_0 = \hat\Sigma/p$, $d_i = (X_i - \bar{X})/\sqrt{p}$, $i = 1, \ldots, n$, $\|\cdot\|$ denotes the Euclidean vector norm and $\mathrm{tr}(\cdot)$ is the trace operator. It is shown that, under normality and Assumption 2,

$$\frac{W_i - E(W_i)}{\hat\sigma_{W_i}} \xrightarrow{D} N(0, 1), \tag{6}$$

as $n, p \to \infty$, where $E(W_i) = 1 + o_P(1)$ and $\hat\sigma^2_{W_i}$, a consistent estimator of $\sigma^2_{W_i}$, is defined below; see also Theorem 2 and Corollary 4 in the reference mentioned above. To motivate the evaluation of $W_i$ in (5) for robustness under the general multivariate model (4), we first note that the limit of $W_i$ in Ahmad and Ahmed (2020) is obtained by using the consistency of $\mathrm{tr}(\hat\Sigma_0)$ for $n, p \to \infty$ and showing that $W_i$ has the same limit as that of

$$A_i = \frac{n}{n-1}\,\frac{\|d_i\|^2}{\mathrm{tr}(\Sigma_0)}. \tag{7}$$

Under the normality assumption, $E(A_i) = 1$, $\mathrm{Var}(A_i) = 2/f$, where $f = [\mathrm{tr}(\Sigma_0)]^2/\mathrm{tr}(\Sigma_0^2)$. This helps determine the limit of $A_i$, and thus that of $W_i$, as $\chi_f^2/f$, whose first two moments coincide with those of $A_i$.
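A minimal sketch of how $W_i$ in Eq. (5) might be computed when $p$ is much larger than $n$ is given below. It assumes that $d_i$ is formed from the observation centred at the sample mean and that the covariance estimator is the usual sample covariance with divisor $n-1$; both are assumptions of this example, as is the function name `modified_t2`. Under these assumptions the trace can be obtained from the squared norms alone, so no $p \times p$ matrix is formed.

```python
import numpy as np

def modified_t2(X):
    """Return (W, d2, tr_S0) for the rows X_i of an n x p data matrix.

    Assumptions of this sketch: d_i = (X_i - Xbar)/sqrt(p), and Sigma_hat
    is the sample covariance with divisor n - 1.
    """
    n, p = X.shape
    D = (X - X.mean(axis=0)) / np.sqrt(p)  # rows are d_i
    d2 = np.sum(D ** 2, axis=1)            # ||d_i||^2
    tr_S0 = d2.sum() / (n - 1)             # tr(Sigma_hat / p) without a p x p matrix
    W = (n / (n - 1)) * d2 / tr_S0         # Eq. (5)
    return W, d2, tr_S0

rng = np.random.default_rng(1)
n, p = 20, 500                             # p much larger than n
X = rng.normal(size=(n, p))
W, d2, tr_S0 = modified_t2(X)
print(W.mean())                            # equals 1 up to floating-point error
```

With this choice of estimator the $W_i$ average to exactly 1 within a sample, which mirrors $E(A_i) = 1$ at the level of the data.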

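When we replace normality with Model (4), $E(A_i) = 1$ remains the same, using $E(\|d_i\|^2) = [(n-1)/n]\,\mathrm{tr}(\Sigma_0)$, but $\mathrm{Var}(\|d_i\|^2)$ differs. The following lemma, the proof of which follows easily using Theorem 3, collects the moments of $\|d_i\|^2$ under Model (4):

Lemma 1 For $\|d_i\|^2$ defined above, we have, under Model (4),

$$E(\|d_i\|^2) = \frac{n-1}{n}\,\mathrm{tr}(\Sigma_0),$$

$$\mathrm{Var}(\|d_i\|^2) = \left(\frac{n-1}{n}\right)^2 \left[\frac{2\,\mathrm{tr}(\Sigma^2) + M_1}{p^2}\right],$$

$$\mathrm{Cov}(\|d_i\|^2, \|d_j\|^2) = \frac{1}{n^2} \left[\frac{2\,\mathrm{tr}(\Sigma^2) + M_1}{p^2}\right],$$

$\forall\, i \ne j$, $i, j = 1, \ldots, n$, where $M_1$ is defined in Theorem 3.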
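The normal special case of Lemma 1 ($M_1 = 0$) can be checked by simulation. The sketch below takes $\Sigma = I_p$, assumes $d_i = (X_i - \bar{X})/\sqrt{p}$, and reads the variance factor in the lemma as $[(n-1)/n]^2$; these readings, the sample sizes, and the function name `lemma1_monte_carlo` are assumptions of this example.

```python
import numpy as np

def lemma1_monte_carlo(n, p, reps, rng):
    """Simulate ||d_1||^2 and ||d_2||^2 for N(0, I_p) data, reps times."""
    X = rng.normal(size=(reps, n, p))
    D = (X - X.mean(axis=1, keepdims=True)) / np.sqrt(p)  # d_i = (X_i - Xbar)/sqrt(p)
    q = np.sum(D ** 2, axis=2)                            # shape (reps, n)
    return q[:, 0], q[:, 1]

n, p, reps = 4, 5, 100_000
rng = np.random.default_rng(3)
q1, q2 = lemma1_monte_carlo(n, p, reps, rng)

tr_S0 = 1.0        # tr(Sigma / p) for Sigma = I_p
tr_S2 = float(p)   # tr(Sigma^2) for Sigma = I_p
mean_theory = (n - 1) / n * tr_S0
var_theory = ((n - 1) / n) ** 2 * 2 * tr_S2 / p ** 2
cov_theory = 2 * tr_S2 / (n ** 2 * p ** 2)

print(q1.mean(), mean_theory)
print(q1.var(ddof=1), var_theory)
print(np.cov(q1, q2)[0, 1], cov_theory)
```

The simulated moments should agree with the three theoretical values up to Monte Carlo error; a non-normal generator would add the $M_1$ contribution to the last two.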

Note that the moments in Lemma 1 reduce to those in Ahmad and Ahmed (2020) under normality when $\gamma_2 = 0 \Rightarrow M_1 = 0$. Further, these moments help us define the moments of the traces involved in the limit of $W_i$, and particularly $f$ given above, which in turn justifies the use of Assumption 3 involving the trace operator with Hadamard product. In this context, we exploit the equivalence of the limits of $A_i$ and $W_i$ and consider $\|d_i\|^2/\mathrm{tr}(\Sigma_0)$, where the scaling trace factor will make the terms involving the Hadamard product vanish under the assumption. Finally, this last point will further help us, using the covariance part in Lemma 1, obtain the multivariate limit of the vector of $\|d_i\|^2$. To approach this multivariate limit, write $A_i$ in (7) as

$$A_i = \frac{n}{n-1}\,a_i, \tag{8}$$

where $a_i = \|d_i\|^2/\mathrm{tr}(\Sigma_0)$, $i = 1, \ldots, n$. As the $a_i$ are correlated, we are essentially seeking the distribution of the vector $a = (a_1, \ldots, a_n)^T$. From Lemma 1, it follows that $E(a_i) = (n-1)/n$, and it holds without Model (4) or any assumption. Further, for $n \to \infty$ and fixed $p$, $E(a_i) \to 1$, $\mathrm{Var}(a_i) \to 2/f$ and $\mathrm{Cov}(a_i, a_j) \to 0$, without needing any assumptions. Now, when we let $n, p \to \infty$, the so-called high-dimensional setup, then, under Assumption 2 and its consequence, $f$ is uniformly bounded in $p$ (using the moments in Theorem 1 below) so that, under Assumptions 1–3, the convergence for $a_i$, and therefore also that of $A_i$, or the vector $A = (A_1, \ldots, A_n)$, may hold conveniently. For this, we use Lemma 1 for Eq. (8) and note, for $i \ne j$, $i, j = 1, \ldots, n$, that

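$$E(A_i) = 1,$$

$$\mathrm{Var}(A_i) = \frac{2}{f}\left(1 + \frac{M_1}{[\mathrm{tr}(\Sigma)]^2}\sum_{j=1}^{p}\nu_j^2\right),$$

$$\mathrm{Cov}(A_i, A_j) = \frac{2}{f}\cdot\frac{1}{(n-1)^2}\left(1 + \frac{M_1}{[\mathrm{tr}(\Sigma)]^2}\sum_{j=1}^{p}\nu_j^2\right),$$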

where $\nu_j$ are the eigenvalues of $\Sigma_0$; see the assumptions above. Now, under the consequence of Assumption 2 discussed above, $\lim_{p\to\infty}\sum_{j=1}^{p}\nu_j^2$ in $\mathrm{Var}(A_i)$ and $\mathrm{Cov}(A_i, A_j)$ is uniformly bounded, whereas the fractional term involving $M_1$ vanishes under Assumption 3. Note that this vanishing limit can also be obtained by replacing Assumption 3 with $\mathrm{tr}(A \circ A)/p^2 \to 0$ as $p \to \infty$. In that case, however, we would also need to assume a simultaneous rate of convergence of $p$ and $n$, i.e., $p/n \to c \in (0, \infty)$ as $n, p \to \infty$. Assumption 3, for which the denominator implies $\mathrm{tr}(A \otimes A) = \mathrm{tr}(\Sigma \otimes \Sigma) = [\mathrm{tr}(\Sigma)]^2$, helps us avoid any such $(n, p)$-relationships.

This argument implies that, even under Model (4), the moments of $A_i$, and later of $W_i$, behave similarly as under normality, so that a limit similar to that in Ahmad and Ahmed (2020) may be obtained. For the entire vector $A$, we can now write

$$E(A) = \mathbf{1}_n, \quad \mathrm{Cov}(A) = \frac{2}{f}\,I_n[1 + O(1)] + [(J_n - I_n)\,O(n^{-2})], \tag{9}$$

where $I_n$ is the identity matrix and $J_n = \mathbf{1}_n\mathbf{1}_n^T$ with $\mathbf{1}_n$ a vector of 1s, so that, using the limits for $A_i$,

$$\lim_{n,p\to\infty}\mathrm{Cov}(A) = \frac{2}{f}\,I_n[1 + o(1)], \tag{10}$$

under the assumptions. As $\mathrm{Cov}(A)$ is a diagonal (in fact, a spherical) matrix in the limit, the $A_i$ are asymptotically independent and a limit of $A$ follows by the central limit theorem. Note that, for such vectors with correlated elements, the essential requirement for a multivariate limit is that the covariances, $\mathrm{Cov}(A_i, A_j)$, or more precisely, the corresponding correlations, converge to the same fixed constant. This limit, in our case, is 0, making $\mathrm{Cov}(A)$ a diagonal matrix.

It follows from the above arguments that a limit of $W_i$ under Model (4), similar to (6), follows if consistent and efficient estimators of the traces involved in $f$ are defined for Model (4) under a high-dimensional asymptotic setup. The estimators in Ahmad and Ahmed (2020) are indeed non-parametrically defined and hence applicable under Model (4) as well. These estimators, of $\mathrm{tr}(\Sigma_0)$, $\mathrm{tr}(\Sigma_0^2)$ and $[\mathrm{tr}(\Sigma_0)]^2$, respectively, are defined as

$$E_1 = \mathrm{tr}(\hat\Sigma_0), \tag{11}$$

$$E_2 = \eta\{(n-1)(n-2)\,\mathrm{tr}(\hat\Sigma_0^2) + [\mathrm{tr}(\hat\Sigma_0)]^2 - nQ\}, \tag{12}$$

$$E_3 = \eta\{2\,\mathrm{tr}(\hat\Sigma_0^2) + (n^2 - 3n + 1)[\mathrm{tr}(\hat\Sigma_0)]^2 - nQ\}, \tag{13}$$

where $\eta = (n-1)/[n(n-2)(n-3)]$ and $Q = \sum_{i=1}^{n} q_i^2/(n-1)$ with $q_i = \|d_i\|^2$. For an equivalent U-statistics formulation of the estimators, justifying their non-parametric nature, see Sect. 8. Thus, to use them in the present context of robustness, we need the moments of these estimators under Model (4). They are given in the following theorem, proved in Sect. 8, which reduces to Theorem 3 in Ahmad and Ahmed (2020) under normality.
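The estimators in Eqs. (11)–(13) can be sketched as follows. The example assumes $d_i = (X_i - \bar{X})/\sqrt{p}$ and a sample covariance with divisor $n-1$, and it uses the $n \times n$ Gram matrix of the $d_i$ so that no $p \times p$ matrix is needed; the function name `trace_estimators` and the simulated data are assumptions of this example.

```python
import numpy as np

def trace_estimators(X):
    """Return (E1, E2, E3) of Eqs. (11)-(13) for an n x p data matrix X.

    Assumptions of this sketch: d_i = (X_i - Xbar)/sqrt(p), and Sigma_hat
    is the sample covariance with divisor n - 1. Requires n >= 4.
    """
    n, p = X.shape
    D = (X - X.mean(axis=0)) / np.sqrt(p)
    G = D @ D.T                           # n x n Gram matrix of the d_i
    q = np.diag(G)                        # q_i = ||d_i||^2
    tr1 = q.sum() / (n - 1)               # tr(Sigma_hat_0)
    tr2 = np.sum(G ** 2) / (n - 1) ** 2   # tr(Sigma_hat_0^2), since tr(G^2) = sum of G_ij^2
    Q = np.sum(q ** 2) / (n - 1)
    eta = (n - 1) / (n * (n - 2) * (n - 3))
    E1 = tr1
    E2 = eta * ((n - 1) * (n - 2) * tr2 + tr1 ** 2 - n * Q)
    E3 = eta * (2 * tr2 + (n ** 2 - 3 * n + 1) * tr1 ** 2 - n * Q)
    return E1, E2, E3

rng = np.random.default_rng(2)
n, p = 10, 200
X = rng.normal(size=(n, p))               # Sigma = I, so f = p in this example
E1, E2, E3 = trace_estimators(X)
f_hat = E3 / E2
print(E1, E2, E3, f_hat)
```

For $\Sigma = I$ the targets are $\mathrm{tr}(\Sigma_0) = 1$, $\mathrm{tr}(\Sigma_0^2) = 1/p$ and $[\mathrm{tr}(\Sigma_0)]^2 = 1$; a single sample gives noisy values, while averages over repeated samples should settle near these targets.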

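Theorem 1 The estimators $E_1$, $E_2$, and $E_3$, defined in Eqs. (11)–(13), are unbiased for $\mathrm{tr}(\Sigma_0)$, $\mathrm{tr}(\Sigma_0^2)$ and $[\mathrm{tr}(\Sigma_0)]^2$, respectively, with

$$\mathrm{Var}(E_1) = \frac{2}{n-1}\,\mathrm{tr}(\Sigma_0^2) + M_1,$$

$$\mathrm{Var}(E_2) = \frac{4}{P(n)}\Big[a(n)\,\mathrm{tr}(\Sigma_0^4) + b(n)[\mathrm{tr}(\Sigma_0^2)]^2 + 2c(n)M_1 + d(n)\{6M_2 + M_3\} - 2e(n)M_4\Big],$$

$$\mathrm{Var}(E_3) = \frac{4}{P(n)}\Big[4\,\mathrm{tr}(\Sigma_0^4) + f(n)[\mathrm{tr}(\Sigma_0^2)]^2 + g(n)\,\mathrm{tr}(\Sigma_0^2)[\mathrm{tr}(\Sigma_0)]^2 + d(n)M_1^2 + h(n)M_1\,\mathrm{tr}(\Sigma_0^2) + k(n)M_1[\mathrm{tr}(\Sigma_0)]^2 - 2e(n)[M_5 + M_4]\Big],$$

where $a(n) = 2n^3 - 12n^2 + 21n - 5$, $b(n) = n^2 - 6n + 11$, $c(n) = (n-1)(n-3)^2$, $d(n) = (n-2)(n-3)/2$, $e(n) = n - 3$, $f(n) = n^2 - 6n + 10$, $g(n) = (n-2)(n-3)(2n-3)$, $h(n) = (n-3)(2n-5)$, $k(n) = (n-1)(n-2)(n-3)$, $M_1 = \gamma_1\,\mathrm{tr}(A \circ A)$, $M_2 = \gamma_1\,\mathrm{tr}(A^2 \circ A^2)$, $M_3 = \gamma_1^2\,\mathrm{tr}(A \circ A)^2$, $M_4 = \gamma_2^2\,\mathrm{tr}[(A \circ A)A^2]$, $M_5 = \gamma_2^2\,\mathrm{tr}(A \circ A \circ A^2)$. Further, $\mathrm{Var}(E_i)$ and likewise $\mathrm{Cov}(E_i, E_j)$ are uniformly bounded by $O(1/n)$, $i, j = 1, 2, 3$, $i \ne j$.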

From Theorem 1, the variances and covariances are uniformly bounded in $p$, where the bounds depend only on $n$. This important consequence will help us arrive at the limit of the test statistic conveniently, which in turn ensures $\hat f = E_3/E_2$ as a consistent estimator of $f$, implying a consistent estimator of the variance of the test statistic, i.e., $2/\hat f$. In summary, we have the following theorem, the proof of which is sketched in Sect. 9. Note that, following the arguments around Eqs. (6) and (8), using the consistency of $\mathrm{tr}(\hat\Sigma_0)$, it immediately follows that $E(W_i) = 1 + o_P(1) \to 1$ for $n, p \to \infty$, the same as under normality.

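Theorem 2 Given $W_i$ in Eq. (5), Model (4) and Assumptions 2–3. Then, as $n, p \to \infty$,

$$\frac{W_i - E(W_i)}{\hat\sigma_{W_i}} \xrightarrow{D} N(0, 1),$$

where $E(W_i) = 1$, $\hat\sigma^2_{W_i} = 2/\hat f$ with $\hat f = E_3/E_2$ a consistent estimator of $f = [\mathrm{tr}(\Sigma_0)]^2/\mathrm{tr}(\Sigma_0^2)$.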
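One way Theorem 2 might be turned into a chart is sketched below: each $W_i$ is standardized by its limiting mean 1 and estimated variance $2/\hat f$, and compared with an upper normal quantile. The centring of $d_i$ at the sample mean, the divisor $n-1$, the one-sided limit, the level `alpha = 0.0027`, and the function name `chart_statistics` are all assumptions of this example rather than prescriptions from the text.

```python
import numpy as np
from statistics import NormalDist

def chart_statistics(X, alpha=0.0027):
    """Standardized W_i, an upper control limit, and f_hat for an n x p matrix X.

    Assumptions of this sketch: d_i = (X_i - Xbar)/sqrt(p), Sigma_hat has
    divisor n - 1, and a one-sided upper limit at level alpha is used.
    """
    n, p = X.shape
    D = (X - X.mean(axis=0)) / np.sqrt(p)
    G = D @ D.T                                  # Gram matrix of the d_i
    q = np.diag(G)                               # ||d_i||^2
    tr1 = q.sum() / (n - 1)                      # E1 = tr(Sigma_hat_0)
    tr2 = np.sum(G ** 2) / (n - 1) ** 2          # tr(Sigma_hat_0^2)
    Q = np.sum(q ** 2) / (n - 1)
    eta = (n - 1) / (n * (n - 2) * (n - 3))
    E2 = eta * ((n - 1) * (n - 2) * tr2 + tr1 ** 2 - n * Q)
    E3 = eta * (2 * tr2 + (n ** 2 - 3 * n + 1) * tr1 ** 2 - n * Q)
    if E2 <= 0 or E3 <= 0:
        raise ValueError("trace estimates must be positive to form f_hat")
    f_hat = E3 / E2
    W = (n / (n - 1)) * q / tr1                  # Eq. (5)
    Z = (W - 1.0) / np.sqrt(2.0 / f_hat)         # standardized as in Theorem 2
    ucl = NormalDist().inv_cdf(1.0 - alpha)      # upper control limit on the Z scale
    return Z, ucl, f_hat

rng = np.random.default_rng(4)
X = rng.normal(size=(30, 300))
Z, ucl, f_hat = chart_statistics(X)
print(f_hat, ucl, np.flatnonzero(Z > ucl))       # indices of flagged observations
```

The guard on the signs of $E_2$ and $E_3$ is included because the unbiased estimators are not constrained to be positive in small samples.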

Although Theorem 2 deals with a univariate limit, it follows from the moments in Eq. (9) and the arguments around it that the multivariate limit of the vector $W = (W_1, \ldots, W_n)$ can also be conveniently obtained, through a similar limit of $A = (A_1, \ldots, A_n)$, so that the required limit in Theorem 2 follows simply as a marginal projection. In fact, $\mathrm{Cov}(A)$, for $n, p \to \infty$, has the limit $[2/f]I[1 + o(1)]$ given in Eq. (10), which is shared by $a$ since $A_i = [n/(n-1)]a_i$. With $\mathrm{Cov}(A_i, A_j) \to 0$, making the $A_i$ asymptotically independent, and the variances uniformly bounded in $p$, the limit of $a$, and eventually of $A$, follows as

$$\sqrt{f/2}\,(A - E(A)) \xrightarrow{D} N_n(0, I),$$

as $n, p \to \infty$. Likewise, the limit of $W$ follows by replacing $f$ with its $(n, p)$-consistent estimator, $E_3/E_2$. Theorem 2 extends the use of the modified $T^2$ statistic for statistical control to a general model covering normality as a special case.

A very similar approach, with precisely the same limit, holds for the phase II chart of a future observation and also for both types of charts for subgroup-means. In fact, as shown in Ahmad and Ahmed (2020), the convergence to the limit in the case of subgroup-means is relatively better because the statistics are composed of averages. To avoid repetition, we shall not discuss these cases here, but they can be approached following the same steps as above.

In the document: (ICSA Book Series in Statistics) Wenqing He, Liqun Wang, Jiahua Chen, Chunfang Devon Lin - Advances and Innovations in Statistics and Data Science - Springer (2022) (pages 136-141)