• Tidak ada hasil yang ditemukan

Cut-off point of linear discriminant rule for large dimension.

N/A
N/A
Protected

Academic year: 2024

Membagikan "Cut-off point of linear discriminant rule for large dimension."

Copied!
11
0
0

Teks penuh

(1)

Cut-off point of linear discriminant rule for large dimension

Takayuki Yamada1, Tetsuto Himeno2and Tetsuro Sakurai3

1 General Studies, College of Engineering, Nihon University,

1 Nakagawara, Tokusada, Tamuramachi, Koriyama, Fukushima 963-8642, Japan

2Department of Computer and Information Science, Faculty of Science and Technology, Seikei University,

3-3-1 Kichijoji-Kitamachi, Musashino-shi, Tokyo 180-8633, Japan

3Center of General Education, Tokyo University of Science, Suwa,

5000-1, Toyohira, Chino-shi, Nagano, 391-0292, Japan

Abstract

This paper is concerned with the problem of classifying a observation vector into one of two pop- ulations Π1:Np(µ1,Σ) and Π2:Np(µ2,Σ). Anderson (1973, Ann. Statist.) gave an asymptotic expansion of the studentized statistic, and derived cut-off point to achieve a specified probability of misclassification. But the dimensionpgets large, the precision becomes worse. So in this paper, we proposed studentized statistic in terms of (n, p) asymptotic. An asymptotic expansion of the statis- tic is derived up to the orderO1, whereO1is a term with respect to{p1/2, N11/2, N21/2, m1/2} for each sample sizeNi andm=N1+N22−p. Using the expansion, we gave cut-off point to achieve a specified probability of misclassification.

Keywords: Linear discriminant rule, cut-off point, (n, p) asymptotic AMS 2010 subject classification: Primary 62H30, Secondary 62E20

1 Introduction

This paper is concerned with the problem of classifying a observation vector into one of two populations Π1 :Np(µ1,Σ) and Π2 :Np(µ2,Σ). The observation x is classified as coming from either Π1 or Π2 based on the samples

xi1, . . . ,xiNi ∼Np(µi,Σ) (i= 1,2),

which are independent. For this problem, linear discriminant analysis is used. Let W = (¯x1x¯2)S1

{ x1

2(¯x1+ ¯x2) }

,

where ¯x1, ¯x2andS are the sample mean vectors and the pooled sample covariance matrix defined by

¯ xi= 1

Ni Ni

j=1

xij, i= 1,2, S= 1 n

2

i=1 Ni

j=1

(xijx¯i)(xijx¯i), n=N−2 =N1+N22.

Linear discriminant rule classifiesxas Π1ifW(x)> cand Π2ifW(x)< cfor a constantc. Inference concerning linear discriminant analysis is studied under large sample asymptotic framework A0:

A0 :N1→ ∞, N2→ ∞, N1/N2→c∈(0,∞).

1E-mail address: [email protected]

2E-mail address: [email protected]

3E-mail address: [email protected]

(2)

For a review of results under A0, see, e.g., Fujikoshi et al. [4]. When p becomes large, accuracy of results proposed using A0 gets worse. As a way to improve the poorness, it is studied under the asymptotic framework A1:

A1 :p→ ∞, N1→ ∞, N2→ ∞, p/N1→γ1(0,1), p/N2→γ2(0,1) and N1/N2→γ∈(0,∞).

As results under A1, Fujikoshi and Seo [3] gave an asymptotic approximation of the probabilities of misclassification, Fujikoshi [2] gave its error bound. These results are reviewed in Fujikoshi et al. [4].

Following Lachenbruch [5], forxΠi, W = (¯x1x¯2)S1

{ x1

2(¯x1+ ¯x2) }

=V1/2Zi+ (1)iUi, (1) where

V = (¯x1x¯2)S1ΣS1x1x¯2), Zi=V1/2x1x¯2)S1(xµi), Ui= (1)i+1x1x¯2)S1xiµi)1

2D2,

and D2 is the squared sample Maharanobius distance defined by D2 = (¯x1 x¯2)S1x1x¯2).

From the normality of x, Zi is distributed as the standard normal distribution, which we denote it as Zi N(0,1). Since it does not depend on {x¯1,x¯2,S}, Zi is independent to the set, and so Zi

is independent from {Ui, V}. The limiting distribution of W under the asymptotic framework A1 is normal with meanui,0= (1)ilimA1E[Ui] and variancev0= limA1{E[V] + Var(Ui)}ifx∈Np(µi,Σ).

Under the assumption that the Mahalanobis distance ∆ =

(µ1µ2)Σ1(µ1µ2) converges to a positive constant under A1, Fujikoshi and Seo [3] and Fujikoshi [2] showed that Var(Ui)0. So, we can abbreviate asv0= limA1E[V].

One may want to determine the cut-off point c to adjust the probabilities of misclassification.

Results under A0 is written in Anderson [1]. On the other hand, from Fujikoshi [2], under the as- sumption thatxΠi, the limiting distribution (W −ui)/√

v isN(0,1), whereui andv are constants such that limA1(ui−ui,0) = 0 and limA1(v−v0) = 0. Using this result, we find that the approxi- mation of the misclassification probability that xis allocated to Πj even thoughx Πi is given as Φ((1)i1(c−ui)/√

v) fori, j = 1,2 with = j, where Φ(.) is the cumulative distribution function of the standard normal distribution. Since ui and v contain ∆2, which need to be estimated. The unbiased estimator is given as

∆c2=n−p−1

nx1x¯2)S1x1x¯2) pN N1N2

.

Consistency under the asymptotic framework A1 holds. One can choose c from the fact that the limiting distribution of (W−ubi)/√

ˆ

visN(0,1) ifx∈Np(µi,Σ), where (ubi,v) is (uˆ i, v) with replacing

2 by∆c2.

This paper is organized as follows. In Section 2, we give an asymptotic expansion of the (W−ubi)/√ ˆ v for the case thatxΠi. We propose a cut-off point which the misclassification probability becomes presetting level, asymptotically. Section 3 presents simulation results for misclassification probability.

Proof of lemma and derivation of expectations are given in Appendix.

2 Asymptotic expansion under A1

For technical reason, set

ui= n 2(m−1)

{

(1)i+12+ ( p

N2 p N1

)}

, v= n2(n+ 1)

(m−1)(m+ 1)(m+ 2) (

2+ N p N1N2

) ,

(3)

where m =n−p. Note that ui = (1)iE[Ui], but v equals E[V], asymptotically, under A1. Then limA1ui=ui,0 and limA1v=v0. Using unbiased estimator of (ui, v), we have

P (

W√−cu1

ˆ v < x

xΠ1

)

=E [

Φ (

ˆ

vx+√U1+cu1

V

)]

,

P (

W−cu2

√vˆ > x xΠ2

)

=E [

Φ (−√

ˆ

vx+U2−uc2

√V

)]

. Let

u1= ( 1

N1 + 1 N2

)1/2

Σ1/2x1x¯2), u2= 1

√NΣ1/2(N1x¯1+N2x¯2−N1µ1−N2µ2), B=Σ1/2SΣ1/2,

where N = N1 +N2. Then u1, u2 and B are independent. In addition, u1 Np((1/N1 + 1/N2)1/2δ,Ip) and u2 Np(0,Ip), where δ = Σ1/2(µ1 µ2). It also holds that nB is dis- tributed as a Wishart distribution with degrees of freedomn=N−2 and covariance matrixIp, which is denoted asWp(n,Ip). Substituting them,

Ui=(1)i+1 2

( p N2 p

N1

)u1B1u1

p +(1)i+1p

√N1N2

u1B1u2

p −τi

δB1u1

√p , V = N p

N1N2

u1B2u1

p , τi=

pN3/2+(1)i+1/2

N N3/2(1)i+1/2

. In addition,

∆c2= N p N1N2

{m−1 n

u1B1u1 p 1

} . Then,

b

ui= (1)i+1 n 2(m−1)

{ N p N1N2

m−1 n

u1B1u1

p N p

N1N2 + (1)i+1 ( p

N2 p N1

)}

, ˆ

v= n(n+ 1) (m+ 1)(m+ 2)

N p N1N2

u1B1u1

p .

The following lemma gives that these random variables can be expressed as functions of the inde- pendent standard normal and chi-squared variables, simultaneously.

Lemma 1. Let v1∼Np(δ,Ip),v2∼Np(0,Ip),A∼Wp(n,Ip), andv1,v2 andAare independent.

Then the following equalities in distribution hold, simultaneously:

S δA1v1=DY1

(

Z1+ ∆

Y2

Y3

Z2

) ,

T v2A1v1=D

√ 1 Y12

( 1 + Y2

Y3

)

{(Z1+ ∆)2+Z22+Y4}Z3 U v1A1v1=D 1

Y1{(Z1+ ∆)2+Z22+Y4}, V v1A2v1=D 1

Y12 (

1 +Y2

Y3

)

{(Z1+ ∆)2+Z22+Y4},

(4)

where ∆ =

δδ; Z1, Z2, Z3, Y1, . . . , Y4 are independent, Zi N(0,1), i = 1,2,3, Yi χ2f

i, i = 1, . . . ,4,

f1=n−p+ 1, f2=p−1, f3=n−p+ 2, f4=p−2.

Note that Fujikoshi and Seo [3] has also given similar results, but their results are individual ones, so cannot treat simultaneously. The proof of Lemma 1 is given in Appendix.

It can be described that (1)i+1

b

vx+Ui(1)iubi

= (1)i+1

n(n+ 1) (m+ 1)(m+ 2)ω1

Q1

p x−(1)i+1 2

( p N2 p

N1 )Q1

p +(1)i+1p

√N1N2

B2

p −τi

B1

√p +ω2

2 Q1

p n

2(m−1)ω2+(1)i+1 2

n m−1

( p N2 p

N1 )

, (2)

V =ω2Q2

p , (3)

whereQ1=u1B1u1,Q2=u1B2u1,B1=δB1u1andB2=u2B1u1,ω2=N1N2/(N p). From Lemma 1, we have

Q1

p

=D n f1

1 1 +√

2/f1W1

S, B1

√p

=D n f1

∆ 1 +√

2/f1W1

( Z1

√p+ω

f2

f3

√TZ2

√p )

, B2

p

=D n f1

1 1 +√

2/f1W1

√(

1 +f2 f3

T )

SZ3

√p, Q2

p

=D n2 f12

1 (1 +√

2/f1W1)2 (

1 +f2

f3T )

S, whereWi=√

fi/2(Yi/fi1) fori= 1, . . . ,4, S=

(Z1

√p+ω

)2

+ (Z2

√p )2

+p−2 p

( 1 +

√2 f4

W4

) , T =1 +√

2/f2W2

1 +√ 2/f3W3

.

SortingS in descending order, it can be expressed as

S=s0+S1/2+S1+O3/2, where

s0= 1 +ω22, S1/2= 2ω

√p Z1+

√2 f4W4, S1= Z12

p +Z22 p 2

p,

andOj/2is a term ofj-th order with respect to{p1/2, N11/2, N21/2, m1/2}. By Maclaurin expansion of (1 +√

2/f3W3)1 up to the term with order off31, T =

( 1 +

√2 f2

W2

) ( 1

√2 f3

W3+ 2 f3

W32 )

+O3/2,

(5)

which can be sorted in descending order as

T = 1 +T1/2+T1+O3/2

with

T1/2=

√2 f2W2

√2 f3W3, T1= 2

f3

W32 2

√f2f3

W2W3. Doing Maclaurin expansion of (1 +√

2/f1W1)1 in Q1/pup to the term with order off11, and sort it in descending order, it is written as

Q1

p =q1,0+Q1,1/2+Q1,1+O3/2, where

q1,0= n f1s0, Q1,1/2= n

f1

( S1/2

√2 f1

s0W1 )

, Q1,1= n

f1

( S1

√2 f1

S1/2W1+ 2 f1

s0W12 )

, and using this expansion,

Q1

p =√q1,0

{ 1 + 1

2q1,0(Q1,1/2+Q1,1) 1

8q1,02 Q21,1/2 }

+O3/2. Using similar way, we can expressQ2/pas

Q2

p =q2,0+Q2,1/2+Q2,1+O3/2, where

q2,0= (n

f1

)2( 1 +f2

f3

) s0, Q2,1/2=

(n f1

)2[{(

1 + f2 f3

)

S1/2+f2 f3

s0T1/2 }

2

√2 f1

( 1 +f2

f3

) s0W1

] , Q2,1=

(n f1

)2[{(

1 + f2

f3 )

S1+f2

f3S1/2T1/2+f2

f3s0T1 }

2

√2 f1

{(

1 + f2

f3 )

S1/2+f2

f3s0T1/2 }

W1 +6

f1 (

1 + f2

f3 )

s0W12 ]

. In addition, it is also expanded that

B1

√p =b1,0+B1,1/2+B1,1+O3/2, where

b1,0= n f1ω2, B1,1/2= n

f1

{(

Z1

√p−

f2 f3

Z2

√p )

√2 f1

ω2W1 }

,

B1,1= n f1

{

∆ 2

f2

f3 Z2

√pT1/2

√2 f1

( Z1

√p−

f2

f3 Z2

√p )

W1+ 2

f1ω2W12 }

.

(6)

Sorting in descending order, it can be described that (

1 + f2

f3T )

S=s0

( 1 +f2

f3 )

(1 + ˜T1/2+ ˜S1/2) +O1, where ˜S1/2=S1/2/s0 and ˜T1/2={(f2/f3)/(1 +f2/f3)}T1/2, and so

√(

1 +f2 f3

T )

S=

s0

( 1 + f2

f3

) ( 1 + 1

2( ˜T1/2+ ˜S1/2) )

+O1. Substituting this expansion intoB2/p, and using Maclaurin expansion of (1 +√

2/f1W1)1 up to the term with order off11/2, we have

B2

p =B2,1/2+B2,1+O3/2, where

B2,1/2= n f1

√(

1 + f2 f3

) s0

Z3

√p,

B2,1= n f1

√(

1 +f2

f3 )

s0

{1

2( ˜T1/2+ ˜S1/2)

√2 f1W1

} Z3

√p.

Substituting these expansions in (2) and (3), and coordinating it in order, (2) and (3) can be represented as the sum of terms with descending order, respectively, which are

(1)i+1 ˆ

vx+Ui(1)iubi= (1)i+1

n2(n+ 1)

(m+ 1)2(m+ 2)s0ω1x+Ui,1/2+Ui,1+O3/2, V =ω2 n2(n+ 1)

(m+ 1)2(m+ 2)s0+ω2Q2,1/2+ω2Q2,1+O3/2, where

Ui,1/2=AiQ1,1/2−τiB1,1/2+(1)i+1p

√N1N2

B2,1/2, Ui,1=AiQ1,1(1)i+1

n(n+ 1) (m+ 1)(m+ 2)

ω1x 8q3/21,0

Q21,1/2+(1)i+1p

√N1N2

B2,1−τiB1,1

+ n

(m−1)(m+ 1) [

(1)i+1 ( p

N2 p N1

)

−ω2 ]

, Ai= (1)i+1

n(n+ 1) (m+ 1)(m+ 2)

ω1

2√q1,0x−(1)i+1 2

( p N2 p

N1 )

+ω2 2 . These expansions lead that

Ri= (1)i+1 ˆ

vx+√Ui+ (1)i+1ubi

V = (1)i+1x+Ri,1/2+Ri,1+O3/2,

where

Ri,1/2=1

2(1)i+1xQ˜2,1/2+ ˜Ui,1/2, Ri,1= (1)i+1x

(3 8

Q˜22,1/21 2

Q˜2,1

)

1 2

U˜i,1/2Q˜2,1/2+ ˜Ui,1

(7)

with

U˜i,j/2=Ui,j/2 / {√

n2(n+ 1)

(m+ 1)2(m+ 2)s0ω1 }

, Q˜i,j/2=Qi,j/2

/ { n2(n+ 1) (m+ 1)2(m+ 2)s0

} . By Taylor expansion,

Φ(Ri) = Φ((1)i+1x) +ϕ((1)i+1x)[

Ri,1/2+Ri,1

](1)i+1x

2 ϕ((1)i+1x)R2i,1/2+O3/2. SinceRi,1/2 is represented as the linear combination of{Z1, Z2, Z3, W1, . . . , W4}, E[Ri,1/2] = 0. As a result, we give the following theorem.

Theorem 1. Let

˜

xi=x−{

(1)i+1E[R\i,1]−x

2E[R\2i,1/2] }

, whereE[R\ki,j] isE[Rki,j] with replacing2 by ∆c2. Fori= 1,2,

P (

(1)i+1W−ubi

√vˆ <(1)i+1x˜i

xΠi

)

= Φ((1)i+1x) +O3/2. Explicit formula ofE[Ri,1] andE[R2i,1/2] can be derived, which are given in Appendix.

Based on the expansion, we set cutoff pointci as ci=

ˆ v [

(1)i+1zα {

(1)i+1E[R\i,1](1)i+1zα

2 E[R\2i,1/2] }]

+ ˆui (i= 1,2),

wherezαis theαpercentile point of the standard normal distribution. The cutoff pointc1(c2) makes the desired misclassification probability to be α within the error O3/2. The other misclassification probabilities can be described as

P(W > c1|xΠ2) =E [

Φ

((c1cu2) +U2−uc2

√V

)]

, P(W < c2|xΠ1) =E

[ Φ

((c2cu1) +U1+cu1

√V

)]

, respectively. Note that

ˆ

v/V p 1, U1+ ˆu1

p 0, U2−uˆ2

p 0, E[Ri,j/2] =Oj/2 (i, j= 1,2).

From the expansion, we have c

u1cu2= n m−1

{ N p N1N2

m−1 n

u1B1u1

p N p N1N2

}

=ω2(1 +ω22) n

m+ 1 n

m−1ω2+O1/2. It will be found that

c

u1cu2 n

m2p 0 under the asymptotic framework A1. In addition,

ˆ v− n3

m3ω2(1 +ω22)p 0.

Combining these results,

limA1 P(W > c1|xΠ2) = lim

A1P(W < c2|xΠ1) = Φ (

z1αlim

A1

m n

2

2+ω2 )

.

(8)

A Proof of Lemma 1

Proof of Lemma1. LetΓbe a orthogonal matrix of orderpwhich the first row is proportional toδ, and letB=ΓAΓ andwi =Γvi,i= 1,2. ThenB∼Wp(n,Ip),w1∼Np(∆e1,Ip),w2∼Np(0,Ip) andw1,w2 andB are independent;

S= (Γδ)(ΓAΓ)1(Γv1)= ∆eD 1B1w1, T= (Γv2)(ΓAΓ)1(Γv1)=Dw2B1w1=D

w1B2w1Z U = (Γv1)(ΓAΓ)1(Γv1)=Dw1B1w1,

V = (Γv1)(ΓAΓ)2(Γv1)=D w1B2w1,

where ei denotes fundamental vector with 1 in i-th position, Z N(0,1), and Z and {B,w1} are independent. By using reflection matrix(Householder matrix)Hbetweene1 and (1/

w1w1)w1, S = ∆D

w1w1(He1)(HBH)1{H(1/

w1w1)w1}= ∆w1(HBH)1e1. Besides,

U =D w1B1w1=w1w1·e1(HBH)1e1, V =D w1B2w1=w1w1·e1(HBH)2e1. Givenw1, CHBH ∼Wp(n,Ip), soC andw1 are independent. Partition

C=

(c11 c21 c21 C22

)

and w1= (w11

w21

) . It can be expressed that

S= ∆wD 1C1e1= ∆ c11·2

(w11w21C221c21), wherec11·2=c11c21C221c21. In addition,

U =D w1w1·e1C1e1= w1w1 c11·2

, V =D w1w1·e1C2e1= w1w1

c211·2 (1 +c21C221c21).

It is noted thatxC221/2c21∼Np1(0,Ip),DC22∼Wp1(n,Ip1), andxandD are indepen- dent, thusw11,w21,x,D andc11·2 are independent. Using these results, we have

S =Dc11·2

(w11w21D1/2x) and V =D w11w11

c211·2 (1 +xD1x).

LetG be orthogonal matrix of order p−1 which the first row is proportional to xD1/2. Givenx and D, y Gw21 Np1(0,Ip1), and it is found that w11, c11·2, x, D and y are independent.

Partitioningy= (y1y2), we have S=D

c11·2{w11(Gw21)(GD1/2x)}=Dc11·2

(w11

xD1xy1), U =D w112 + (Gw21)(Gw21)

c11·2

=D 1 c11·2

(w211+y12+y2y2), V =D w211+ (Gw21)(Gw21)

c211·2 (1 +xD1x)=D 1

c211·2(1 +xD1x)(w211+y21+y2y2).

These show the conclusion of the lemma.

(9)

B Expectations

Firstly, we calculateE[R2i,1/2]. It is that E[R2i,1/2] = x2

4 E[ ˜Q22,1/2] +E[ ˜Ui,1/22 ](1)i+1xE[ ˜Q2,1/2·U˜i,1/2].

SinceS1/2⊥⊥T1/2⊥⊥W1, E[Q22,1/2] = n4

f14 [(

1 + f2

f3 )2

E[S1/22 ] + (f2

f3 )2

s20E[T1/22 ] + 4· 2 f1

( 1 + f2

f3 )2

s20E[W12] ]

. Noting that

S1/22 = 4ω22

p Z12+ 2 f4

W42+ 22ω

√p

√2 f4

Z1W4, it holds that

E[S1/22 ] = 4ω22 p + 2

f4. It also holds that

E[T1/22 ] = 2 f2

E[W22] + 2 f3

E[W32] = 2 f2

+ 2 f3

. Thus,

E[Q22,1/2] = n4 f14

[(

1 +f2

f3 )2(

4ω22 p + 2

f4 )

+ (f2

f3 )2

(1 +ω22)2 (2

f2 + 2 f3

)

+8 f1

( 1 + f2

f3

)2

(1 +ω22)2 ]

. For evaluatingE[Ui,1/22 ], note that

Q21,1/2=n2 f12

(

S1/22 + 2

f1s20W122

√2

f1s0S1/2W1

) . Thus,

E[Q21,1/2] = n2 f12

[(4ω22 p + 2

f4

) + 2

f1

s20 ]

= n2 f12

[2 f4

+ 2 f1

+ (4

p+ 2 f1

) ω22

] . Moreover, the following equalities hold.

E[B22,1/2] = n2 f12s0

( 1 +f2

f3 )1

p =n2 f12

( 1 + f2

f3 )

(1 +ω22)1 p, E[B21,1/2] = n2

f12 {(1

p+f2 f3

1 p

)

ω2+ 2 f1

ω24 }

. From independence,

E[B1,1/2·B2,1/2] =E[B1,1/2]·E[B2,1/2] = 0, E[Q1,1/2·B2,1/2] =E[Q1,1/2]·E[B2,1/2] = 0.

Referensi

Dokumen terkait