

6.2 Learning Strength and Weakness Rules in the Presence of External Factors

In this section, we propose a computational method that identifies the strength and weakness rules of an individual player in the presence of external features. In Section 4.2, the rule learner (the CA method) finds directions in an unconstrained manner: CA's objective is to minimize the sum of squared $\chi^2$ distances under no constraints. In the presence of external features, the objective is to minimize this $\chi^2$ distance along weighted linear combinations of the external features. By doing so, the method not only captures the dependency between batting and bowling features as given in Definitions 4.1 to 4.5, but also constrains the search space to weighted linear combinations of the external features as given in Definitions 6.1 to 6.5. Redundancy Analysis (RDA) [144] and Canonical Correspondence Analysis (CCA) [28] are the two candidate methods for this task. RDA is an extension of PCA that includes external features; similarly, CCA is an extension of CA that includes external features. As we have previously ruled out PCA for our analysis (batting features and bowling features are discrete random variables), we cannot use RDA. CCA is therefore employed to incorporate external features and refine rule learning. CCA projects the data onto a subspace defined by the external features and then performs CA. The steps of CCA are presented separately for batting analysis and bowling analysis in the following sections.
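For reference, the $\chi^2$ distance and the total inertia that CA works with can be written out as below. This is the standard CA formulation rather than anything specific to this chapter; it borrows the symbols of Algorithm 8 ($p_{ij}$ for the entries of the correspondence matrix $P$, $r_i$ for row masses, $c_j$ for column masses, $n$ for the matrix sum), and the name $d_{\chi^2}$ is introduced here only for illustration.

```latex
% Chi-square distance between the profiles of rows i and i'
d_{\chi^2}^2(i, i') = \sum_{j=1}^{J} \frac{1}{c_j}
    \left( \frac{p_{ij}}{r_i} - \frac{p_{i'j}}{r_{i'}} \right)^2

% Total inertia: mass-weighted squared chi-square distances of the
% row profiles to the average profile c
\sum_{i=1}^{I} r_i \sum_{j=1}^{J} \frac{1}{c_j}
    \left( \frac{p_{ij}}{r_i} - c_j \right)^2
  = \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{(p_{ij} - r_i c_j)^2}{r_i c_j}
  = \frac{\chi^2}{n}
```

Unconstrained CA decomposes all of this inertia, whereas CCA decomposes only the portion that lies in the space spanned by the external features.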

6.2.1 Batting Analysis through CCA

Refer to Figure 6.1 for the steps involved in batting analysis through CCA. For batting analysis, the relationships between batting features and external features are obtained through the bowling features. To achieve this, the TCM of bowling features × batting features (denoted as $N$) and the ECM of bowling features × external features (denoted as $X$) are constructed. CCA first obtains the residual matrix $A$ from $N$. The projection matrix $Q$ is obtained from $X$. Then $A$ is projected onto $Q$ to get the projected matrix $A^* = QA$, which is the part of $A$ that can be explained by the external features. SVD is applied to $A^*$ to obtain the bowling principal components ($F$) and the batting principal components ($G$)

Figure 6.1: Batting Analysis through CCA. [Flowchart: the input ECM ($X_{n \times e}$, bowling features × external features) and TCM ($N_{n \times m}$, bowling features × batting features) yield the projection matrix $Q_{n \times n}$ and the residual matrix $A_{n \times m}$; the projected matrix $A^* = QA$ is decomposed by SVD into $U$, $\Sigma$, and $V^T$; this gives the row PCs $F = U\Sigma$ ($n \times r$, bowling features), the column PCs $G = V\Sigma$ ($m \times r$, batting features), and the regression coefficients $E$ of the CCA dimensions on the external features; the first two dimensions of each ($F_{n \times 2}$, $G_{m \times 2}$, $E_{e \times 2}$) are retained.]

Algorithm 8 CCA Algorithm (Batting Analysis)

Require: $TCM_{bowl \times bat}$ ($N_{I \times J}$) and $ECM_{bowl \times ext}$ ($X_{I \times M}$)
1: Matrix sum: $n = \sum_{i=1}^{I} \sum_{j=1}^{J} N_{ij}$
2: Row masses ($r$): $r_i = N_{i.} / n$, $i = 1, 2, \dots, I$
3: Diagonal matrix: $D_r = \mathrm{diag}(r_1, r_2, \dots, r_I)$
4: Column masses ($c$): $c_j = N_{.j} / n$, $j = 1, 2, \dots, J$
5: Diagonal matrix: $D_c = \mathrm{diag}(c_1, c_2, \dots, c_J)$
6: Correspondence matrix: $P = \frac{1}{n} N$
7: Standardized residuals: $A = D_r^{-1/2} (P - r c^T) D_c^{-1/2}$
8: $I \times I$ projection matrix: $Q = D_r^{1/2} X (X^T D_r X)^{-1} X^T D_r^{1/2}$
9: Project $A$ onto $Q$ to obtain the constrained correspondence matrix: $A^* = QA$
10: Singular value decomposition: $A^* = U \Sigma V^T$
11: Principal components of rows (bowling features): $F = D_r^{-1/2} U \Sigma$
12: Principal components of columns (batting features): $G = D_c^{-1/2} V \Sigma$
13: return $F$ and $G$

in the constrained space of external features. Additionally, coordinates for the external features ($E$) are obtained; these are the standardized regression coefficients from the weighted least-squares regression of the external features on the two principal axes. The CCA algorithm for batting analysis is presented in Algorithm 8. To plot all three sets of features in a two-dimensional plot, a triplot [28] is used, in which bowling features and batting features are represented by points and external features are represented by arrows. For the bowling features and batting features, the biplot interpretation holds: the closer a bowling feature and a batting feature lie, the more dependent they are.

The relationship between the batting features and external features is through the bowling features they have in common.
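The steps of Algorithm 8 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function name `cca_batting`, the toy count matrix `N`, and the toy external matrix `X` are all hypothetical, and a pseudo-inverse replaces the plain inverse in step 8 so that collinear external features do not break the sketch.

```python
import numpy as np

def cca_batting(N, X):
    """Sketch of Algorithm 8. Rows of N and X index bowling features."""
    N = np.asarray(N, dtype=float)
    X = np.asarray(X, dtype=float)
    n = N.sum()                              # step 1: matrix sum
    P = N / n                                # step 6: correspondence matrix
    r = P.sum(axis=1)                        # step 2: row masses
    c = P.sum(axis=0)                        # step 4: column masses
    # Step 7: A = D_r^{-1/2} (P - r c^T) D_c^{-1/2}, computed elementwise.
    A = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    # Step 8: Q = D_r^{1/2} X (X^T D_r X)^{-1} X^T D_r^{1/2}.
    # Assumption: pinv is used instead of inv for numerical robustness.
    W = np.sqrt(r)[:, None] * X              # W = D_r^{1/2} X
    Q = W @ np.linalg.pinv(W.T @ W) @ W.T
    A_star = Q @ A                           # step 9: constrained matrix
    U, s, Vt = np.linalg.svd(A_star, full_matrices=False)   # step 10
    F = (U * s) / np.sqrt(r)[:, None]        # step 11: row (bowling) PCs
    G = (Vt.T * s) / np.sqrt(c)[:, None]     # step 12: column (batting) PCs
    return F, G, s, A, Q

# Hypothetical toy data: 4 bowling features x 3 batting features (counts),
# and 4 bowling features x 2 external features.
N = np.array([[12, 5, 3],
              [4, 9, 7],
              [6, 6, 8],
              [10, 2, 8]], dtype=float)
X = np.array([[1, 0],
              [0, 1],
              [1, 1],
              [0, 0]], dtype=float)

F, G, s, A, Q = cca_batting(N, X)
print("constrained inertia:", (s ** 2).sum())
print("total inertia:", (A ** 2).sum())
```

Because $Q$ is an orthogonal projector, the constrained inertia printed first can never exceed the total inertia printed second; the first two columns of `F` and `G` are the coordinates that would be plotted as points in the triplot.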

6.2.2 Bowling Analysis through CCA

Refer to Figure 6.2 for the steps involved in bowling analysis through CCA. For bowling analysis, the relationships between bowling features and external features are obtained through the batting features. To achieve this, the TCM of batting features × bowling features (denoted as $N$) and the ECM of batting features × external features (denoted as $X$) are constructed. CCA first obtains the residual matrix $A$ from $N$. The projection matrix $Q$ is obtained from $X$. Then $A$ is projected onto $Q$ to get the projected matrix $A^* = QA$, which is the part of $A$ that can be explained by the external features. SVD is applied to $A^*$ to obtain the batting principal components ($F$) and the bowling principal components ($G$) in the constrained space of external features. Additionally, coordinates for the external features ($E$) are obtained; these are the standardized regression coefficients from the weighted least-squares regression of the external features on the two principal axes. The CCA algorithm for bowling analysis is presented in Algorithm 9. To plot all three sets of features in a two-dimensional plot, a triplot [28] is used, in which bowling features and batting features are represented by points and external features are represented by arrows. For the

Figure 6.2: Bowling Analysis through CCA. [Flowchart: the input ECM ($X_{m \times e}$, batting features × external features) and TCM ($N_{m \times n}$, batting features × bowling features) yield the projection matrix $Q_{m \times m}$ and the residual matrix $A_{m \times n}$; the projected matrix $A^* = QA$ is decomposed by SVD into $U$, $\Sigma$, and $V^T$; this gives the row PCs $F = U\Sigma$ ($m \times r$, batting features), the column PCs $G = V\Sigma$ ($n \times r$, bowling features), and the regression coefficients $E$ of the CCA dimensions on the external features; the first two dimensions of each ($F_{m \times 2}$, $G_{n \times 2}$, $E_{e \times 2}$) are retained.]

Algorithm 9 CCA Algorithm (Bowling Analysis)

Require: $TCM_{bat \times bowl}$ ($N_{I \times J}$) and $ECM_{bat \times ext}$ ($X_{I \times M}$)
1: Matrix sum: $n = \sum_{i=1}^{I} \sum_{j=1}^{J} N_{ij}$
2: Row masses ($r$): $r_i = N_{i.} / n$, $i = 1, 2, \dots, I$
3: Diagonal matrix: $D_r = \mathrm{diag}(r_1, r_2, \dots, r_I)$
4: Column masses ($c$): $c_j = N_{.j} / n$, $j = 1, 2, \dots, J$
5: Diagonal matrix: $D_c = \mathrm{diag}(c_1, c_2, \dots, c_J)$
6: Correspondence matrix: $P = \frac{1}{n} N$
7: Standardized residuals: $A = D_r^{-1/2} (P - r c^T) D_c^{-1/2}$
8: $I \times I$ projection matrix: $Q = D_r^{1/2} X (X^T D_r X)^{-1} X^T D_r^{1/2}$
9: Project $A$ onto $Q$ to obtain the constrained correspondence matrix: $A^* = QA$
10: Singular value decomposition: $A^* = U \Sigma V^T$
11: Principal components of rows (batting features): $F = D_r^{-1/2} U \Sigma$
12: Principal components of columns (bowling features): $G = D_c^{-1/2} V \Sigma$
13: return $F$ and $G$

bowling features and batting features, the biplot interpretation holds: the closer a bowling feature and a batting feature lie, the more dependent they are. The relationship between the bowling features and external features is through the batting features they have in common.
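The bowling analysis, including the external-feature coordinates $E$ used for the triplot arrows, can be sketched as follows. This is one hedged reading of the description above, not a definitive implementation: the function name `cca_bowling_triplot` and the toy matrices are hypothetical, the external features are standardized to weighted mean 0 and weighted variance 1 using the row masses (an assumption that Algorithm 9 does not state), and the regression is run against the row axes in standard coordinates $D_r^{-1/2} U$ (also an assumption).

```python
import numpy as np

def cca_bowling_triplot(N, X, k=2):
    """Sketch of Algorithm 9 plus E. Rows of N and X index batting features."""
    N = np.asarray(N, dtype=float)
    X = np.asarray(X, dtype=float)
    P = N / N.sum()                          # correspondence matrix
    r, c = P.sum(axis=1), P.sum(axis=0)      # row and column masses
    A = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))   # std. residuals
    # Assumption: standardize each external feature with the row masses.
    Z = X - r @ X                            # weighted mean 0
    Z = Z / np.sqrt(r @ Z ** 2)              # weighted variance 1
    W = np.sqrt(r)[:, None] * Z              # D_r^{1/2} Z
    Q = W @ np.linalg.pinv(W.T @ W) @ W.T    # projector (pinv is an assumption)
    U, s, Vt = np.linalg.svd(Q @ A, full_matrices=False)
    U, s, V = U[:, :k], s[:k], Vt[:k].T      # keep the first k dimensions
    F = (U * s) / np.sqrt(r)[:, None]        # batting (row) PCs
    G = (V * s) / np.sqrt(c)[:, None]        # bowling (column) PCs
    # Weighted least squares of each standardized external feature on the
    # k axes, with the row masses as weights.
    Phi = U / np.sqrt(r)[:, None]            # row axes, standard coordinates
    lhs = Phi.T @ (r[:, None] * Phi)         # Phi^T D_r Phi
    rhs = Phi.T @ (r[:, None] * Z)           # Phi^T D_r Z
    E = np.linalg.solve(lhs, rhs).T          # one row per external feature
    return F, G, E, r

# Hypothetical toy data: 5 batting features x 4 bowling features (counts),
# and 5 batting features x 2 external features.
N = np.array([[8, 3, 5, 2],
              [4, 7, 2, 6],
              [5, 5, 9, 3],
              [9, 2, 4, 7],
              [3, 6, 6, 5]], dtype=float)
X = np.array([[1, 0.2],
              [0, 0.5],
              [1, 0.9],
              [0, 0.1],
              [1, 0.4]], dtype=float)

F, G, E, r = cca_bowling_triplot(N, X)
print("batting points:", F.shape, "bowling points:", G.shape)
print("external-feature arrows:")
print(E)
```

In a triplot, the rows of `F` and `G` would be drawn as points and the rows of `E` as arrows from the origin. Under the standardization assumed here, each arrow has length at most 1.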