APPENDIX - PROGRESS REPORT, NEW DOCTORAL RESEARCH (PENELITIAN DOKTOR BARU), ITS FUNDING 2020

The appendix contains the table of outputs (formatted as in Appendix 1) and the supporting evidence for the mandatory and additional outputs (if any), in line with the promised achievement targets.

APPENDIX 1 Table of Outputs

Program: New Doctoral Research (Penelitian Doktor Baru)

Name of Team Leader: Dr. Muhammad Ahsan, S.Si.

Title: Performance of the Dynamic Hotelling's T² Multivariate Control Chart in Monitoring Mixed Quality Characteristics with Principal Component Analysis (PCA)

1. Journal Articles

No. | Article Title | Journal Name | Progress Status*)

1. On the Performance of T²-Based

*) Progress status: in preparation, submitted, under review, accepted, published

2. Conference Articles

No. | Article Title | Conference Name (Organizer, Venue, Date) | Progress Status*)

1. Monitoring the Variability of Cement Compressive Strength

*) Progress status: in preparation, submitted, under review, accepted, presented

On the Performance of T²-Based PCA Mix Control Chart with KDE Control Limit for Monitoring Variable and Attribute Characteristics

Muhammad Ahsan¹*, Muhammad Mashuri¹, Hidayatul Khusna¹, Wibawati¹, and Dedy Dwi Prastyo¹

¹Department of Statistics, Institut Teknologi Sepuluh Nopember

*corresponding author: muh.ahsan@its.ac.id

Abstract

This paper presents a detailed performance evaluation of the mixed multivariate T² control chart based on PCA Mix. Kernel density estimation is used to estimate the control limit of the proposed chart. Simulation studies evaluate the ability of the proposed chart to detect outliers and to detect a shift in the process. In detecting mixed outliers, the proposed chart performs stably up to around 30 percent outliers. For a balanced proportion of attribute characteristics, misdetection is caused by a high false alarm rate. For the imbalanced and extreme imbalanced proportions, the masking effect is the main problem. In detecting a shift in the process, the proposed chart performs better for a shift in both variable and attribute characteristics, whereas for a small shift it performs better for a shift in the variable characteristics only. The chart also performs better when the correlation among the variable characteristics is small. When applied to real cases, the proposed chart surpasses the conventional charts, with high accuracy and a low false alarm rate.

Keywords: PCA Mix, Hotelling’s T², Kernel Density Estimation, Outlier, Mixed Quality Characteristics.

1. Introduction

Statistical Process Control (SPC) plays an important role in monitoring, maintaining, and improving the quality of a product. The control chart, which is part of SPC, is a tool often used to monitor not only the quality of products but also the quality of services provided by a company (Montgomery, 2009). Based on the number of monitored quality characteristics, control charts are divided into two types: univariate and multivariate. Univariate control charts monitor only one quality characteristic, while multivariate control charts monitor more than one quality characteristic.

In the current Industry 4.0 era, a process is expected to be monitored on more than one type of quality characteristic. For example, in monitoring the variable characteristics (in a

employed (Sorooshian, 2013). In the manufacturing process, monitoring mixed quality characteristics is important (Pu, Li, & Xiang, 2011). In the past, however, mixed quality characteristics were commonly monitored separately for each type. This is inefficient because two statistics and two control limits must be calculated. Moreover, the administrator will have difficulty interpreting the monitoring result if the two procedures yield different results. Therefore, a new concept for monitoring mixed characteristics is urgently needed.

To overcome this issue, Ahsan et al. (2018) proposed a new monitoring procedure based on the PCA Mix algorithm. This work was also extended to detecting outliers for various levels of outlier contamination (Ahsan, Mashuri, Kuswanto, Prastyo, & Khusna, 2019b). In this method, the T² statistic is used to form the control chart. Because its distribution is unknown, the control limit of the PCA Mix chart is estimated using kernel density, a non-parametric method for estimating the empirical density of an unknown distribution (Phaladiganon, Kim, Chen, & Jiang, 2013). However, in that work, the performance of the PCA Mix chart in detecting outliers is evaluated for only one categorical (attribute) characteristic. In addition, its performance in detecting a shift in the process is evaluated only for a shift in both variable and attribute characteristics. As a result, there is no recommendation on the type of shift for which this chart performs best.

For these reasons, this work evaluates in detail the performance of the PCA Mix chart in detecting outliers and shifts in the process. Similar to the PCA Mix chart proposed by Ahsan et al. (2018), the proposed chart also employs Kernel Density Estimation (KDE) in calculating the control limit. In detecting outliers, the proposed chart is evaluated for more than one attribute characteristic. In monitoring a shift in the process, the proposed chart is assessed for different types of shift and correlation. The application of the proposed chart to real data and a performance comparison are also presented in this work.

The rest of this work is organized as follows: related works are reported in Section 2. Section 3 presents the charting procedures of the proposed method. The performance

2. Related Works

The development of multivariate variable control charts focuses on three types: Hotelling’s T², Multivariate EWMA, and Multivariate CUSUM control charts. The recent development of these three types of multivariate charts is tabulated in Table 1.

Table 1 The recent development in Multivariate Variable Control Chart

References | Method | Highlight
Haddad et al. (2019) | Bivariate Hotelling’s T² charts using bootstrap data | Excellent ability to detect anomalies in a network compared to the conventional T² chart
Mehmood, Lee, Riaz, Zaman,

MEWMA-CoDa control chart Proposed control chart procedure can handle measurement errors to detect shifts in the process

Haq and Khoo (2019) Adaptive MEWMA chart Proposed chart surpasses the performances of the existing

Flury and Quaglino (2018) MEWMA chart for

Haq, Munir, and Shah (2020) Dual MCUSUM charts with auxiliary information for detecting different sizes of shift in the process mean vector

Khusna et al. (2019) | Residual-based Max MCUSUM for autocorrelated processes | Proposed approach yields a more sensitive detection in mean than variance
Haq (2018) | Weighted adaptive MCUSUM charts | Proposed charts perform better than the conventional MCUSUM in detecting shift in mean

Meanwhile, recent works on attribute charts are reported in Table 2. The table shows that recent development focuses on fuzzy, Poisson, and multinomial attribute charts.

Table 2 The recent development in Attribute Control Chart

References | Method | Highlight
Mashuri, Wibawati, Purhadi, and Irhamah (2020) | Fuzzy bivariate chart | Proposed chart is more sensitive than the conventional bivariate Poisson chart

Zhou, Liu, and Zheng (2020) Synthetic control chart for attribute inspection Aldosari, Aslam, Srinivasa Attribute control chart for Proposed method has a better

Chong, Khoo, Haridy, and Shamsuzzaman (2019) | Multi-attribute CUSUM-np chart | Proposed procedure has a better or equal performance compared with the conventional chart
Aslam (2019) | Attribute control chart using repetitive sampling under a neutrosophic system | Proposed chart with repetitive sampling under a neutrosophic system is more sensitive in detecting a shift in the process than the existing chart
Wibawati, et al. (2018) | Fuzzy multinomial control chart |
Aslam, Nazir, and Jun (2015) | Attribute control chart using multiple dependent state sampling | Proposed approach has a better performance than the conventional np chart

Table 3 The recent development in Mixed control chart

References Method Highlight percentage of outlier added compared to the conventional

Multivariate sign chart Simulations show the superiority of proposed control chart in monitoring mixed type data

Furthermore, the recent development of the mixed control chart is presented in Table 3. In this area, only a few works have studied the joint monitoring of variable and attribute characteristics, so further development is needed. This research develops and assesses the performance of the mixed-type chart, especially the PCA Mix control chart, as a contribution to process monitoring procedures.

3. Charting Procedures of PCA Mix Chart

This section discusses the procedure for constructing the multivariate control chart based on PCA Mix, illustrated in Figure 1. The procedure has three main steps. The first step calculates the PCs from the mixed characteristics using PCA Mix. The second step calculates the T² statistics from a subset of the principal components. The third step estimates the control limit of the proposed chart using KDE. The detailed PCA Mix control chart procedure can be found in Ahsan et al. (2018).

PCA Mix Control Chart’s procedures

Step 1 Input the variable data X1 and the attribute data X2.

Step 2 Calculate the mixed principal component scores (PCs), denoted as Y^mix, using the PCA Mix method from X1 and X2.

Step 3 Calculate the T_i² statistic from the retained PCs, using the eigenvalue with respect to the v-th PC.

Step 4 Calculate the empirical density of the T_i² statistics using KDE, with the optimum bandwidth calculated using the Botev, Grotowski, & Kroese (2010) algorithm.

Step 5 Calculate the distribution function of the T_i² statistics,

Figure 1 PCA Mix control chart procedures
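The three steps of the procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: `mixed_pc_scores` is a hypothetical FAMD-style stand-in for PCA Mix (standardized numeric columns plus centred indicator columns weighted by the inverse square root of the category proportion), the T² statistic is taken in its usual PCA form (squared scores divided by eigenvalues, summed over the retained components), and `scipy.stats.gaussian_kde` with its default Scott bandwidth replaces the Botev, Grotowski, & Kroese (2010) bandwidth selector. The control limit is taken as the (1 − α) quantile of the estimated distribution function.

```python
import numpy as np
from scipy.stats import gaussian_kde


def mixed_pc_scores(x_num, x_cat, n_components):
    """FAMD-style scores for mixed data (simplified stand-in for PCA Mix).

    x_num: (n, p) float array; x_cat: (n, q) integer-coded categorical array.
    Returns (scores, eigenvalues) for the first n_components components.
    """
    n = x_num.shape[0]
    # Standardize the variable (numeric) characteristics.
    blocks = [(x_num - x_num.mean(axis=0)) / x_num.std(axis=0)]
    for j in range(x_cat.shape[1]):
        levels = np.unique(x_cat[:, j])
        ind = (x_cat[:, [j]] == levels).astype(float)  # one-hot indicators
        prop = ind.mean(axis=0)
        # Centre each indicator and weight it by 1 / sqrt(category proportion).
        blocks.append((ind - prop) / np.sqrt(prop))
    z = np.hstack(blocks) / np.sqrt(n)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    eigenvalues = s[:n_components] ** 2
    scores = np.sqrt(n) * u[:, :n_components] * s[:n_components]
    return scores, eigenvalues


def t2_statistic(scores, eigenvalues):
    """T2_i = sum over retained PCs of score_iv**2 / eigenvalue_v."""
    return np.sum(scores ** 2 / eigenvalues, axis=1)


def kde_control_limit(t2, alpha=0.00273, grid_size=2048):
    """Upper control limit: (1 - alpha) quantile of a Gaussian KDE of t2."""
    kde = gaussian_kde(t2)  # Scott's rule bandwidth (assumption, not Botev et al.)
    grid = np.linspace(0.0, t2.max() + 5.0 * t2.std(), grid_size)
    cdf = np.array([kde.integrate_box_1d(-np.inf, g) for g in grid])
    idx = min(np.searchsorted(cdf, 1.0 - alpha), grid_size - 1)
    return grid[idx]


# Illustrative in-control data: 5 variable and 2 attribute characteristics.
rng = np.random.default_rng(0)
x_num = rng.standard_normal((500, 5))
x_cat = rng.choice(3, size=(500, 2), p=[0.3, 0.3, 0.4])
scores, eigenvalues = mixed_pc_scores(x_num, x_cat, n_components=2)
t2 = t2_statistic(scores, eigenvalues)
ucl = kde_control_limit(t2)
out_of_control = t2 > ucl  # observations signalled by the chart
```

A distribution-free limit of this kind avoids assuming an F or chi-square law for T², which is the reason the text gives for using KDE.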

4. Performance for Detecting Outlier

This section shows the performance of the proposed chart in detecting outliers mixed with in-control data. Its performance is assessed through simulation studies under several scenarios. In the simulations, the variable characteristics are assumed to follow the multivariate normal distribution X₁ ~ N_p(0, I), while the attribute characteristics are generated from a multinomial distribution with three categories and proportions θ₁, θ₂, θ₃. Similar to Ahsan et al. (2019), the attribute characteristics are divided into three types: almost balanced proportion (θ₁, θ₂ = 0.3 and θ₃ = 0.4), imbalanced proportion (θ₁, θ₂ = 0.1 and θ₃ = 0.8), and extreme imbalanced proportion (θ₁, θ₂ = 0.05 and θ₃ = 0.9).
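The in-control data for these scenarios can be generated along the following lines. This is a minimal sketch: the function name, the scenario labels, and the assumption that each attribute characteristic is an independent three-category variable are illustrative choices, not details given in the text.

```python
import numpy as np

# Category proportions for the three scenarios described in the text.
PROPORTIONS = {
    "balanced": [0.3, 0.3, 0.4],
    "imbalanced": [0.1, 0.1, 0.8],
    "extreme": [0.05, 0.05, 0.9],
}


def simulate_in_control(n, p, q, scenario, rng):
    """Draw n in-control observations.

    Variable part: N_p(0, I). Attribute part: q independent categorical
    characteristics with three categories each (independence is an assumption).
    """
    x_num = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)
    x_cat = rng.choice(3, size=(n, q), p=PROPORTIONS[scenario])
    return x_num, x_cat


rng = np.random.default_rng(0)
x_num, x_cat = simulate_in_control(n=2000, p=5, q=2, scenario="imbalanced", rng=rng)
share_third_category = (x_cat == 2).mean()  # should be close to 0.8
```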

For the detailed performance, the number of attribute characteristics is evaluated for 2, 3, and 5. On the other hand, the number of variable characteristics is 5. The outliers mixed with the clean data are set to 5, 10, 20, 30, 40, and 50 percent out of the total observations. There are three

misdetection with criteria of the lower the better. The detailed algorithm for these simulation studies can be found in Ahsan et al. (2019).
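The three measures reported in the tables that follow (Hit Rate, FN Rate, FP Rate) can be computed from the true outlier labels and the chart's signals. The definitions below are assumptions: Hit Rate as overall accuracy, FN rate as the share of actual outliers that are missed, and FP rate as the share of in-control observations that are flagged. The function name is hypothetical.

```python
import numpy as np


def detection_rates(is_outlier, flagged):
    """Return (hit_rate, fn_rate, fp_rate) under the assumed definitions."""
    is_outlier = np.asarray(is_outlier, dtype=bool)
    flagged = np.asarray(flagged, dtype=bool)
    hit_rate = np.mean(is_outlier == flagged)  # correct decisions / all observations
    fn_rate = np.mean(~flagged[is_outlier])    # missed outliers / actual outliers
    fp_rate = np.mean(flagged[~is_outlier])    # false alarms / in-control observations
    return hit_rate, fn_rate, fp_rate


# Ten observations: the first two are true outliers.
truth = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
signal = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
hit_rate, fn_rate, fp_rate = detection_rates(truth, signal)
```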

4.1 Two attribute characteristics

Table 4 shows the performance of the proposed chart in detecting outliers for two attribute characteristics with θ₁, θ₂ = 0.3 and θ₃ = 0.4. In general, the proposed chart maintains stable performance when no more than 30 percent outliers are added to the clean data. In this case, misdetection occurs because a large number of in-control observations are declared outliers (high FP rate). The performance for two attribute characteristics with imbalanced proportion is reported in Table 5. Unlike the previous case (two attribute characteristics with balanced proportion), the misdetections are caused by the chart's inability to capture the actual outliers, as shown by the high FN rate. Furthermore, Table 6 presents the performance for the extreme imbalanced proportion (θ₁, θ₂ = 0.05 and θ₃ = 0.9). Under this condition, the high FN rate lowers the accuracy of the proposed chart. In general, for this case, using l = 2 components produces better results.
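The statement that misdetection under the balanced proportion is driven by false positives can be checked with a little arithmetic on the Table 4 entries for ε = 30% and l = 2 (FN rate 0.0216, FP rate 0.2917, Hit Rate 0.78931). The sketch assumes that the Hit Rate is overall accuracy, that the FN rate is taken over the actual outliers, and that the FP rate is taken over the in-control observations. The text does not state these definitions, but they reproduce the tabulated Hit Rate to about four decimals.

```python
# Table 4 entries for epsilon = 30 %, p = 5, l = 2.
eps, fn_rate, fp_rate, reported_hit = 0.30, 0.0216, 0.2917, 0.78931

fp_share = (1.0 - eps) * fp_rate  # misclassified in-control data, as a share of all data
fn_share = eps * fn_rate          # missed outliers, as a share of all data
hit_rate = 1.0 - fp_share - fn_share

print(f"false-positive share: {fp_share:.5f}")
print(f"false-negative share: {fn_share:.5f}")
print(f"implied hit rate:     {hit_rate:.5f} (reported {reported_hit})")
```

Under these assumptions nearly all of the lost accuracy comes from the false-positive term, which matches the explanation given in the text.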

Table 4 Performance of the proposed chart in detecting outlier for two attribute characteristics with θ₁, θ₂ = 0.3 and θ₃ = 0.4

Setting | Hit Rate (ε=5%) | FN Rate (ε=5%) | FP Rate (ε=5%) | Hit Rate (ε=10%) | FN Rate (ε=10%) | FP Rate (ε=10%)
p=5, l=2 | 0.96966 | 0.0000 | 0.0319 | 0.95122 | 0.0002 | 0.0542
p=5, l=3 | 0.97353 | 0.0000 | 0.0279 | 0.95445 | 0.0005 | 0.0506
p=5, l=4 | 0.96633 | 0.0000 | 0.0355 | 0.94993 | 0.0001 | 0.0556

Setting | Hit Rate (ε=20%) | FN Rate (ε=20%) | FP Rate (ε=20%) | Hit Rate (ε=30%) | FN Rate (ε=30%) | FP Rate (ε=30%)
p=5, l=2 | 0.86096 | 0.0020 | 0.1733 | 0.78931 | 0.0216 | 0.2917
p=5, l=3 | 0.90388 | 0.0033 | 0.1193 | 0.78449 | 0.0204 | 0.2991
p=5, l=4 | 0.88373 | 0.0034 | 0.1445 | 0.79002 | 0.0230 | 0.2901

ε=40% ε=50%

Table 5 Performance of the proposed chart in detecting outlier for two attribute characteristics

Table 6 Performance of the proposed chart in detecting outlier for two attribute characteristics with θ₁, θ₂ = 0.05 and θ₃ = 0.9.

4.2 Three attribute characteristics

The performance of the proposed chart in detecting outliers for three attribute characteristics with θ₁, θ₂ = 0.3 and θ₃ = 0.4 (balanced) is presented in Table 7. Similar to the case of two attribute characteristics, misdetection here is caused by the high false alarm rate, as shown by the high FP rate. Tables 8 and 9 show the performance for three attribute characteristics with imbalanced and extreme imbalanced proportion, respectively. In these two cases, misdetection occurs because the actual outliers fail to be detected, as shown by the high FN rate. It can also be seen that using fewer principal components produces better results. Performance degrades when more than 30 percent of the monitored data are outliers. In addition, the more imbalanced the proportion of the attribute characteristics, the higher the accuracy.

Table 7 Performance of the proposed chart in detecting outlier for three attribute characteristics with θ₁, θ₂ = 0.3 and θ₃ = 0.4

Setting | Hit Rate (ε=5%) | FN Rate (ε=5%) | FP Rate (ε=5%) | Hit Rate (ε=10%) | FN Rate (ε=10%) | FP Rate (ε=10%)
p=5, l=2 | 0.96961 | 0.0004 | 0.0320 | 0.95411 | 0.0001 | 0.0510
p=5, l=3 | 0.94098 | 0.0000 | 0.0621 | 0.92895 | 0.0003 | 0.0789
p=5, l=4 | 0.94272 | 0.0000 | 0.0603 | 0.91451 | 0.0000 | 0.0950

Setting | Hit Rate (ε=20%) | FN Rate (ε=20%) | FP Rate (ε=20%) | Hit Rate (ε=30%) | FN Rate (ε=30%) | FP Rate (ε=30%)
p=5, l=2 | 0.90321 | 0.0030 | 0.1202 | 0.77372 | 0.0193 | 0.3150
p=5, l=3 | 0.81657 | 0.0005 | 0.2292 | 0.71108 | 0.0106 | 0.4082
p=5, l=4 | 0.81361 | 0.0010 | 0.2327 | 0.70425 | 0.0111 | 0.4177

Setting | Hit Rate (ε=40%) | FN Rate (ε=40%) | FP Rate (ε=40%) | Hit Rate (ε=50%) | FN Rate (ε=50%) | FP Rate (ε=50%)
p=5, l=2 | 0.62677 | 0.0711 | 0.5746 | 0.49915 | 0.2422 | 0.7595
p=5, l=3 | 0.60148 | 0.0522 | 0.6294 | 0.50158 | 0.1654 | 0.8314
p=5, l=4 | 0.60654 | 0.0587 | 0.6167 | 0.49921 | 0.2002 | 0.8014

Table 8 Performance of the proposed chart in detecting outlier for three attribute characteristics

Table 9 Performance of the proposed chart in detecting outlier for three attribute characteristics with θ₁, θ₂ = 0.05 and θ₃ = 0.9.

4.3 Five attribute characteristics

Table 10 shows the outlier monitoring results for five attribute characteristics with θ₁, θ₂ = 0.3 and θ₃ = 0.4. According to the simulation results, misdetection in this case occurs because a large number of in-control observations are declared outliers (see the FP rate). The performances of the proposed chart for θ₁, θ₂ = 0.1 and θ₃ = 0.8 as well as θ₁, θ₂ = 0.05 and θ₃ = 0.9 are reported in Tables 11 and 12, respectively. Similar to the two previous cases, the failure to detect the actual outliers reduces the accuracy of the proposed chart.

In general, using fewer principal components leads to higher accuracy. The chart remains at its peak performance when less than 40 percent of the data are outliers. Moreover, the more imbalanced the proportion of the attribute characteristics monitored by the proposed chart, the higher the Hit rate (accuracy).

Table 10 Performance of the proposed chart in detecting outlier for five attribute characteristics with θ₁, θ₂ = 0.3 and θ₃ = 0.4

Setting | Hit Rate (ε=5%) | FN Rate (ε=5%) | FP Rate (ε=5%) | Hit Rate (ε=10%) | FN Rate (ε=10%) | FP Rate (ε=10%)
p=5, l=2 | 0.99097 | 0.0010 | 0.0095 | 0.98861 | 0.0035 | 0.0123
p=5, l=3 | 0.98939 | 0.0024 | 0.0110 | 0.98264 | 0.0040 | 0.0188
p=5, l=4 | 0.98968 | 0.0016 | 0.0108 | 0.98590 | 0.0051 | 0.0151

Setting | Hit Rate (ε=20%) | FN Rate (ε=20%) | FP Rate (ε=20%) | Hit Rate (ε=30%) | FN Rate (ε=30%) | FP Rate (ε=30%)
p=5, l=2 | 0.96411 | 0.0226 | 0.0394 | 0.89619 | 0.0991 | 0.1058
p=5, l=3 | 0.95652 | 0.0252 | 0.0480 | 0.87571 | 0.0956 | 0.1366
p=5, l=4 | 0.95204 | 0.0249 | 0.0537 | 0.87924 | 0.1079 | 0.1263

Setting | Hit Rate (ε=40%) | FN Rate (ε=40%) | FP Rate (ε=40%) | Hit Rate (ε=50%) | FN Rate (ε=50%) | FP Rate (ε=50%)
p=5, l=2 | 0.73821 | 0.2665 | 0.2587 | 0.50111 | 0.5183 | 0.4794
p=5, l=3 | 0.72134 | 0.2432 | 0.3023 | 0.49897 | 0.5168 | 0.4852
p=5, l=4 | 0.72961 | 0.2916 | 0.2562 | 0.49995 | 0.5059 | 0.4942

Table 11 Performance of the proposed chart in detecting outlier for five attribute characteristics

Table 12 Performance of the proposed chart in detecting outlier for five attribute characteristics with θ₁, θ₂ = 0.05 and θ₃ = 0.9.

5. Performance Evaluation in Monitoring Process Shift

In this section, the performance of the proposed chart is evaluated in monitoring a shift in the process. As in the previous section, the variable characteristics are generated from the multivariate normal distribution and the attribute characteristics from the multinomial distribution with the three types of proportion stated before. The performance of the proposed chart is evaluated for different types of shift: a shift in variable characteristics, a shift in attribute characteristics, and a shift in both variable and attribute characteristics. The performance is also evaluated for different levels of correlation. Using the same approach as Ahsan et al. (2018), the ARL1 is estimated by shifting the variable characteristics by μ_shift = μ + δ_μ, where δ_μ is increased in steps of 0.1, and by shifting the attribute characteristics by θ_shift = [θ₁ − δ_θ; θ₂ − δ_θ; θ₃ + 2δ_θ], where δ_θ is increased in steps of 0.0025.
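The shift scheme can be sketched as a small helper that returns the out-of-control parameters for a given shift step. The sketch assumes that the mean shift adds the same amount to every component of the mean vector and that the proportion shift moves an equal amount from each of the first two categories to the third, so the proportions still sum to one. The function name and the step index `k` are illustrative.

```python
import numpy as np


def shifted_parameters(mu, theta, k, step_mu=0.1, step_theta=0.0025):
    """Out-of-control mean vector and proportions after k shift steps.

    mu: in-control mean vector; theta: in-control proportions (three categories).
    """
    delta_mu = k * step_mu
    delta_theta = k * step_theta
    mu_shift = np.asarray(mu, dtype=float) + delta_mu  # same shift in every component (assumption)
    theta_shift = np.asarray(theta, dtype=float) + delta_theta * np.array([-1.0, -1.0, 2.0])
    return mu_shift, theta_shift


# Fourth shift step for the balanced scenario: delta_mu = 0.4, delta_theta = 0.01.
mu_shift, theta_shift = shifted_parameters(np.zeros(5), [0.3, 0.3, 0.4], k=4)
```

Setting `step_theta=0` gives a shift in the variable characteristics only, and `step_mu=0` gives a shift in the attribute characteristics only.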

Table 13 ARLs for θ₁, θ₂ = 0.3 and θ₃ = 0.4 with shift in the variable characteristics

δ_μ | δ_θ | p=5, l=2 | p=5, l=3 | p=5, l=4
0.0 | 0.0000 | 354.586 | 386.497 | 377.743
0.1 | 0.0000 | 123.905 | 166.972 | 173.247
0.2 | 0.0000 | 74.363 | 93.708 | 97.509
0.3 | 0.0000 | 53.117 | 62.026 | 64.519
0.4 | 0.0000 | 41.313 | 44.420 | 46.182
0.5 | 0.0000 | 33.801 | 33.216 | 34.512
0.6 | 0.0000 | 28.601 | 25.460 | 26.433
0.7 | 0.0000 | 24.788 | 19.772 | 20.509
0.8 | 0.0000 | 21.872 | 15.422 | 15.979
0.9 | 0.0000 | 19.569 | 11.988 | 12.402
1.0 | 0.0000 | 17.706 | 9.209 | 9.506
1.1 | 0.0000 | 16.166 | 6.912 | 7.115
1.2 | 0.0000 | 14.873 | 4.983 | 5.105
1.3 | 0.0000 | 13.771 | 3.340 | 3.394
1.4 | 0.0000 | 12.821 | 1.923 | 1.918
1.5 | 0.0000 | 12.394 | 1.286 | 1.255

Table 14 ARLs for θ₁, θ₂ = 0.1 and θ₃ = 0.8 with shift in the variable characteristics

5.1 Shift in variable characteristics

Tables 13-15 show the performance of the proposed chart with a shift in variable characteristics for the balanced, imbalanced, and extreme imbalanced proportion of attribute data, respectively. In general, using the KDE control limit, the proposed chart produces an ARL0 of around 370 for the false alarm rate α = 0.00273. For a shift in the variable characteristics only, the proposed chart captures the change in the process, producing a lower ARL1 as the shift becomes larger.
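The link between the false alarm rate and the in-control ARL can be made explicit. If successive T² values are independent and each signals with the same probability, the run length is geometric and the ARL is the reciprocal of the signal probability, so α = 0.00273 gives 1/0.00273 ≈ 366.3, consistent with the value of around 370 quoted above. The sketch below checks this by simulation; the function names are illustrative and the independence assumption is the only modelling choice.

```python
import numpy as np


def theoretical_arl(signal_prob):
    """ARL of a chart whose points signal independently with probability signal_prob."""
    return 1.0 / signal_prob


def simulated_arl(signal_prob, n_runs, rng):
    """Monte Carlo ARL: average number of points up to and including the first signal."""
    run_lengths = rng.geometric(signal_prob, size=n_runs)
    return run_lengths.mean()


alpha = 0.00273
arl0 = theoretical_arl(alpha)
arl0_mc = simulated_arl(alpha, 200_000, np.random.default_rng(1))
print(f"theoretical ARL0 = {arl0:.1f}, simulated ARL0 = {arl0_mc:.1f}")
```

The same functions give ARL1 when `signal_prob` is the probability that an out-of-control point exceeds the control limit.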

For this case, better performance is achieved when it is used to monitor the balanced parameter of the attribute characteristics. balanced, imbalanced, and extreme imbalanced proportion parameter are sequentially presented in Table 16-18. For this case, using the KDE control limit, it can be found that the performance of

component l used, the proposed chart demonstrated a better performance for the extreme

5.3 Shift in variable and attribute characteristics

This subsection presents the performance of the proposed chart in detecting a shift in both variable and attribute characteristics. Table 19 reports the performance for the balanced proportion of attribute characteristics. The performances for the imbalanced and extreme imbalanced proportions are presented in Tables 20 and 21, respectively.

The results show performance similar to that obtained when the chart monitors a shift in the variable characteristics only. The main difference depends on the size of the shift. For a small shift, the proposed chart performs better in monitoring a shift in the variable characteristics only. For a large shift, on the other hand, a shift in both variable and attribute characteristics yields better performance.

Table 19 ARLs for θ₁, θ₂ = 0.3 and θ₃ = 0.4 with shift in both variable and attribute characteristics

δ_μ | δ_θ | p=5, l=2 | p=5, l=3 | p=5, l=4
0.0 | 0.0000 | 383.134 | 358.421 | 355.567
0.1 | 0.0025 | 333.432 | 327.743 | 340.012
0.2 | 0.0050 | 197.665 | 254.887 | 268.843
0.3 | 0.0075 | 19.332 | 265.425 | 252.876
0.4 | 0.0100 | 7.265 | 311.954 | 270.834
0.5 | 0.0125 | 5.123 | 176.021 | 186.598
0.6 | 0.0150 | 3.812 | 102.912 | 130.143
0.7 | 0.0175 | 3.032 | 56.765 | 85.722
0.8 | 0.0200 | 2.423 | 34.132 | 51.918
0.9 | 0.0225 | 1.976 | 19.932 | 34.764
1.0 | 0.0250 | 1.723 | 13.621 | 22.823
1.1 | 0.0275 | 1.551 | 8.754 | 14.921
1.2 | 0.0300 | 1.332 | 6.523 | 10.543
1.3 | 0.0325 | 1.281 | 4.821 | 7.222
1.4 | 0.0350 | 1.221 | 3.525 | 5.616
1.5 | 0.0375 | 1.108 | 2.732 | 4.023

Table 20 ARLs for θ₁, θ₂ = 0.1 and θ₃ = 0.8 with shift in both variable and attribute

5.4 Different Correlation Coefficients

This subsection presents the performance of the proposed chart for several correlation coefficients. The variable characteristics are generated with four levels of correlation, 0.3, 0.5, 0.7, and 0.9, and the chart uses the KDE control limit.

For this case, the process is shifted for both variable and attribute characteristics. The number of variable characteristics p is 5 with the number of principal components used l is 4. Also, the proposed chart is evaluated for three types of attribute characteristics as declared in the previous section.
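Correlated variable characteristics for this experiment could be generated as below. The text gives only the correlation levels, so the equicorrelation structure (every pair of variables sharing the same correlation) is an assumption made for illustration, as are the function name and the sample size.

```python
import numpy as np


def equicorrelation_matrix(p, rho):
    """p x p covariance matrix with unit variances and common correlation rho."""
    return (1.0 - rho) * np.eye(p) + rho * np.ones((p, p))


rng = np.random.default_rng(2)
sigma = equicorrelation_matrix(5, 0.7)
# Variable characteristics with zero mean, unit variance, and correlation 0.7.
x_num = rng.multivariate_normal(np.zeros(5), sigma, size=20_000)
empirical_corr = np.corrcoef(x_num, rowvar=False)
```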

Table 22 shows the performance of the proposed chart in monitoring the balanced proportion of attribute characteristics (θ₁, θ₂ = 0.3 and θ₃ = 0.4) with several levels of correlation.

For the in-control condition, the proposed chart always produces an ARL0 of about 370 in all scenarios. When the process is shifted, the proposed chart detects the shift by producing a smaller ARL1. In this case, better performance is obtained when the proposed chart monitors a process with a smaller correlation coefficient.

Tables 23 and 24 report the performance of the proposed chart in monitoring the imbalanced and the extreme imbalanced proportion of the attribute characteristics, respectively. According to the tables, for the in-control condition the proposed chart produces the appropriate ARL0 (around 370 for α = 0.00273). Similar to the previous result, a smaller correlation coefficient produces better performance, as shown by the ARL1 values for each scenario. In addition, the proposed chart reaches its peak performance when it monitors data with a balanced proportion of attribute characteristics.

Table 22 ARLs of the proposed chart with p=5, l=4, θ₁, θ₂ = 0.3 and θ₃ = 0.4 for various

Table 24 ARLs of the proposed chart with p=5, l=4, θ₁, θ₂ = 0.05 and θ₃ = 0.9 for various correlation

δ_μ | δ_θ | Correlation 0.3 | Correlation 0.5 | Correlation 0.7 | Correlation 0.9
0.0 | 0.0000 | 357.967 | 383.537 | 370.886 | 377.704
0.1 | 0.0025 | 243.622 | 270.000 | 275.799 | 289.455
0.2 | 0.0050 | 149.216 | 158.257 | 168.138 | 181.141
0.3 | 0.0075 | 81.986 | 93.388 | 102.324 | 104.422
0.4 | 0.0100 | 49.656 | 54.899 | 59.872 | 54.953
0.5 | 0.0125 | 30.627 | 32.935 | 39.414 | 35.261
0.6 | 0.0150 | 22.166 | 22.090 | 21.672 | 24.862
0.7 | 0.0175 | 18.668 | 19.888 | 22.112 | 20.365
0.8 | 0.0200 | 18.578 | 18.370 | 19.078 | 18.643
0.9 | 0.0225 | 18.380 | 19.296 | 19.053 | 18.469
1.0 | 0.0250 | 19.638 | 20.959 | 19.975 | 20.525
1.1 | 0.0275 | 22.159 | 21.430 | 22.065 | 22.382
1.2 | 0.0300 | 24.402 | 24.746 | 24.343 | 24.081
1.3 | 0.0325 | 28.004 | 27.539 | 29.145 | 26.703
1.4 | 0.0350 | 31.852 | 33.243 | 32.269 | 32.048
1.5 | 0.0375 | 37.362 | 37.036 | 41.362 | 40.535

6. Applications in the real cases

In this part, the proposed chart is applied to the real cases. Two datasets are used as the benchmark, namely machine failure dataset and NSL-KDD dataset. In order to see the superiority of the proposed chart, the proposed chart is also compared with the conventional Hotelling’s T²
