CHAPTER 2 LITERATURE REVIEW
2.2 DISTANCE BASED MODELS
2.2.1 Multivariate statistical process control models
To detect and predict unknown faults, the simplest approach is the Shewhart multivariate SPC chart (S. Bersimis, S. Psarakis, & J. Panaretos, 2007). Assume that the collected sensor signals are independently and identically distributed, $\mathbf{x} \sim N_m(\boldsymbol{\mu}_0, \boldsymbol{\Sigma}_0)$, where $\boldsymbol{\mu}_0$ and $\boldsymbol{\Sigma}_0$ are the hypothesized mean vector and variance-covariance matrix, respectively. Hotelling's $T^2$ statistic is then commonly monitored to control the mean of the multiple sensor signals.
• Hotelling's $T^2$ statistic: This is a Mahalanobis distance used to detect a relatively large change in one or several components of the current mean vector. It measures the distance between the current mean vector and the hypothesized mean vector, as in the following equation. If $T_j^2$ is larger than the predefined upper control limit, a fault is considered to be detected (J. Park & C. Jeon, 2012).
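$$T_j^2 = (\mathbf{x}_j - \boldsymbol{\mu}_0)^T \boldsymbol{\Sigma}_0^{-1} (\mathbf{x}_j - \boldsymbol{\mu}_0) \sim \chi_m^2 \quad \text{(for large } n\text{)}$$

$$T_j^2 \sim T_{m,n-m}^2 = \frac{m(n-1)}{n-m} F_{m,n-m} \quad \text{(for small } n\text{)}$$

$$T_{UCL}^2 = \frac{m(n-1)}{n-m} F_{\alpha,m,n-m}$$

Assumption
• $\mathbf{x}_j$ is serially independent (i.e., not dependent on the time scale)
• $\mathbf{x}_j$ follows a multivariate normal distribution $\sim N_m(\boldsymbol{\mu}_0, \boldsymbol{\Sigma}_0)$

Figure 2.2 The general structure of 'the distance from the normal state' based approach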
where $\mathbf{x}_j$ is the vector at the jth time point of the multivariate time series data matrix. If the number of observations n is large, or the data are normally distributed, Hotelling's $T^2$ is approximately chi-square distributed with m degrees of freedom. For small n, the chi-square approximation cannot sufficiently account for the variation in estimating the population variance-covariance matrix, so the statistic follows a scaled F-distribution with m and n-m degrees of freedom.
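As an illustration of the large-n case, the following minimal sketch computes $T_j^2$ for two-dimensional observations and compares it with a chi-square upper control limit. It assumes m = 2, for which the matrix inverse and the chi-square quantile ($-2\ln\alpha$) have closed forms. The function names and the example values of the mean vector, covariance matrix, and α are hypothetical and are not taken from the cited studies.

```python
import math

def hotelling_t2(x, mu0, sigma0):
    """T^2 = (x - mu0)^T Sigma0^{-1} (x - mu0) for a 2-dimensional observation."""
    (a, b), (c, d) = sigma0
    det = a * d - b * c
    if det <= 0:
        raise ValueError("sigma0 must be positive definite")
    # Closed-form inverse of a 2x2 matrix (assumption: m = 2).
    inv = [[d / det, -b / det], [-c / det, a / det]]
    diff = [x[0] - mu0[0], x[1] - mu0[1]]
    tmp = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
           inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    return diff[0] * tmp[0] + diff[1] * tmp[1]

def chi2_ucl_2d(alpha):
    """Upper-alpha quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(alpha)

# Hypothetical in-control parameters and observations.
mu0 = [0.0, 0.0]
sigma0 = [[1.0, 0.5], [0.5, 1.0]]
ucl = chi2_ucl_2d(0.01)
observations = [[0.2, -0.1], [3.0, -3.0]]
flags = [hotelling_t2(x, mu0, sigma0) > ucl for x in observations]
```

For general m, the same quadratic form would need a linear solver, and the small-n limit would need an F quantile in place of the chi-square quantile.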
• $|\mathbf{S}|^{1/2}$: This statistic is designed to control process variability by detecting a change in one or several variances or correlations, where $\mathbf{S}$ is the current variance-covariance matrix. The control limits are determined as follows:
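$$|\mathbf{S}|_{UCL}^{1/2} = \left(b_3 + L\sqrt{b_1 - b_3^2}\right) \times |\boldsymbol{\Sigma}_0|^{1/2}$$

$$|\mathbf{S}|_{LCL}^{1/2} = \left(b_3 - L\sqrt{b_1 - b_3^2}\right) \times |\boldsymbol{\Sigma}_0|^{1/2}$$

where $b_1 = (n-1)^{-m}\prod_{i=1}^{m}(n-i)$, $b_3 = \left(\frac{2}{n-1}\right)^{m/2} \times \frac{\Gamma(n/2)}{\Gamma((n-m)/2)}$, and $L$ is a weight, typically set to 3. However, this type of model does not usually guarantee a confidence region for the variance-covariance matrix.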
• In addition to the control charts for process variability, other statistics are also available (S. Bersimis, J. Panaretos, & S. Psarakis, 2005): (i) the determinant of the sample generalized variance-covariance matrix, $|\mathbf{S}|$, and (ii) the trace of the sample generalized variance-covariance matrix, $\mathrm{tr}(\mathbf{S})$, where $\mathbf{S}$ is the sample generalized variance-covariance matrix.
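The control limits of the $|\mathbf{S}|^{1/2}$ chart described above can be evaluated numerically. The sketch below assumes that $b_1$ and $b_3$ take the forms $b_1 = (n-1)^{-m}\prod_{i=1}^{m}(n-i)$ and $b_3 = (2/(n-1))^{m/2}\,\Gamma(n/2)/\Gamma((n-m)/2)$. The function name and the example values of n, m, and $|\boldsymbol{\Sigma}_0|$ are hypothetical.

```python
import math

def sqrt_gv_limits(n, m, det_sigma0, L=3.0):
    """Return (LCL, UCL) for the |S|^(1/2) chart with subgroup size n and m variables."""
    if n <= m:
        raise ValueError("n must be larger than m")
    b1 = math.prod(n - i for i in range(1, m + 1)) / (n - 1) ** m
    b3 = (2.0 / (n - 1)) ** (m / 2) * math.gamma(n / 2) / math.gamma((n - m) / 2)
    half_width = L * math.sqrt(b1 - b3 ** 2)
    scale = math.sqrt(det_sigma0)  # |Sigma0|^(1/2)
    return (b3 - half_width) * scale, (b3 + half_width) * scale

# Hypothetical example: subgroups of 10 observations on 2 sensors, |Sigma0| = 1.
lcl, ucl = sqrt_gv_limits(n=10, m=2, det_sigma0=1.0)
```

For this example the lower limit is slightly negative. Since $|\mathbf{S}|^{1/2}$ cannot be negative, such a limit carries no information, and only the upper limit is effective.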
These multivariate Shewhart SPC models produce detection and prediction results by calculating a distance from the deviation of the current measurements only, so they easily detect a relatively large shift. However, they tend to be insensitive to small or subtle incremental or decremental changes in the signals of the target system (J. A. Romagnoli & A. Palazoglu, 2005). To make use of previous measurements as well, the multivariate cumulative sum (MCUSUM) and multivariate exponentially weighted moving average (MEWMA) control charts were developed.
• MCUSUM: The univariate cumulative sum (CUSUM) chart was originally devised for change point detection. Simply put, a detection result is determined by whether the cumulative sum of the values falls outside the predefined thresholds (called the upper and lower cumulative sums):

$$ts_{ij} = \max(0,\; ts_{i,j-1} + x_{ij} - w_i)$$

where $x_{ij}$ is the measurement of the ith sensor at the jth time point, $ts_{i0} = 0$, and $w_i$ is an assigned weight, such as a likelihood function. CUSUM can be extended to the multivariate case in two ways: (i) using multiple univariate CUSUM charts, or (ii) using the L2-norm of a vector of cumulative sums as the test statistic of the MCUSUM. In the first case, the total sum or the average of $ts_{ij}$ over all sensors i can be used. In the second case, $x_{ij}$ is replaced by $|\mathbf{x}_j| = \sqrt{x_{1j}^2 + x_{2j}^2 + \cdots + x_{ij}^2 + \cdots + x_{mj}^2}$.
For example, F. Attal, A. Boubezoul, L. Oukhellou, N. Cheifetz, and S. Espié (2014) aimed to detect a rider's fall from a motorcycle early, using the L2-norm type MCUSUM based on three accelerometer and three gyroscope sensor signals. The detection result from MCUSUM was more accurate than that of the individual L2-norm chart in terms of both early detection time and detection rate.
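The CUSUM recursion and its L2-norm variant can be sketched as follows. This is a minimal one-sided illustration, assuming a constant weight w and a single upper threshold. The function names, the threshold, and the example values are hypothetical and are not taken from the cited study.

```python
import math

def cusum(values, w):
    """One-sided CUSUM: ts_j = max(0, ts_{j-1} + x_j - w), starting from ts_0 = 0."""
    ts, path = 0.0, []
    for x in values:
        ts = max(0.0, ts + x - w)
        path.append(ts)
    return path

def l2_norm_cusum(vectors, w):
    """Apply the same recursion to the L2-norm of each multivariate measurement."""
    norms = [math.sqrt(sum(v * v for v in vec)) for vec in vectors]
    return cusum(norms, w)

def first_alarm(path, threshold):
    """Index of the first time point whose cumulative sum exceeds the threshold."""
    for j, ts in enumerate(path):
        if ts > threshold:
            return j
    return None

# Hypothetical single-sensor signal with an upward shift from the fourth point on.
path = cusum([0.1, 0.0, 0.2, 1.0, 1.2, 1.1], w=0.5)
alarm = first_alarm(path, threshold=1.0)
```

The cumulative sum stays at zero while the signal remains below w and grows only after the shift, which is why the chart responds to small but persistent changes.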
• MEWMA: This chart was proposed as a multivariate extension of the univariate EWMA control chart.

$$\mathbf{z}_j = \mathbf{R}\mathbf{x}_j + (\mathbf{I} - \mathbf{R})\mathbf{z}_{j-1}$$

where $\mathbf{R}$ is a diagonal matrix for exponential smoothing whose elements are constrained between 0 and 1. The initial value ($\mathbf{z}_0$) is usually set to the in-control mean vector of the target system. In addition, its variance-covariance matrix can also be evaluated (D. Moraes, F. Oliveira, & L. Duczmal, 2015). However, EWMA-type charts can violate the normality assumption more easily than CUSUM-type charts because of the weighted mean (A. Cinar & C. Undey, 1999).
To monitor a photovoltaic manufacturing process, the MEWMA chart was applied to two sensor signals in the form of time series, since the two signals are highly correlated (J. L. Coleman & J. Nickerson, 2005). The authors compared the MEWMA chart with univariate charts (individual SPC and EWMA charts of the sensor signal). Whereas the two univariate charts failed to detect a fault, the MEWMA chart detected it, although three seconds after the fault occurred.
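The MEWMA recursion $\mathbf{z}_j = \mathbf{R}\mathbf{x}_j + (\mathbf{I} - \mathbf{R})\mathbf{z}_{j-1}$ can be sketched as below. Because $\mathbf{R}$ is diagonal, the update reduces to an element-wise weighted average. The function name and the example values are hypothetical, and the sketch covers only the smoothing step, not the control limit of the chart.

```python
def mewma(observations, r, z0):
    """Return the smoothed vectors z_1, z_2, ... for R = diag(r) and initial vector z0."""
    if any(not 0.0 <= ri <= 1.0 for ri in r):
        raise ValueError("smoothing weights must lie between 0 and 1")
    z = list(z0)
    path = []
    for x in observations:
        # Element-wise form of z_j = R x_j + (I - R) z_{j-1} for a diagonal R.
        z = [ri * xi + (1.0 - ri) * zi for ri, xi, zi in zip(r, x, z)]
        path.append(z)
    return path

# Hypothetical example: two sensors, equal smoothing weights, in-control mean at the origin.
path = mewma([[2.0, 4.0], [2.0, 4.0]], r=[0.5, 0.5], z0=[0.0, 0.0])
```

Smaller weights give more influence to past measurements, which makes the chart more sensitive to small sustained shifts but slower to follow abrupt changes.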
MCUSUM and MEWMA were each applied to detect two different damages in the Z24 Bridge in Switzerland (J. Kullaa, 2003). Changes in four natural frequencies over time were analyzed, after mode pairing from four sensors as a pre-processing step. The author found that multivariate SPC charts usually gave more accurate detection results than univariate SPC charts, but not in the case of small damage detection.
The traditional test statistics above, such as the mean and variance, do not always adequately represent the system's faulty states. For example, P. Zantek, S. Li, and Y. Chen (2007) investigated the effect of the test statistic in MEWMA on fault detection. They selected three different test statistics: an estimated generalized least squares statistic, the mean value of four sensor signals, and the mean value of fifteen sensor signals. The dataset was collected from a manufacturing process for a two-panel workpiece assembly, and the MEWMA charts using the generalized least squares statistic gave better detection results than the charts using the other statistics.
Several studies have applied MCUSUM or MEWMA alone to detect or predict faults (D. Jung, D. Kang, J. Liu, & K. Lansey, 2013). However, CUSUM- and EWMA-type charts usually show inertia when a violation occurs in the direction opposite to the previous signal changes, so they cannot react quickly to the violation (F. Umit & A. Cigdem, 2000). Therefore, many researchers have sought new features that can explain a system's operational state through simple relationships, by analysis in the time and frequency domains.
Rather than the original measurements, autoregressive measurements have alternatively been applied to the MEWMA procedure (Z. Wang & K. C. G. Ong, 2009). After a revised measurement vector is generated from an autoregressive model and MEWMA, the final damage indicator is computed as the total number of outliers in the revised measurement vector. Multiple sensor signals were collected from a two-bay, two-story RC plane frame, and three faults of different severity were successfully analyzed with the proposed index.
Similarly, A. Messaoud, C. Weihs, and F. Hering (2008) developed a new test statistic for the MEWMA chart. It was called rank-based MEWMA, since it uses the sequential rank of a data depth. The data depth indicates how central a measurement is with respect to the reference distribution of the multivariate sensor signals. In that paper, the data depth of the residual vectors was calculated over time and then transformed into the sequential rank. The performance of rank-based MEWMA was validated using a dataset acquired from a real drilling process.
Instead of MCUSUM or MEWMA, simple control charts such as the X-bar chart and the individual chart have also been widely employed after developing relevant test statistics or features from the multivariate sensor signals. For example, a change point detection technique based on linear trends (the real-time slope statistic profile method) was proposed to detect faults of a heating zone in a chemical reactor (T. Vafeiadis et al., 2016a; T. Vafeiadis et al., 2016b). The authors devised a test statistic for the t-test (called a parametric linear trend), computed as the estimated generalized least squares divided by either the auto-covariance or the sample power spectrum. They detected negative and positive change points in each of the two sensor signals on the individual chart, and the points common to both signals were taken as the fault detection points. The performance was verified with three datasets which contain a sudden and an abrupt change in each signal.
R. K. Singleton, E. G. Strangas, and A. Aviyente (2014) proposed a Z-value based on principal eigenvectors to monitor the system's state via temperature and vibration sensor signals. The eigenvectors are computed from six features extracted from the time and frequency domains (i.e., entropy from two different frequency bands, and the mean, variance, skewness, and concentration of the decomposed time series at a specific frequency domain). Since the Z-value was designed to detect change points in multiple sensor signals, they interpreted occurrences of high Z-values as transitions of the system's state toward a worse condition.
An energy ratio was devised from a normalized summation of the residuals of an autoregressive model for fault detection and prediction in a gearbox (X. Zhang, J. Kang, J. Zhao, & D. Cao, 2013). According to the authors, a fault will occur in the gearbox when the proposed index increases. Although the autoregressive model-based energy ratio was applied to non-stationary time series data, the proposed index gave ambiguous diagnosis results (e.g., detection and isolation) when faults occurred in both the gear and the bearing.
Similarly, F. Wang, Y. Zhang, B. Zhang, and W. Su (2011) employed wavelet packet sample entropy to predict future fault occurrences in a rolling element bearing system. First, wavelet packet sample entropies were computed from the multiple vibration sensor signals decomposed at the pre-selected frequency band. The sample entropies were then decomposed by empirical mode decomposition, and the time series of the highest order was transformed into wavelet packet sample entropy again for the final prediction of the system's state. As the sample entropy decreased, the system's state became worse.
When multiple sensor signals carry conflicting or partial information, analyzing them directly can degrade the performance of fault detection and prediction. Therefore, several sensor data fusion methods have received attention for combining the relevant information in conflicting or partial sensor signals, at either the data level or the feature level, for fault detection and prediction (M. Dong & D. He, 2007).
For example, Y. Wang, F. Chu, Y. He, and D. Guo (2004) collected data from four sensors in an automobile and converted each sensor's data into a set of basic statistical features (e.g., oxygen data were converted into the maximum, minimum, and mean voltage, and ignition data into a puncture, a spark, and a minimum voltage). After the feature dimension of each sensor was reduced to two axes by the Karhunen-Loeve transform, Dempster-Shafer (D-S) evidence theory was applied sequentially to pairs of sensor data (three times in total) to detect and diagnose faults.
D-S evidence regression was also used for multi-step-ahead prediction of a machinery system's states (G. Niu & B.-S. Yang, 2009). After each sensor signal was refined individually by time series reconstruction techniques, iterated D-S evidence regression performed 100-step-ahead prediction in three procedures. First, a one-step-ahead prediction is made with a sliding time segment. Second, the one-step-ahead prediction is repeated 99 times using a mixture of actual and predicted measurements. Finally, the 100-step-ahead measurement is predicted using only the measurements predicted in the first and second procedures.
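The iterated scheme above, in which each one-step-ahead prediction is fed back into the sliding time segment, can be sketched generically. The one-step predictor used here is a plain window mean, chosen only as a placeholder. It is not D-S evidence regression, and all function names and values are hypothetical.

```python
def iterated_forecast(history, window, steps, one_step):
    """Repeat a one-step-ahead prediction, feeding each prediction back into the window."""
    if len(history) < window:
        raise ValueError("history must contain at least `window` measurements")
    segment = list(history[-window:])  # sliding time segment of actual measurements
    predictions = []
    for _ in range(steps):
        y_hat = one_step(segment)
        predictions.append(y_hat)
        # Slide the segment: drop the oldest value and append the new prediction,
        # so later steps rely increasingly on predicted measurements.
        segment = segment[1:] + [y_hat]
    return predictions

def window_mean(segment):
    """Placeholder one-step predictor (assumption): the mean of the current segment."""
    return sum(segment) / len(segment)

# Hypothetical example: three-step-ahead prediction from three measurements.
predictions = iterated_forecast([1.0, 2.0, 3.0], window=2, steps=3, one_step=window_mean)
```

Once the number of steps exceeds the window length, the segment contains predicted values only, which corresponds to the final procedure described above.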
Although each feature and method gave improved results, challenges remain in selecting appropriate regression types and features and in optimizing the required parameters (H. Motulsky, 1995). That is, the performance of these control charts is guaranteed only when the input sensor data follow the assumed distribution, which is difficult to ensure for high-dimensional time series data in particular (V. Chandola et al., 2009). For example, J. Coble and J. W. Hines (2009) investigated the optimal feature for predicting the system's health state with regard to monotonicity, prognosability, and trendability. They regarded the features with the highest values of the three proposed metrics as the most appropriate, and then applied a genetic algorithm (GA) based on a weighted sum of the three metrics to select the final optimal feature. However, the weights of the GA's fitness function still depended on the user's decision.
However, it is not simple to generalize these special cases to the overall fault detection and prediction problem. Since a model usually depends on the given historical data (J. Coble, P. Ramuhalli, L. J. Bond, J. Hines, & B. Ipadhyaya, 2015), specialized models tend to overfit the given training dataset. M. Shewhart (1991) stated that expert knowledge was used when selecting the types of control charts. M. M. Rahman, M. M. Islam, K. Murase, and X. Yao (2016) instead used an ensemble method to analyze the optimum number of past measurements for predicting the future system's state. Since the objective function of the optimization was a prediction result, it is not certain that the selected optimum number will remain valid after the prediction model changes. Therefore, as several studies have already addressed method, feature, and parameter selection for specific methods (G. Karakaya, S. Galelli, S. D. Ahipaşaoğlu, & R. Taormina, 2016; S. Li & D. Wei, 2014; G. A. Rovithakis, M. Maniadakis, & M. Zervakis, 2004; Z. Zhu, Y. S. Ong, & M. Dash, 2007), quantitative constraints or guidelines for applying a proposed method should be presented in detail for subsequent validation and verification. For example, GA has been employed to search for the globally optimal parameter values of several EWMA control charts (F. Aparisi & J. Carlos García-Díaz, 2004).