9
new general machine learning tool based on the structural risk minimization principle. It has received much consideration in recent times due to its high accuracy and good generalization capabilities. The use of SVM and its extension for the machine fault diagnosis were summarized in a review paper (Widodo and Yang, 2007).
In the review paper by Widodo and Yang (2007), they discussed how the SVM has gained popularity in the machine learning. They indicated that though numerous methods had been developed based on intelligent systems such as the ANN, fuzzy expert system, condition- based reasoning, and random forest. However, the use of SVM for the machine condition monitoring and fault diagnosis is still rare. The SVM has excellent performance in generalization so it can produce high accuracy in the classification for machine condition monitoring and diagnosis. Based on literatures until 2006, they concluded that the SVM in the machine condition monitoring and diagnosis field was tending to develop towards expertise- orientated and problem-oriented domain. Finally recommended, the ability to continually change and obtain a new novel idea for machine condition monitoring and diagnosis using SVM would be future works.
The above discussion has been related to different machine fault procedures and their trends based on available literatures. At the end the discussion is concluded with the use of the SVM in recent scenario for the fault detection and diagnostics. Though the SVM method has been used in many situations, however, still there is much more scope to explore its use for machine faults diagnostics.
representation of the fault in machine, the SVM recognizes these patterns. Usually, each fault produces special features that can be considered as patterns. So the main task of SVM here is to recognize and classify these patterns as accurate as possible related to faults. The SVM is motivated to represent patterns in a high dimension typically much higher than the original feature space. With an appropriate nonlinear mapping using a kernel function to a sufficiently high dimension, data from two or more categories can always be separated by a hyperplane (Duda, 2001).
The primary desire of combining neural networks with fuzzy systems is to extend knowledge based expert systems by an adaptive component which realizes learning like capability. The learning component has to facilitate the bi-directional knowledge exchange between the neural network and fuzzy system in order to enable a permanent adaptation and tuning of the rule base with less effort (Ayoubi and Isermann, 1997).
For a good classification, data pre-processing is a very important step. A good data pre- processing will reduce the noise in the data and retains as much information as possible (Bishop, 1995). From a clean data, features (condition indicators) can be calculated and considered as patterns for fault diagnosis purposes. In the classification process for the fault diagnosis, when the number of objects in the training set is too small for the number of feature used, most of classification procedures cannot find good classification boundaries. This is called curse of dimensionality (Duda, 2001). Moreover, by a good pre-processing, the number of features per object can be reduced such that the classification problem can be solved satisfactorily. The fault diagnosis routine that uses data represented as feature is called feature-based diagnostic. Recently, the feature-based diagnostic procedure has been employed for the fault diagnosis of machine. Researchers who used features method for the fault diagnosis are reported now.
11
Tax et al. (1999) employed the feature-based procedure from the power spectrum, envelope spectrum, autoregressive modelling, music spectrum and classical spectrum for the failure detection of a small submersible pump. They tried to find the best representation of data features such that the target class can best be distinguished from the outlier class. The support vector data description (SVDD) was proposed to accomplish their work for finding the smallest sphere containing all target data.
Other authors used statistical features based on moments, cumulants and other statistical features of the time data series and the spectral of vibration data for the fault detection are reported (Jack and Nandi, 2002; Samanta et al., 2003; Samanta, 2004; Xu and Wang, 2005;
Ren et al., 2005; Sugumaran et al., 2007; Hu et al., 2007). Yang et al. (2000, 2004, 2005, 2007) used statistical features of time and frequency domain for the fault detection in the rotating machinery and cavitations of the butterfly valve. In the case of induction motor, they acquired data of the vibration and the stator current signal. Yuan and Chu (2006, 2007) performed the fault diagnosis of the turbo-pump rotor using data features that were acquired from frequency bands of the secondary vibration signal. The frequency of secondary signal was divided into 9 bands then the frequency amplitudes on each band and their average value were calculated as features. Sun et al. (2004) employed statistical features, which comes from the acoustic emission signal for the wear detection in the machine tool. They also used the cutting parameter such as the cutting speed, depth of cut and feed rate as additional features.
Cho et al. (2005) carried out the tool break detection using features from cutting forces and power consumption in the end milling machine. The other application was reported by Han et al. (2004), and they conducted the hot spot detection in the power plant using features from the data temperature, which were acquired by thermocouple and infrared sensors. Moreover, Ramesh et al. (2003) conducted a prediction of the thermal error in machine tools using features from temperature sensors.
In the feature-based diagnosis process, after defining features (statistical features) from original data, a huge dimensionality problem of features occurs. It cannot be avoided because of not all features are useful and optimal for the classification process. The existence of irrelative features tends to degrade the performance of classifier. One of solutions to resolve this problem is by performing the feature extraction, which can extract optimal features and all at once reduce the dimensionality of features. Basically, the feature extraction means mapping process of data from higher dimension into low dimension space. Many methods have been proposed to perform dimensionality reduction using the linear and nonlinear techniques as reported by Fodor (2002). In the machine condition monitoring and fault diagnosis research area, the feature extraction using the component analysis was reported as follows by using linear methods: the principal component analysis (PCA) (Widodo et al., 2006, 2007; and Yuan and Chu, 2007), and using independent component analysis (ICA) (Widodo et al., 2006, 2007). Moreover, nonlinear feature extraction using the kernel PCA and kernel ICA was also performed by Widodo and Yang (2007). The other techniques called rough set theory (RST) was conducted for extracting optimal features and reduce dimension of features by Xu et al. (2005) and Fang (2006). In their research, RST was employed to pre- process the data for eliminating redundant information and reducing the sample dimension.
Moreover, some experts suggested use of the feature selection after defining features set from the original data. The techniques, which are addressed to the feature selection, are the genetic algorithm (GA) and the distance evaluation technique (DET). In machine fault diagnostics area, researchers who employed the GA technique were Jack and Nandi (2002), Samanta et al. (2003, 2004), and Li et al. (2005). In addition, Hu et al. (2007) and Yang et al. (2004, 2005) were reported successful performances of feature selections using the DET.
13
From aforementioned discussions, it can be observed that the SVM as the feature-based diagnosis method have been widely used in many applications of machine condition monitoring and fault diagnosis. Most of results from the feature-based technique have been relatively satisfactory according to papers reviewed. It means that the feature-based procedure is a recommended method when the recognition and classification process are performed. The research is still continuing on different machine components for its faults diagnosis using the SVM. On consideration of the focus of the present thesis, the review of literatures is concentrated only on the gearbox fault diagnosis in the following section.