Development of Machine Learning Toolbox


differently in step 2. After the data loading step, the user can plot the raw data by clicking the 'Time Plot' button. In the feature extraction step, hand-crafted features are selected based on expert knowledge. In this step, information about the loaded data, such as the sampling rate, RPM, and sample size, is entered. If the data is too high-dimensional, engineers can reduce the dimensionality of the data as much as they want in step 4 via PCA or Fisher discriminant analysis. In step 5, the classification model is trained by a classification algorithm. If the learning form is unsupervised learning, a clustering algorithm is applied before the classification algorithm. After all the steps are completed, the generated model is saved for later real-time classification.
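The overall workflow can be sketched in code. The following is a minimal illustration of the load–extract–reduce–train–save sequence using scikit-learn; the array sizes, the feature function, and the use of Python are assumptions for illustration, not the toolbox's actual implementation:

```python
import pickle

import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Step 1-2: simulated raw signal data, 100 samples x 2048 points (hypothetical sizes)
rng = np.random.default_rng(0)
raw = rng.normal(size=(100, 2048))
labels = rng.integers(0, 2, size=100)

def extract_features(x):
    # Step 3: hand-crafted time-domain features computed per sample
    return np.column_stack([x.mean(axis=1), x.std(axis=1), np.abs(x).max(axis=1)])

features = extract_features(raw)

# Steps 4-5: dimensionality reduction followed by a classifier, chained as one model
model = Pipeline([("pca", PCA(n_components=2)), ("clf", SVC())])
model.fit(features, labels)

# Final step: serialize the trained model so it can be loaded later
# for real-time classification
blob = pickle.dumps(model)
```

The pipeline object keeps the reduction and classification steps together, so the saved model applies the same transformation to new data at prediction time.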

Figure 21: GUI software

Data Type Selection

Machine learning algorithms are applied differently depending on the data type, which can be categorized as supervised or unsupervised [18]. In the first step of the machine learning toolbox, there are 'supervised' and 'unsupervised' buttons for selecting the data type, as shown in Figure 22. The user can change the learning form depending on the data type the user has.

Figure 22: Datatype selection buttons in GUI

In supervised learning, a category label for each pattern is provided in the training set, and the model is trained using these labels. In unsupervised learning, there is no category label; the system first forms clusters or groups of the input data and then trains the model [1]. Therefore, as shown in Figure 23, when the 'supervised' button is pressed, the button of the clustering part is disabled.

On the other hand, when the ‘unsupervised’ button is pressed, the classification part is deactivated, as shown in Figure 24.

Figure 23: Results when the ‘supervised’ button is pressed

Figure 24: Results when the ‘unsupervised’ button is pressed

Feature Extraction

Feature extraction transforms raw data into features that can be used as input for a machine learning algorithm. In most mechanical systems, the features consist of time-domain and frequency-domain features. The sampling rate is required to extract the frequency features, and in the case of rotating machinery, the RPM should also be entered. The data is segmented by the sample size, and features are calculated according to the selected feature algorithm. Users can select one or more original features. When signals are received through multiple channels, the feature extraction algorithm is applied to each channel. After selecting the desired features and pressing the 'Extraction' button, the extracted feature values are displayed on a graph (Figure 25). If the dimension of the selected feature vector is three or fewer, it is displayed; if it is four or more, the result cannot be displayed, as shown in Figure 26.
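The kind of time- and frequency-domain features described above can be sketched as follows. The sampling rate and the 60 Hz test signal are assumed values chosen for illustration:

```python
import numpy as np

fs = 12_000  # sampling rate in Hz (assumed value)
t = np.arange(0, 1, 1 / fs)
# Synthetic vibration-like signal: a 60 Hz sine plus light noise
signal = np.sin(2 * np.pi * 60 * t) + 0.1 * np.random.default_rng(0).normal(size=t.size)

# Time-domain features
rms = np.sqrt(np.mean(signal ** 2))
peak = np.max(np.abs(signal))
crest_factor = peak / rms
kurtosis = np.mean((signal - signal.mean()) ** 4) / signal.var() ** 2

# Frequency-domain feature: dominant frequency taken from the FFT magnitude
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(signal.size, d=1 / fs)
dominant_freq = freqs[np.argmax(spectrum)]
```

Each selected feature becomes one dimension of the feature vector, which is why the sampling rate must be entered before any frequency feature can be computed.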

Figure 25: Result of feature extraction (dimension < 4)

Figure 26: Result of feature extraction (dimension > 3)

When the 'correlation' button is pressed, users can see a correlation matrix plot of the selected features and check the correlation between them, as shown in Figure 27. This can later be used to remove highly correlated features.
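A correlation matrix of this kind can be computed directly from the feature columns. The example below is a sketch with synthetic features and an assumed redundancy threshold of 0.9:

```python
import numpy as np

rng = np.random.default_rng(0)
f1 = rng.normal(size=200)
f2 = 0.95 * f1 + 0.05 * rng.normal(size=200)  # nearly a copy of f1
f3 = rng.normal(size=200)                     # independent feature

features = np.column_stack([f1, f2, f3])
corr = np.corrcoef(features, rowvar=False)    # 3x3 correlation matrix

# Flag feature pairs whose absolute correlation exceeds a threshold,
# as candidates for removal before training
threshold = 0.9
upper = np.triu(np.abs(corr), k=1)            # upper triangle, diagonal excluded
redundant_pairs = np.argwhere(upper > threshold)
```

Here only the (f1, f2) pair is flagged, mirroring how the GUI's matrix plot lets the user spot which features carry duplicated information.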

Figure 27: Correlation Matrix

Dimension Reduction

If the dimension of the extracted features is high and they cannot be represented graphically, or if the statistical correlation between the features is large, the dimension of the features can be reduced without losing too much information. If you select a reduction algorithm after entering the target dimension, the reduction result is displayed as a graph, as shown in Figures 28 and 29. The developed GUI software provides two algorithms for dimension reduction: principal component analysis (PCA) and Fisher discriminant analysis.
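Both reduction methods are available in scikit-learn, which is used here as a stand-in for the toolbox's own implementation (Fisher discriminant analysis corresponds to scikit-learn's `LinearDiscriminantAnalysis`); the 6-dimensional, 3-class data set is synthetic:

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic 6-dimensional feature set with 3 classes
X, y = make_blobs(n_samples=300, n_features=6, centers=3, random_state=0)

# Unsupervised reduction: PCA keeps the directions of maximum variance
X_pca = PCA(n_components=2).fit_transform(X)

# Supervised reduction: Fisher/linear discriminant analysis uses the labels
# to find directions that best separate the classes
X_fda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)
```

Note the practical difference: PCA needs no labels, while Fisher discriminant analysis requires them and can produce at most (number of classes - 1) output dimensions.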

Figure 28: Result of dimension reduction (Fisher discriminant analysis)

Figure 29: Result of dimension reduction (principal component analysis)

In addition, when the dimension is reduced through PCA, the importance of the newly defined coordinates and the coefficient of each feature can be examined (Figure 30).
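In scikit-learn terms, these two quantities correspond to the explained-variance ratios and the component loadings; the sketch below uses a synthetic feature matrix with deliberately unequal variances:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Four features with unequal variances (scales are assumed for illustration)
X = rng.normal(size=(200, 4)) * np.array([5.0, 2.0, 1.0, 0.5])

pca = PCA().fit(X)

# Importance of each newly defined coordinate:
# the fraction of total variance each principal component explains
importance = pca.explained_variance_ratio_

# Coefficient of each original feature in every principal component
coefficients = pca.components_
```

The first entry of `importance` dominates because the first feature carries most of the variance, which is exactly the information the GUI exposes in Figure 30 to help decide how many components to keep.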

Figure 30: Result of principal component analysis

Clustering and Classification

In the case of classification, the process depends on the data type. If the learning form is unsupervised learning, a clustering step is required before choosing a classification algorithm. When a clustering algorithm is applied, it automatically divides the data into groups and graphs the result according to the learned labels. After the data is divided into groups, a classification model can be generated by selecting an appropriate classification algorithm. On the other hand, since the data is already grouped in the case of supervised learning, the classification model is generated by the classification algorithm without a clustering step.

Unsupervised learning data without labels is shown in black, as in Figures 31 and 32. Applying the k-means algorithm to the unlabeled data yields the result shown in Figure 31; data assigned to the same group is drawn in the same color.

Figure 31: Result of clustering (K-means)

In the case of SOM, the result is shown in Figure 32. The purple hexagons mark the locations of the nodes, and the distance between two neighboring hexagons is represented by the color of the diamond-shaped region between them, from yellow to black. The closer the color is to yellow, the more similar the data; the closer it is to black, the more likely the two data are different. Data connected by yellow regions can be regarded as one group.
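The mechanism behind that plot can be illustrated with a minimal self-organizing map written from scratch; this is a sketch only (a 1-D node chain, assumed learning-rate and neighborhood schedules), not the toolbox's SOM:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated 2-D clusters, standing in for extracted features
data = np.vstack([rng.normal(0, 0.3, size=(50, 2)),
                  rng.normal(5, 0.3, size=(50, 2))])

# 1-D SOM with 6 nodes; weights initialized from random data samples
n_nodes = 6
weights = data[rng.choice(len(data), n_nodes)].astype(float)

for epoch in range(50):
    lr = 0.5 * (1 - epoch / 50)                  # decaying learning rate
    radius = max(1.0, 3.0 * (1 - epoch / 50))    # shrinking neighborhood
    for x in data[rng.permutation(len(data))]:
        bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # best-matching unit
        dist = np.abs(np.arange(n_nodes) - bmu)               # distance on the grid
        h = np.exp(-(dist ** 2) / (2 * radius ** 2))          # neighborhood kernel
        weights += lr * h[:, None] * (x - weights)

# Analogue of the neighbor-distance colors in Figure 32: the gap between
# adjacent nodes; a large gap (dark color) marks a boundary between groups
gaps = np.linalg.norm(np.diff(weights, axis=0), axis=1)
```

After training, nodes settle near the data and the largest inter-node gap falls between the two clusters, which is what the yellow-to-black coloring visualizes.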

Figure 32: Result of clustering (SOM)

In the case of the supervised learning type, or after clustering, classification is applied. Figures 33, 34, and 35 show the results of model generation by applying the SVM, neural network, and perceptron algorithms to the same data, respectively. For the perceptron, the decision boundary is represented by a single line. In the case of the SVM and the neural network, the decision boundary can be identified by checking the grouping of the surrounding data.
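Training the three classifiers on one data set can be sketched as follows; the two-moons data and the network size are assumptions chosen to make the linear-versus-nonlinear boundary difference visible:

```python
from sklearn.datasets import make_moons
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# A 2-D data set that a single straight line cannot separate
X, y = make_moons(n_samples=200, noise=0.15, random_state=0)

models = {
    "svm": SVC(kernel="rbf").fit(X, y),
    "neural_net": MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000,
                                random_state=0).fit(X, y),
    # a perceptron can only draw a single linear decision boundary
    "perceptron": Perceptron(random_state=0).fit(X, y),
}
scores = {name: m.score(X, y) for name, m in models.items()}
```

On this data the SVM and the neural network can bend their boundaries around the moons, while the perceptron's single line leaves part of each class misclassified.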

For the SVM, the number of support vectors of the trained model is also displayed as text in the GUI, as shown in Figure 33. A large number of support vectors indicates that there is a lot of data in the margin near the decision boundary, which means the model does not separate the classes well. Conversely, if the number of support vectors is small, the model can be regarded as a good classifier. Therefore, the number of support vectors can be treated as an indicator of how well the classification is performed. You can use this indicator to see how the performance of the classification model changes as you use other features or other dimension reduction algorithms.

Figure 33: Result of classification (Support Vector Machine)
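The support-vector count the GUI displays is the quantity scikit-learn exposes as `n_support_`; the sketch below contrasts an easy and a hard problem (the cluster positions and spreads are assumed values):

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated classes: an easy problem needs few support vectors
X_easy, y_easy = make_blobs(n_samples=200, centers=[[0, 0], [6, 6]],
                            cluster_std=0.8, random_state=0)
# The same size data with heavy class overlap: many points end up
# inside the margin, so many more support vectors are needed
X_hard, y_hard = make_blobs(n_samples=200, centers=[[0, 0], [1, 1]],
                            cluster_std=1.5, random_state=0)

easy_sv = SVC(kernel="linear").fit(X_easy, y_easy).n_support_.sum()
hard_sv = SVC(kernel="linear").fit(X_hard, y_hard).n_support_.sum()
```

Comparing these counts across different feature sets or reduction algorithms gives exactly the kind of quick quality check the paragraph above describes.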

Figure 34: Result of classification (Neural Network)

Figure 35: Result of classification (perceptron)
