
Fast Algorithm Development for SVD: Applications in Pattern Matching and Fault Diagnosis


Academic year: 2023




The historical database contains 4.32 million measurements (the last 8 years of data) for each of the 52 variables. A simplified form of the standard PCA similarity factor is proposed without any compromise in accuracy.

Figure 1.1: Fault Diagnosis Methods

Previous work

Pre-processing of dataset

Computation of PCA

Singular value Decomposition

To improve execution time and reduce storage, the zero singular values in the diagonal matrix Σ, along with the corresponding columns of U that multiply these zeros, are removed.
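This pruning is the familiar "economy" SVD. A minimal sketch in Python (illustrative only, not the thesis code; plain nested lists stand in for matrices, with U of size m×m, s the diagonal of Σ, and Vt of size n×n):

```python
def truncate_svd(U, s, Vt, tol=1e-12):
    """Keep only singular values above tol, with their singular vectors."""
    keep = [i for i, sv in enumerate(s) if sv > tol]
    U_r = [[row[i] for i in keep] for row in U]   # m x r: drop columns of U
    s_r = [s[i] for i in keep]                    # length r
    Vt_r = [Vt[i] for i in keep]                  # r x n: drop rows of V^T
    return U_r, s_r, Vt_r

# Example: the third singular value is exactly zero, so one column of U
# and one row of V^T are discarded.
U_r, s_r, Vt_r = truncate_svd(
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    [2.0, 1.0, 0.0],
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
```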

Eigenvalue Decomposition

Similarity factors

  • Standard PCA similarity factor
  • Modified PCA similarity factor
  • Distance similarity factor
  • Dissimilarity factor

The distance similarity factor is defined as the probability that the center x̄H of the historical data set lies at least a distance φ from the snapshot data set. A dissimilarity factor D was also proposed for comparing two data sets. It has been shown that the eigenvectors of the transformed data sets are the same and that the corresponding eigenvalues are related.
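As an illustration, the standard PCA similarity factor can be sketched in Python, assuming the usual definition S_PCA = (1/k) Σ cos²θij, where θij is the angle between the i-th and j-th principal components of the two data sets (a toy sketch, not the thesis implementation):

```python
def pca_similarity(U1, U2):
    """Standard PCA similarity factor between two sets of k principal
    components, each given as a list of k orthonormal direction vectors."""
    k = len(U1)
    total = 0.0
    for u in U1:
        for v in U2:
            dot = sum(a * b for a, b in zip(u, v))
            total += dot * dot   # cos^2 of the angle between unit vectors
    return total / k

s_same = pca_similarity([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
                        [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])   # identical subspaces
s_orth = pca_similarity([[1.0, 0.0, 0.0]], [[0.0, 1.0, 0.0]]) # orthogonal subspaces
```

Identical subspaces give a factor of 1, orthogonal ones 0, matching the interpretation of the dissimilarity factor D in the text.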

Table 1.3: Computation time for Standard PCA similarity factor OpID k G.I:eq(1.4)

Metrics for Pattern Matching

Pool Accuracy

If two data sets are different, the dissimilarity factor D will be close to 1, while it will be close to 0 if they are similar.

Pattern Matching Efficiency

Tennessee Eastman Challenge Process

Introduction

Proposed Pattern matching approach

  • Selection of window size
  • Pre-computing of Historical Database
  • Clustering
  • Overview of Historical Data

The window size should be chosen so that it captures the principal components well, giving a high similarity factor between similar operating conditions. Since the historical database is fixed, all calculations on it, such as pre-processing and SVD, remain the same; only the similarity factor needs to be recomputed when the snapshot window changes.

Figure 1.2: Schematic layout of Tennessee Eastman Challenge Process

Results for pattern matching

Methodology for Pattern Matching

An empty cell indicates that the fault was not detected in any of the 10%, 50%, and 100% fault samples. We relax the condition for fault detection and diagnosis so that a fault must be correctly classified at least once in one of the scenarios with 10%, 50%, and 100% fault samples. No improvement is seen in reducing misclassifications for Op IDs 9 and 19, even with windows as large as 2000.

For Op ID 20, whose fault has the smallest maximum duration of 8 hours, the fault was correctly classified in the 50% and 100% fault snapshots in all cases; the proposed method therefore detects it before its maximum duration, so no catastrophic damage results. Similarly, the fault IDs are well classified by the time the fault has been simulated for 4 hours (50% of the snapshot), while Fault ID 16 is well classified only in the snapshot with fully faulted samples. The accuracy of the proposed fault detection and diagnosis was also assessed with snapshot data taken as a window of fully normal operation samples.

Larger jumps of the window are risky, as closely matched windows have been observed to reach a similarity factor of around 0.9 with as few as 50 samples.

Table 1.5: Results for snapshot as 10% fault samples for Instances I2-I7 Op ID I2 I3 I4 I5 I6 I7 Detected

Properties of SVD

Golub-Kahan-Reinsch SVD

  • Algorithm 1a
  • Householder Reflections
  • Givens Rotations
  • Algorithm 1b
  • Algorithms for QR decomposition
  • Bi-diagonalization of Triangular matrix
  • Diagonalization of Bi-diagonal matrix

Taking the transpose of the resulting decomposition of the matrix A1 gives the same decomposition as for A. The Matlab code for computing the SVD using Algorithm 1a can be found in svd 2a.m. Repeating this process for all columns yields a bidiagonal matrix B, which is used in Step 2. Step 3: the bidiagonal matrix B obtained in Step 2 is reduced to diagonal form using Givens rotations.
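The Givens rotation used in Step 3 zeroes one element at a time. A minimal sketch (illustrative Python, not the Matlab code referenced above):

```python
import math

def givens(a, b):
    """Return (c, s) such that [[c, s], [-s, c]] applied to (a, b)
    yields (r, 0), i.e. the rotation annihilates the second component."""
    if b == 0.0:
        return 1.0, 0.0
    r = math.hypot(a, b)
    return a / r, b / r

c, s = givens(3.0, 4.0)
r = c * 3.0 + s * 4.0        # rotated first component: the norm, 5.0
zero = -s * 3.0 + c * 4.0    # rotated second component: 0.0
```

Sweeping such rotations along the superdiagonal of B (with compensating rotations on the other side) is what drives the bidiagonal matrix toward diagonal form.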

The Matlab code for finding the QR decomposition using the Gram-Schmidt process can be found in ings qr.m. Householder reflectors are applied successively to each column so that only the first k elements of the k-th column are retained. The triangular matrix R obtained from the QR decomposition is made bi-diagonal by applying Householder reflectors to each row so that only the k-th and (k+1)-th elements of the k-th row are retained.
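A compact modified Gram-Schmidt QR sketch (illustrative Python; the thesis implementation is in Matlab):

```python
def gram_schmidt_qr(cols):
    """Modified Gram-Schmidt QR on a list of column vectors.
    Returns Q (orthonormal columns, as a list of columns) and R
    (upper triangular, as a nested list)."""
    n = len(cols)
    Q, R = [], [[0.0] * n for _ in range(n)]
    for j, v in enumerate(cols):
        v = list(v)
        for i, q in enumerate(Q):
            R[i][j] = sum(a * b for a, b in zip(q, v))   # projection coefficient
            v = [a - R[i][j] * b for a, b in zip(v, q)]  # remove that component
        R[j][j] = sum(a * a for a in v) ** 0.5           # norm of the residual
        Q.append([a / R[j][j] for a in v])
    return Q, R

# Columns of the matrix [[1, 1], [0, 1]]:
Q, R = gram_schmidt_qr([[1.0, 0.0], [1.0, 1.0]])
```

The modified variant (subtracting projections one at a time) is numerically steadier than classical Gram-Schmidt, though Householder QR remains the more robust choice.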

The bi-diagonal matrix B obtained in Step-2 of Algorithm 1b is diagonalized using Givens rotation as discussed in Algorithm 1a.

Figure 2.2: Geometrical Interpretation of Householder reflection: H 0

Multiple Relatively Robust Representation (MRRR) algorithm

Divide and conquer algorithm

Suppose that a dense matrix of dimension m×n has been reduced to a lower bi-diagonal matrix of size (n+1)×n. The lower bidiagonal matrix is then divided into two submatrices B1 and B2 of dimensions k×(k−1) and (n−k+1)×(n−k), respectively. For example, with k = 3, the matrix B is divided into B1 and B2 in this way.

Then, the SVD of these submatrices B1 and B2 is computed using standard algorithms, or they can be recursively partitioned further until the subproblems are computationally cheap compared to the initial dimension. The non-zero block in the middle matrix is called M and is used to find the singular values. This matrix has a special structure, with non-zero entries only along the diagonal and in one additional row.

Solving the roots of the secular equation by nature-inspired methods such as the firefly algorithm, as described in [11], was also considered, but since these metaheuristic methods are time-consuming, they were ruled out as unsuitable for our time-constrained problem.

Bi-section and Inverse iteration

Roots are generally found using the Newton-Raphson method, which requires O(m) operations per root and thus O(m²) time for an m×m matrix.
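Bisection trades some speed for robustness: it cannot overshoot a pole the way Newton-Raphson can. A generic sketch, applied to a toy secular-style function (both the function and the bracket are hypothetical, for illustration only):

```python
def bisect(f, lo, hi, tol=1e-12):
    """Bisection root finder; assumes f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if flo * f(mid) <= 0.0:   # root lies in the left half
            hi = mid
        else:                     # root lies in the right half
            lo, flo = mid, f(mid)
    return 0.5 * (lo + hi)

# Toy secular-style equation f(x) = 1 + 1/(1-x) + 1/(2-x): it has exactly
# one root strictly between its poles at x = 1 and x = 2.
root = bisect(lambda x: 1.0 + 1.0 / (1.0 - x) + 1.0 / (2.0 - x),
              1.000001, 1.999999)
```

Bracketing each root between consecutive poles is exactly what makes bisection attractive for secular equations: one guaranteed root per interval.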

Hybrid methods for SVD

Introduction

In this section, we use a historical dataset with snapshot windows (100% fault samples) from fault operations 1, 2, and 3. We plot the difference between the similarity factor calculated by the different hybrid algorithms and the one obtained with the Matlab svd command, since they mostly differ only in the 2nd or 3rd decimal place.

Householder Bi-diagonalization followed by Jacobi

QR decomposition followed by Jacobi

Selection of an SVD algorithm for Pattern matching

The bidiagonal SVD methods are very accurate but suffer from high latency due to large dense matrix multiplications. Even the Matlab implementation of SVD is known to use the divide-and-conquer bidiagonal SVD. From inspection of the similarity factors calculated in Matlab, it was observed that accuracy to 3 decimal places is needed for better pattern matching, to keep misclassifications under control.

This problem can also be avoided by proposing a new similarity factor that discriminates better among all operating conditions, in which case the accuracy of the similarity factor may be slightly relaxed. Householder bidiagonalization followed by 6 Jacobi sweeps gave better accuracy, up to 4 decimal places, so that algorithm can be chosen when both precision and accuracy are required, i.e., when the similarity factor approaches a threshold above 0.7–0.8. In the remaining cases, where the similarity factor is below the threshold, the Jacobi algorithm can be chosen as the base algorithm.

Since it uses simple shift and addition/subtraction operations, the design becomes simpler, more reliable, and faster, without the need for multipliers or lookup tables to compute trigonometric functions.

Figure 2.4: Comparison of various algorithms with snapshot as IDV1

Modes of CORDIC

Rotation mode

For counterclockwise rotation, the rotation matrix is obtained by replacing θ with −θ, which simply changes the sign of the sine terms in the clockwise rotation matrix. A rotation matrix can be decomposed into simple shift and add/subtract operations. The tan θi term in the above equation can be replaced by the known micro-rotations with θi = arctan(2^-i), as given in Table 3.1.

The terms 2^-i are easily implemented in hardware, as they correspond to a right-shift operation. The scaling factor in equation 3.3 need not be computed at each iteration, since the product converges to 0.6073 after a finite number of iterations; a single final scaling can therefore be applied at the end. Using equations 3.1 and 3.3, the rotational outputs for clockwise rotation can then be expressed accordingly.
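The rotation mode described above can be sketched in floating-point Python (a real hardware design would use fixed-point right shifts; the 16-iteration depth here is an arbitrary choice):

```python
import math

# Micro-rotation angles theta_i = arctan(2^-i) and the accumulated
# scaling constant, which converges to ~0.6073.
ANGLES = [math.atan(2.0 ** -i) for i in range(16)]
K = 1.0
for i in range(16):
    K /= math.sqrt(1.0 + 2.0 ** (-2 * i))

def cordic_rotate(x, y, z):
    """Rotate (x, y) counterclockwise by angle z (radians), CORDIC-style."""
    for i, a in enumerate(ANGLES):
        d = 1.0 if z >= 0.0 else -1.0
        # 2.0 ** -i models a right shift by i bits in hardware
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * a
    return x * K, y * K   # single final scaling, as in the text

cx, cy = cordic_rotate(1.0, 0.0, math.pi / 6)   # ~ (cos 30deg, sin 30deg)
```

Each iteration uses only shifts, adds, and a sign test, which is the whole appeal of CORDIC for hardware.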

Vectoring mode

To overcome CORDIC's limited convergence range of ±90°, an initial ±90° rotation is performed to extend the range to ±180°. Using the quadrant detector shown in Table 3.2, the quadrant information is stored in a two-bit register quad, which is then used to transform the input vector.
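A matching sketch of the vectoring mode, which drives y to zero and accumulates the rotated angle (floating-point Python for illustration; the quadrant pre-rotation of Table 3.2 is omitted here, so inputs are assumed to have x > 0):

```python
import math

ANGLES = [math.atan(2.0 ** -i) for i in range(16)]
K = math.prod(1.0 / math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(16))

def cordic_vector(x, y):
    """Vectoring mode: drive y to zero; return (magnitude, angle)."""
    z = 0.0
    for i, a in enumerate(ANGLES):
        d = -1.0 if y >= 0.0 else 1.0   # rotate toward the x-axis
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * a
    return x * K, z

mag, ang = cordic_vector(3.0, 4.0)   # ~ (5.0, atan(4/3))
```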

Pipelined CORDIC

Jacobi methods are quite popular in hardware implementations of SVD because the computation can be grouped in parallel. Various systolic array implementations based on Jacobi methods have been reported in the literature for hardware acceleration of SVD. Based on the number of angles required to eliminate off-diagonal elements, Jacobi methods are broadly classified as one-sided and two-sided Jacobi methods.

Classical Jacobi Algorithm

Two-sided Jacobi Method

Shuffling rotations

In the previous section, the elements of the input matrix are shuffled according to the ordering scheme (row/parallel) in each sweep. Instead, the rotation matrices can be shuffled according to the same scheme, leaving the input matrix unchanged.

One-sided Jacobi Method

Systolic array implementation

Row ordering

From the above sequence it can be seen that all pairs (p, q) satisfying p < q are traversed row by row, hence the name row ordering.

Parallel ordering

After a series of such sweeps (typically 5–7), the off-diagonal elements, denoted by ε, become small enough for the matrix to be considered diagonal. One has to be very careful when designing the basic building blocks (functions), because every clock cycle saved results in a huge overall saving, as the same block is instantiated millions of times with different inputs.
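A round-robin generator for one sweep of parallel ordering can be sketched as follows (an illustrative construction; the exact ordering used in the thesis may differ). Each of the n−1 rounds pairs all n indices, so the n/2 rotations in a round can execute in parallel:

```python
def parallel_orderings(n):
    """Round-robin pairing: n-1 rounds, each pairing all n indices once,
    so that n/2 plane rotations per round can run in parallel."""
    idx = list(range(n))
    rounds = []
    for _ in range(n - 1):
        rounds.append([(min(idx[i], idx[n - 1 - i]), max(idx[i], idx[n - 1 - i]))
                       for i in range(n // 2)])
        # hold the first index fixed and rotate the rest (round-robin)
        idx = [idx[0]] + [idx[-1]] + idx[1:-1]
    return rounds

rounds = parallel_orderings(4)
# 3 rounds of 2 disjoint pairs covering all C(4,2) = 6 index pairs
```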

Figure 4.2: Flow chart for Parallel ordering

Previous work in Hardware implementations of SVD

Reading of historical window

Pre-processing

Recursive Mean and Standard deviation computation

Two-sided Jacobi algorithm

Sorting

Calculation of similarity factors

Working with Vivado HLS

HLS with precomputed dataset

From the flow chart it can be observed that the design process iterates until it meets the user's specifications. Further partitioning of the design between hardware and software should be done to achieve better speed; the speed-up increased with the number of windows, and with a large number (lakhs) of executions, performance is expected to compare favorably with the Matlab implementation. The window size and other manually selectable parameters can be rounded to the nearest power of 2, since multiplication and division by such parameters can then be done with simple shift operators instead of complex DSP48E multiplier blocks.
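For example, with a hypothetical window size of 256 (= 2^8), the division in a running-mean computation reduces to a right shift, avoiding a DSP multiplier block entirely:

```python
# Hypothetical power-of-two window size: dividing a running sum by 256
# is a right shift by 8 bits for non-negative integers.
window = 256                 # 2 ** 8
total = 10240                # running sum over one window
mean_shift = total >> 8      # same result as total // 256
assert mean_shift == total // window
```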

A new hybrid method for finding the SVD of bi-diagonal or tri-diagonal matrices that exploits their sparsity should be developed, since dense matrices can be reduced to bi-diagonal or tri-diagonal form with Householder reflections using simple shift operations. Other data-driven methods, namely Correspondence Analysis and Independent Component Analysis (ICA), which also use SVD, can be applied to evaluate performance. If the data is properly labeled, a form of supervised clustering can be done, i.e., samples belonging to the same operation can be grouped together.

Once a fault has been detected with these statistical tests, its confirmation and diagnosis can be achieved using the proposed moving-window approach, which then runs through the original non-clustered historical database.

Figure 5.1: Design flow with SDSoC

