Investigating Low-complexity Architectural Issues under UBSS

Our project goal is to develop a real-time chip that processes sensor signals and separates the source signals, for use in healthcare applications such as those related to autism. Real-time UBSS and SCICA problems require a digital chip that separates the sources in real time. The chip must therefore be high-speed, so that it suits real-time applications, and reconfigurable, so that it can serve different applications in which the frame length of the signals differs.

Thus, we propose a high-speed and reconfigurable Discrete Hilbert Transform (DHT) architecture design methodology targeting real-time applications, including cyber-physical systems, the Internet of Things, and remote health monitoring, where the same chipset needs to serve multiple purposes under real-time constraints. In these emerging fields there is also a need to separate signals from a composite signal in a way that meets real-time requirements without placing a significant burden on the available resources.

Recently, a systolic-array-based reconfigurable architecture was proposed in [8], but its reconfigurability comes at the cost of a high processing time, which makes it unsuitable for real-time applications. This motivates us to propose a high-speed and reconfigurable DHT architecture design methodology mainly targeting real-time applications, in which the same chipset can be used for different purposes depending on the application. Likewise, 3D-SCICA does not serve the ND-SCICA case, which is the one that is useful in real-time applications.

The main challenge is to design a DHT architecture that is reconfigurable for different frame lengths and is also fast enough for real-time applications such as biomedical signal processing.

Proposed Methodology

From (3.1) and (3.2) we can formulate a generalized formula for an M-point DHT, where M is an even number of samples/points. Physically, N (a multiple of 4, as shown in (3.4) and discussed in detail in Section IV) is a chip parameter, known as the kernel, that the designer fixes during chip design. M, on the other hand, may vary from one application to another, but it can be realized on the same chip with a fixed N, which is how the proposed methodology achieves both reconfigurability and high speed.

For example, an N=8-point kernel (a multiple of 4) fixed on a chip can implement an M=512-point UBSS system for a speech processing application, and the same chip can also be used for UBSS with M=4096 points for medical applications. The kernel is the multiplication of two matrices of order N×N and N×1, which gives a matrix of order N×1. From (3.7) we can therefore conclude that the DHT of M given samples can be calculated with this fixed kernel by using it (M/N)^2 times.
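The kernel reuse described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the coefficient formula in `dht_coeff` is a commonly used DHT impulse response for even M and is assumed here, since equations (3.1) to (3.7) are not reproduced in this text, and the names `dht_coeff`, `k_entry`, `kernel`, `dht_with_kernel` and `dht_direct` are hypothetical. The sketch counts how often the N-point kernel is invoked for an M-point transform.

```python
import math

def dht_coeff(n, M):
    # ASSUMPTION (not taken from the source equations):
    # h[n] = (2/M) * cot(pi*n/M) for odd n, and 0 for even n.
    if n % 2 == 0:
        return 0.0
    return (2.0 / M) / math.tan(math.pi * n / M)

def k_entry(i, j, M):
    # Circulant structure: each row is the previous row shifted right by one.
    return dht_coeff((i - j) % M, M)

def kernel(K_sub, x_sub):
    # The fixed N-point kernel: (N x N) times (N x 1) gives (N x 1).
    return [sum(a * b for a, b in zip(row, x_sub)) for row in K_sub]

def dht_with_kernel(x, N):
    # Compute an M-point transform using only the N-point kernel.
    M = len(x)
    if M % N != 0:
        raise ValueError("M must be a multiple of N")
    blocks = M // N
    y = [0.0] * M
    kernel_uses = 0
    for bi in range(blocks):          # block row of K
        for bj in range(blocks):      # block column of K
            K_sub = [[k_entry(bi * N + r, bj * N + c, M) for c in range(N)]
                     for r in range(N)]
            partial = kernel(K_sub, x[bj * N:(bj + 1) * N])
            kernel_uses += 1
            for r in range(N):        # accumulate partial results
                y[bi * N + r] += partial[r]
    return y, kernel_uses

def dht_direct(x):
    # Reference: full M x M matrix-vector product, for comparison only.
    M = len(x)
    return [sum(k_entry(i, j, M) * x[j] for j in range(M)) for i in range(M)]
```

For M=16 and N=4, `dht_with_kernel` calls the kernel 16 times, that is (M/N)^2, and its output agrees with the direct M×M product up to floating-point rounding.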

So, for M given samples, we need to reuse our only resource, the N-sample kernel, appropriately. In the matrix K in (3.5), each row (except the first) is a one-element right-circular shift of the previous row, as shown in Figure 3.1. Consequently, all elements along any axis parallel to the main diagonal are equal, and the elements on alternate axes are zero. The array contains a total of (M/4)×4 = M elements, all of which are the first elements of the alternate axes parallel to the main diagonal, taken from the upper right side of the K matrix to the lower left side.

Similarly, we can generate the elements of the sub-matrices in (3.6); each sub-matrix Ki of the (M/N) matrices has N such elements. So, for an M-sample DHT using an N-sample kernel, all elements of the sub-matrices Ki in (3.6) must be generated from M/4 constants, si. The Ki matrices can be generated from Kseti, since Kseti is the set of first elements of all axes parallel to the main diagonal, excluding the alternate axes whose elements are zero.
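The idea of rebuilding any sub-matrix from only M/4 stored constants can be illustrated with a small Python sketch. It assumes, since (3.5) and (3.6) are not reproduced in this text, that the non-zero coefficients are (2/M)·cot(πd/M) for odd offsets d, so that the coefficient at offset M−d is the negative of the one at offset d. The names `unique_constants`, `k_element` and `submatrix` are hypothetical, and the indexing of the constants does not necessarily match the ordering of Kseti in the original.

```python
import math

def unique_constants(M):
    # ASSUMPTION: s_i = (2/M) * cot(pi*(2i+1)/M), i = 0 .. M/4 - 1,
    # with M a multiple of 4.
    return [(2.0 / M) / math.tan(math.pi * (2 * i + 1) / M)
            for i in range(M // 4)]

def k_element(i, j, s, M):
    # Element (i, j) of the circulant matrix K, rebuilt from the constants s.
    d = (i - j) % M                   # which diagonal the element lies on
    if d % 2 == 0:
        return 0.0                    # alternate diagonals are zero
    if d < M // 2:
        return s[(d - 1) // 2]
    return -s[(M - d - 1) // 2]       # odd symmetry of the coefficients

def submatrix(bi, bj, s, M, N):
    # N x N sub-matrix at block position (bi, bj).
    return [[k_element(bi * N + r, bj * N + c, s, M) for c in range(N)]
            for r in range(N)]
```

For M=8 only two constants are stored, and every element of every 4×4 sub-matrix is either zero, one of those constants, or its negative.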

Now, for K1 and K2, we need to find the set of elements Kset, as in (11), which are the first elements of the alternate axes parallel to the main diagonal in the matrix K in (3.11). We can observe that Kset1 and Kset2 in (3.14) are the sets of first elements of the alternate axes parallel to the main diagonal, from the upper right side to the lower left side, of the matrices K1 and K2 in (3.15) and (3.16). It is worth noting that the thrust of this thesis is a High-Speed On-Chip Reconfigurable DHT Architecture Design Methodology for Real-Time Signal Processing; the calculations related to the inverse DHT and the corresponding inverse matrix are therefore out of scope.

Figure 3.1: Order of elements in K-Matrix for M=8

Results and Discussions

We synthesized the proposed architecture for N=4 and Mmax=1024 using the Cadence RTL Compiler with UMC 90 nm technology at a frequency of 1 MHz, for illustration purposes. The same architecture can, however, be synthesized with different technology libraries and at different frequencies on any hardware or embedded platform. We also compared the speedup, in terms of the number of clocks, with [8]. The proposed design achieves twice the speed of the latest systolic-array-based architecture, which is itself better than or comparable to the state-of-the-art techniques mentioned in Section I and which requires the kernel 2×(M/N)^2 times.
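As a rough arithmetic illustration of this comparison, the following Python sketch counts kernel uses for the synthesized configuration. It assumes the counts are (M/N)^2 for the proposed design and 2×(M/N)^2 for [8], and that one kernel use costs the same number of clocks in both designs. The function name `kernel_uses` and its `factor` parameter are hypothetical, and the numbers are counts only, not synthesis or timing results.

```python
def kernel_uses(M, N, factor=1):
    # Number of N-point kernel uses for an M-point DHT.
    # factor=1: proposed design; factor=2: assumed count for [8].
    if M % N != 0:
        raise ValueError("M must be a multiple of N")
    return factor * (M // N) ** 2

proposed = kernel_uses(1024, 4)             # (1024/4)^2
reference = kernel_uses(1024, 4, factor=2)  # 2 * (1024/4)^2
ratio = reference / proposed
```

With N=4 and M=1024 this gives 65536 kernel uses against 131072, a ratio of 2.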

Note that the number of clocks shown in Figure 3.3(a) indicates the time required for completion. Since different architectures can perform a different number of calculations per clock cycle, we considered the number of clocks needed for an M-point DHT calculation rather than the individual calculations required in the DHT process.

Figure 3.3: (a) Comparison of the processing speed of the proposed architecture with the state-of-the-art architecture [8]

Conclusion for DHT

ND-SCICA is an extreme case of UBSS in which only one mixture signal is available for separating, or finding, the N sources. Real-time applications such as protein spectral analysis need a digital chip that works in real-time scenarios. The proposed ND-SCICA architecture needs an ND-FastICA block that can run FastICA at different sizes, that is, an ND-FastICA block that is dynamically reconfigurable for different numbers of signals.

In the architectural design for the ND-SCICA problem, the main challenge lies in the design of the FastICA (fpica) block, because the ND-SCICA process delivers a varying number of signals as input to that block. The FastICA block should therefore be able to reconfigure itself according to the number of input signals.

Therefore, we propose a reconfigurable ND-FastICA block based on the COordinate Rotation DIgital Computer (CORDIC), building on the idea of static ND-FastICA proposed by Amit Acharyya et al.
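For readers unfamiliar with CORDIC, the two operating modes that such a block relies on can be sketched in Python as follows. This is a generic floating-point illustration of the textbook algorithm, not the fixed-point hardware of the proposed design. The iteration count and the names `cordic_rotate` and `cordic_vector` are assumptions made for this example.

```python
import math

ITERATIONS = 24  # assumed word length for this illustration
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))  # accumulated CORDIC gain

def cordic_rotate(x, y, theta):
    # Rotation mode: rotate (x, y) by theta using shift-and-add style steps.
    # Converges for |theta| up to roughly 1.74 rad.
    z = theta
    for i in range(ITERATIONS):
        d = 1.0 if z >= 0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ANGLES[i]
    return x / GAIN, y / GAIN

def cordic_vector(x, y):
    # Vectoring mode: drive y towards zero and accumulate the angle.
    # Returns (magnitude, angle); this sketch assumes x > 0.
    z = 0.0
    for i in range(ITERATIONS):
        d = -1.0 if y >= 0 else 1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ANGLES[i]
    return x / GAIN, z
```

Rotation mode applies a known angle to a vector, while vectoring mode recovers the magnitude and angle of a vector. Both use the same iteration structure, which is what makes a single reusable CORDIC unit attractive for a reconfigurable block.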

ND-FastICA

Our challenge is therefore to design an architecture that can be reconfigured so that it can solve equations 4.5 and 4.6 for different values of N and L. Find the N−1 θ terms for the vector taken in step 1 (for the first iteration) or from step 6 (from the second iteration onwards). In step 6, the estimator vector w(:, i) must be checked against the vector used in the previous iteration.

So, from the above steps, we can conclude that a single iteration uses rotation-mode CORDIC a total of (N−1)×(L+1) times and vectoring-mode CORDIC (N−1) times.

Two architectures are addressed in this work. The first is the high-speed reconfigurable discrete Hilbert transform used in the UBSS problem where the number of sensors is less than the number of sources. The other is the reconfigurable CORDIC-based FastICA algorithm used in the UBSS problem where there is only one sensor signal, which is also called the SCICA problem.
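The per-iteration CORDIC count given above, (N−1)×(L+1) rotation-mode and (N−1) vectoring-mode operations, can be illustrated with a short Python sketch. It shows one common way in which N−1 vectoring steps yield the N−1 θ terms of an N-component vector. `math.atan2` and `math.hypot` stand in for an idealized vectoring-mode CORDIC, the pairing order of the components is an assumption, and the names `angles_of_vector` and `cordic_ops_per_iteration` are hypothetical.

```python
import math

def angles_of_vector(w):
    # Reduce an N-component vector to (norm, 0, ..., 0) with N-1 vectoring
    # steps, collecting one angle per step.
    # ASSUMPTION: components are paired from the last index upwards.
    v = list(w)
    thetas = []
    for k in range(len(v) - 1, 0, -1):
        # Idealized vectoring mode on the pair (v[k-1], v[k]).
        thetas.append(math.atan2(v[k], v[k - 1]))
        v[k - 1] = math.hypot(v[k - 1], v[k])
        v[k] = 0.0
    return v[0], thetas

def cordic_ops_per_iteration(N, L):
    # Operation counts per iteration, as stated in the text.
    return {"rotation": (N - 1) * (L + 1), "vectoring": N - 1}
```

For the vector (1, 2, 2) the sketch performs two vectoring steps and returns a norm of 3. For N=4 and L=3 the counts are 12 rotation-mode and 3 vectoring-mode operations per iteration.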

[3] Shengli Xie; Liu Yang; Jun-Mei Yang; Guoxu Zhou; Yong Xiang, "Time-Frequency Approach to Underdetermined Blind Source Separation", Neural Networks and Learning Systems, IEEE Transactions on, vol.23, no.2, pp.306-316, February 2012.

[6] Ching-Tai Ng, "On the Selection of Advanced Signal Processing Techniques for Guided Wave Damage Identification Using a Statistical Approach", Engineering Structures, vol.67, pp.50-60, May 2014.

[8] Li Liu; Yan Zhang, "Design and implementation of reconfigurable discrete Hilbert transform based on systolic arrays", Mobile Communications and Computing (CMC), 2010 International Conference on, vol.1, pp.245-249, April 2010.

C., "Design of efficient digital FIR differentiators and Hilbert transformers for mid-band frequency ranges", International Journal of Circuit Theory and Applications, vol.17, no.4, p.

[12] Mukhopadhyay, S.; Bhattacharya, P.; Bhattacharjee, R.; Bose, P.K., "Discrete Hilbert Transform as a Minimum Phase Type Filter for Wind Speed Prediction and Characterization", Communications, Devices and Intelligent Systems (CODIS), 2012 International Conference on, pp.333-336, December.