

while modeling MIMO channels, which, to the best of our knowledge, has not been reported in the literature so far. Although such an application offers a certain diversity gain to support high data rate transmissions, it also poses the architectural challenge of cooperative learning among the blocks. We address this challenge with a framework supporting expandability of block-level CTDFRNN units whose responses are trimmed by SOM optimization, yielding reduced time delays and BER values. Multiple CTDFRNN units trained in a cooperative environment improve learning and the ability to handle time-varying inputs. An attempt is made here to show that the RNN in CTDFRNN form with SOM optimization can be configured to provide certain new possibilities. CTDFRNN clusters optimized with the SOM and offering diversity gains are found to be useful in low signal-to-noise ratio (SNR) environments. Further, the Modular Network SOM (MNSOM), which is considered to closely resemble biological computation with an inherent reinforced modular learning, is also proposed and formulated using CTDFRNN blocks for application in MIMO channel estimation. The results, when applied to a wireless VOIP platform, show improvement yet leave scope for further exploration in devising innovative methods of MIMO channel estimation. It is also found that the CTDFRNN-MNSOM emerges as a viable alternative to conventional stochastic estimation methods, with an average 60% saving in processing time.

4.2 Configuring the RNN for MIMO Channel Modeling

The constraints associated with suitably modified temporal-MLP structures, as explained in Chapter 3, necessitate exploring the RNN as an option for modeling time-varying MIMO channels. The RNN may be considered an MLP having one or more local or global feedback loops in a variety of forms [34]. Figure 4.1 shows an RNN block known as the Time Delay Fully Recurrent Neural Network (TDFRNN). The present value of the model input is denoted by s(n), and the corresponding value of the model output is denoted by y(n+1);

Figure 4.1: Time Delay Fully Recurrent Neural Network

that is, the output is ahead of the input by one time unit [34]. The dynamic behavior of the RNN model is described by

\[
y(n+1) = F\bigl(y(n), \ldots, y(n-p+1),\, s(n), \ldots, s(n-p+1)\bigr) \tag{4.1}
\]

where F(·) is a nonlinear function of its arguments. The use of the feedback loop in an RNN enhances its ability to learn over a given period of time by connecting contextual knowledge with inputs and responses, which are circulated among adaptively updated weighted links and stored between layers. This leads to an accumulation of knowledge which, due to the multiple delays, provides a set-up for learning, comparing and retaining only the relevant information. Moreover, a subtractive process rejects common data and holds back only relevant information; hence the correlation among samples circulating inside the RNN layers is very low.
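As a rough illustration of the input-output relation in Eq. (4.1), the sketch below runs a single-output time delay recurrent model over a short signal, feeding each prediction back as a delayed output. The tanh nonlinearity standing in for F(·), the random weights and the function name tdfrnn_step are illustrative assumptions, not part of the TDFRNN design described above.

```python
import numpy as np

# A rough sketch of Eq. (4.1): a single-output model whose next output
# depends on the p most recent outputs and the p most recent inputs.
# tanh of a weighted sum stands in for the nonlinear map F(.); the
# weights here are illustrative placeholders, not trained parameters.

def tdfrnn_step(y_hist, s_hist, w_y, w_s, bias=0.0):
    """y(n+1) = F(y(n), ..., y(n-p+1), s(n), ..., s(n-p+1))."""
    net = np.dot(w_y, y_hist) + np.dot(w_s, s_hist) + bias
    return np.tanh(net)

# Run the model over a short example signal, feeding each output back.
p = 3
rng = np.random.default_rng(0)
w_y, w_s = rng.normal(size=p), rng.normal(size=p)   # placeholder weights
s = rng.normal(size=20)                             # example input signal

y_hist, s_hist = np.zeros(p), np.zeros(p)
outputs = []
for n in range(len(s)):
    s_hist = np.concatenate(([s[n]], s_hist[:-1]))    # shift in the newest input
    y_next = tdfrnn_step(y_hist, s_hist, w_y, w_s)
    y_hist = np.concatenate(([y_next], y_hist[:-1]))  # one-step delayed feedback
    outputs.append(y_next)
```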

In the present work we have adopted the split-activation Complex TDFRNN (CTDFRNN) architecture with individual real and imaginary inputs fed to segregated blocks.

The conversion of the TDFRNN to the CTDFRNN, in the several versions proposed here, contributes towards better learning, faster convergence and lower error margins. The lowering of error values is explained in Section 4.3.1. The basic algorithms for RNN training are [34] Back Propagation Through Time (BPTT), Real Time Recurrent Learning (RTRL), Decoupled Extended Kalman Filtering (DEKF) [84] and Complex Valued RTRL (CVRTRL). All these algorithms are used in the learning phase of the TDFRNN blocks, and the one that gives the best training time performance is used with the CTDFRNN blocks that constitute the MIMO channel estimation system. The details of the RNN and related training

methods have been included in Section 2.3.2.
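A possible reading of the selection step above is sketched below: each candidate training routine is run on the same TDFRNN block, timed, and the fastest one is retained for the CTDFRNN blocks. The pick_fastest helper and the trainer callables are hypothetical placeholders; the actual BPTT, RTRL, DEKF and CVRTRL implementations are discussed in Section 2.3.2, not reproduced here.

```python
import time

# Hypothetical sketch: time each candidate training routine on the same
# TDFRNN block and keep the fastest one. The trainer callables are
# placeholders for the BPTT, RTRL, DEKF and CVRTRL routines.

def pick_fastest(trainers, block, data):
    timings = {}
    for name, train in trainers.items():
        start = time.perf_counter()
        train(block, data)                     # one full training session
        timings[name] = time.perf_counter() - start
    return min(timings, key=timings.get), timings

# Usage with stand-in trainers (real implementations would replace the lambdas).
dummy = {"BPTT": lambda b, d: None, "RTRL": lambda b, d: None}
best, times = pick_fastest(dummy, block=None, data=None)
```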

A TDFRNN block (Figure 4.1) has a few distinct layers: the external input-feedback layer, an output layer and a hidden layer, each having recurrent artificial neurons as processing elements. Taking such a network as reference, let $y_l(k)$ denote the complex-valued output of each neuron, $l = 1, \ldots, N$, at time index $k$, and let $s(k)$ be the $(1 \times p)$ external complex-valued input vector. The overall input to the network, $I(k)$, represents the concatenation of the vectors $y(k)$, $s(k)$ and the bias input $(1+j)$. The complex-valued weight matrix of the network is denoted by $W$, where for the $l$th neuron the weights form a $(p+G+1) \times 1$ dimensional weight vector $\mathbf{w}_l = [w_{l,1}, \ldots, w_{l,p+G+1}]^T$, with $G$ the number of feedback connections. The feedback connections represent the delayed output signals of the TDFRNN. The output of each neuron can be expressed as

\[
y_l(k) = \phi\bigl(\Psi_l(k)\bigr), \qquad l = 1, \ldots, N, \tag{4.2}
\]

where

\[
\Psi_l(k) = \sum_{j=1}^{p+N+1} w_{l,j}(k)\, I_j(k) \tag{4.3}
\]

is the net input to the lth node at time index k, and φ is a complex nonlinear activation function [85]. RNNs have the capacity to deal with time-varying inputs due to the presence of at least one feed-backward loop, despite being essentially feedforward structures [34]. The feedback, in singular or multiple delayed forms, from the output to the hidden layer and then to the input layer stores information for one time step at a time, which accumulates over the given time cycles. After the accumulation process in the hidden layer is over, the information propagates to the input layer such that at each step dynamic updating of the synaptic weights continues as per the norms of the adopted method of learning or training. This backward transmission of information enables the input, hidden and output layer neurons to learn over an extended period of time by obtaining records of prior activations or responses and sharing the updates with the synaptic links and the recurrent units constituting the processing system. Moreover, the RNN architecture, by using the state vector, relates the present and past contexts and the transitions between the layers to describe the future response of the system, thereby dynamically learning the temporal behaviour of the input samples [34].
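The minimal sketch below evaluates Eqs. (4.2) and (4.3) for all N neurons of a block at one time index, assuming that the feedback part of I(k) is the full previous output vector y(k) (so G = N) and that a fully complex tanh stands in for the activation φ of [85]; the dimensions and names are placeholders, not the trained system.

```python
import numpy as np

# Sketch of Eqs. (4.2)-(4.3) for one time index k, assuming the feedback
# part of I(k) is the full previous output vector (i.e. G = N) and a
# fully complex tanh stands in for the activation phi.

def tdfrnn_outputs(y_prev, s_k, W):
    """y_prev: (N,) previous complex outputs; s_k: (p,) complex inputs;
    W: (N, p+N+1) complex weight matrix, one row w_l per neuron."""
    I_k = np.concatenate((y_prev, s_k, [1 + 1j]))  # concatenated input with bias (1+j)
    psi = W @ I_k                                  # Eq. (4.3): net inputs Psi_l(k)
    return np.tanh(psi)                            # Eq. (4.2): y_l(k) = phi(Psi_l(k))

# Usage with placeholder dimensions and random complex weights.
N, p = 4, 3
rng = np.random.default_rng(1)
W = rng.normal(size=(N, p + N + 1)) + 1j * rng.normal(size=(N, p + N + 1))
y_prev = np.zeros(N, dtype=complex)
s_k = rng.normal(size=p) + 1j * rng.normal(size=p)
y_k = tdfrnn_outputs(y_prev, s_k, W)
```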

Signals encountered in wireless communication exhibit variations in terms of their complex-valued nature. The application of ANNs for channel estimation must therefore take into account how this aspect is handled. The learning and the activation functions of the ANNs must be able to deal with signals that have strong coupling between their magnitude and phase components. One way to ensure that ANNs learn the magnitude and phase components separately is to use split-complex activation functions. The

practice is to provide the real and imaginary components in split form to the RNN blocks and allow them to carry on the training as mutually exclusive events [107]. However, for cases where the in-phase and quadrature components are tightly coupled, the split-complex activation approach yields poor performance. Communication requirements necessitate that the signals be tightly coupled. Therefore, a channel estimation set-up with split RNNs trained to handle the real and imaginary components separately will demonstrate lower performance [107]. Two contrasting pictures hence emerge: on one hand, the real and imaginary components in split form aid the RNN learning, and on the other, such an RNN set-up cannot provide satisfactory performance for signals whose real and imaginary components are tightly coupled. This trade-off, between learning and activation functions of RNNs optimized for applications that require the real and imaginary components separately and the need for better performance with tightly coupled in-phase and quadrature components, has to be resolved. An attempt is made here to formulate a few RNN structures designed to reduce this trade-off. Split-complex activation is adopted to allow the RNNs to learn the real and imaginary parts separately, and the outputs of these exclusive blocks are combined to obtain a time average (Figure 4.2) over the inputs applied within a defined time window. By allowing the RNN blocks to train with the real and imaginary components as exclusive events, the error surfaces are aligned appropriately to form classification boundaries for segregating inputs from composite segments. Although this is at times undesirable, it reduces the computational complexity.
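A small sketch of the split-complex idea discussed above is given below, contrasting an activation that treats the real and imaginary parts as independent real channels with a fully complex one; tanh is used only as an example nonlinearity, not the specific activation of the proposed blocks.

```python
import numpy as np

# Split-complex activation: the real and imaginary parts pass through a
# real nonlinearity independently, so any coupling between the in-phase
# and quadrature components is ignored.

def split_complex_tanh(z):
    return np.tanh(z.real) + 1j * np.tanh(z.imag)

# A fully complex activation, in contrast, acts on the complex value as a whole.
def fully_complex_tanh(z):
    return np.tanh(z)

z = np.array([0.5 + 2.0j, -1.0 + 0.3j])
print(split_complex_tanh(z))   # I and Q handled separately
print(fully_complex_tanh(z))   # magnitude and phase handled jointly
```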

The RNN in this configuration can be termed the CTDFRNN, with separate sub-sections dealing with the in-phase and quadrature components. The outputs from the real and imaginary blocks are combined and optimized using time-averaging. The results obtained are better than those obtained with the temporal-MLP (Section 3.4). However, the results can be further improved if the time averaging is replaced by SOM based optimization. The exploration of the RNN in CTDFRNN form configured in a cluster provides a further opportunity to improve the results and offers certain diversity gains. The RNN in CTDFRNN form is also arranged to constitute an MNSOM, which provides another direction for improving the performance of the system.
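As a hedged sketch of the combining stage just described, the snippet below recombines the responses of hypothetical real-part and imaginary-part blocks into complex estimates and averages them over a fixed time window; the block responses are synthetic placeholders standing in for trained CTDFRNN outputs, and the function name is illustrative.

```python
import numpy as np

# Combining stage sketch: treat the real-part and imaginary-part blocks
# as separate estimators, recombine their responses into complex values,
# and average the estimates over a fixed time window.

def combine_and_average(real_block_out, imag_block_out, window):
    """real_block_out, imag_block_out: (T,) real responses; window: samples per average."""
    combined = real_block_out + 1j * imag_block_out     # recombine I and Q estimates
    T = (len(combined) // window) * window              # drop any incomplete tail
    frames = combined[:T].reshape(-1, window)
    return frames.mean(axis=1)                          # one estimate per time window

# Usage with synthetic block responses.
rng = np.random.default_rng(2)
estimates = combine_and_average(rng.normal(size=64), rng.normal(size=64), window=8)
```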