Chapter 6 Modeling of FSW Process

6.3 Back Propagation Neural Network

6.3.2 BPNN Model Development Steps

The development procedure of the BPNN model consists of the following steps:

STEP-1: Definition

The architecture of the network is defined in terms of the values of LI, LH and LO. Among the three, LI and LO are fixed for a particular problem, whereas LH is an integer value that can be varied. Then the values of the learning rate (η), the momentum coefficient (α) and an activation function (logistic sigmoid in the present work) are assigned. The learning rate and momentum coefficient are chosen in the range of 0.05 to 0.95.
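As an illustration, a minimal sketch of such a parameter definition is given below; the numerical values are assumptions chosen for illustration, not the settings used in the present work.

```python
# Hypothetical BPNN architecture and training parameters (STEP-1).
LI = 3        # number of input neurons, fixed by the problem
LO = 2        # number of output neurons, fixed by the problem
LH = 6        # number of hidden neurons, an integer to be varied
eta = 0.4     # learning rate, chosen from the range 0.05 - 0.95
alpha = 0.7   # momentum coefficient, chosen from the range 0.05 - 0.95
```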

STEP-2: Initialization

In this step the weights and biases are initialized. The connection weights Vij and Wjk are randomly initialized between -0.9 and +0.9, and the biases are set as bH = bO = 1.
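A minimal sketch of this initialization, assuming NumPy and the illustrative layer sizes from STEP-1, could look as follows.

```python
import numpy as np

# STEP-2: random initialization of weights in [-0.9, +0.9] and unit biases.
rng = np.random.default_rng(seed=42)   # seed is arbitrary, for reproducibility
LI, LH, LO = 3, 6, 2                   # illustrative layer sizes

V = rng.uniform(-0.9, 0.9, size=(LI, LH))   # input-to-hidden weights V_ij
W = rng.uniform(-0.9, 0.9, size=(LH, LO))   # hidden-to-output weights W_jk
b_H = 1.0                                   # hidden-layer bias, b_H = 1
b_O = 1.0                                   # output-layer bias, b_O = 1
```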

STEP-3: Presentation of the training patterns

The process of modifying the network by updating its weights, biases and other parameters, if any, is known as training. Training may be carried out in sequential mode or in batch mode. In sequential mode the network is modified using the training scenarios presented one after another in sequence. In batch mode, which is used for network training in the present work, the whole training dataset, consisting of a large number of scenarios, is passed through the network and an average error in predictions is determined. This error is then propagated back to update the weights and bias values of the network so that it predicts more accurately.

STEP-4: Normalization of the dataset

To train a neural network model, the raw training data sometimes needs to be scaled so that the range of the target values remains within the range of the activation function. In this case the logistic sigmoid function is used, which is given by,

S(y) = \frac{1}{1 + e^{-ay}}          (6.1)

where a is a constant and y is the input to the activation function S. For a BPNN model with the above-mentioned activation function, the target values must lie within, say, [0.1, 0.9] instead of the full range [0, 1], because the function reaches 0 and 1 only as y approaches −∞ and +∞. Restricting the targets prevents the algorithm from driving some of the connection weights towards infinity and thus slowing down the training process. The range of each input is also linearly scaled to the range of the activation function, which allows the connection weights to have similar magnitudes. So, the input and output parameters of all the datasets are normalized in the range of 0.1 to 0.9 as follows:

y_{norm} = 0.1 + 0.8 \, \frac{y - y_{min}}{y_{max} - y_{min}}          (6.2)

where ymax and ymin are the maximum and minimum values of the parameter in the dataset under consideration, y is the actual value of the parameter and ynorm is the corresponding normalized value.
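A short sketch of the sigmoid of Eq. (6.1) and the scaling of Eq. (6.2) is given below; the raw values shown are hypothetical, not data from the present work.

```python
import numpy as np

def sigmoid(y, a=1.0):
    # Eq. (6.1): logistic sigmoid with slope constant a
    return 1.0 / (1.0 + np.exp(-a * y))

def normalize(y, y_min, y_max):
    # Eq. (6.2): linear scaling of a parameter into the range [0.1, 0.9]
    return 0.1 + 0.8 * (y - y_min) / (y_max - y_min)

raw = np.array([900.0, 1100.0, 1400.0])        # hypothetical raw parameter values
scaled = normalize(raw, raw.min(), raw.max())  # -> [0.1, 0.42, 0.9]
```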

STEP-5: Forward pass

The output of a neuron is obtained by applying the activation function to its weighted input sum, and this output becomes the input for the next layer. The same process is carried out for each hidden node and then for each output node. The output of the jth hidden neuron and of the kth output neuron for the pth pattern is calculated using the following equations:

O_{Hjp}(n) = S\left( \sum_{i=1}^{L_I} V_{ij}(n) \, I_{ip} + b_H \right)          (6.3)

O_{Okp}(n) = S\left( \sum_{j=1}^{L_H} W_{jk}(n) \, O_{Hjp}(n) + b_O \right)          (6.4)

where O_{Hjp}(n) is the output of the jth hidden neuron for the pth pattern at the nth iteration, O_{Okp}(n) is the output of the kth output neuron for the pth pattern at the nth iteration, V_{ij}(n) is the connection weight between the ith input neuron and the jth hidden neuron at the nth iteration, W_{jk}(n) is the connection weight between the jth hidden neuron and the kth output neuron at the nth iteration, and I_{ip} is the value of the ith input neuron for the pth pattern.
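A sketch of this forward pass for a single pattern, under the same illustrative layer sizes as before, is shown below.

```python
import numpy as np

def sigmoid(y, a=1.0):
    return 1.0 / (1.0 + np.exp(-a * y))

# STEP-5: forward pass for one training pattern p (illustrative sizes and data).
rng = np.random.default_rng(0)
LI, LH, LO = 3, 6, 2
V = rng.uniform(-0.9, 0.9, (LI, LH))   # V_ij
W = rng.uniform(-0.9, 0.9, (LH, LO))   # W_jk
b_H, b_O = 1.0, 1.0

I_p = rng.uniform(0.1, 0.9, LI)        # normalized inputs of pattern p

O_H = sigmoid(I_p @ V + b_H)           # Eq. (6.3): hidden-layer outputs O_Hjp
O_O = sigmoid(O_H @ W + b_O)           # Eq. (6.4): output-layer outputs O_Okp
```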

STEP-6: Error calculation

By comparing the calculated output of the kth output neuron, O_{Okp}(n), with the target output T_{kp} for the pth pattern at the nth iteration, the error can be calculated by the following equation:

e_{kp}(n) = \left| O_{Okp}(n) - T_{kp} \right|          (6.5)

STEP-7: Setting the stopping criteria for training

The aim of training the BPNN model is to reach a point where the mean square error (MSE) over all the training patterns declines to a sufficiently small value. The MSE can be determined by,

MSE(n) = \frac{1}{P} \sum_{p=1}^{P} \sum_{k=1}^{L_O} \left[ O_{Okp}(n) - T_{kp} \right]^2          (6.6)

where MSE(n) is the MSE at the nth iteration and P is the total number of training patterns. In general the back propagation learning algorithm cannot be shown to converge under all conditions and there are no well-defined criteria for stopping the network training [Haykin, 2004]. The problem of over-fitting or over-learning of the network is often observed, which may be due to the presence of noise in the training dataset. To monitor the trend of over-fitting, the MSE for the testing patterns is also determined after each iteration of training. Training is then stopped when the MSE for the testing dataset keeps increasing even though it decreases for the training dataset.
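The batch MSE of Eq. (6.6) and one possible form of the early-stopping check described above are sketched below; the patience-based test is an assumption for illustration, not the exact criterion used in the present work.

```python
import numpy as np

def mse(outputs, targets):
    # Eq. (6.6): mean square error over P patterns; arrays of shape (P, LO)
    P = outputs.shape[0]
    return np.sum((targets - outputs) ** 2) / P

def should_stop(test_mse_history, patience=10):
    # Stop when the testing MSE has not improved for `patience` iterations,
    # a simple way to detect the over-fitting trend described above (assumed).
    if len(test_mse_history) <= patience:
        return False
    best_earlier = min(test_mse_history[:-patience])
    return min(test_mse_history[-patience:]) >= best_earlier
```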

STEP-8: Backward pass

All the connection weights are updated as per the gradient descent algorithm using the following equations:

\Delta V_{ij}(n) = \frac{\eta}{P} \sum_{p=1}^{P} \sum_{k=1}^{L_O} \left[ T_{kp} - O_{Okp}(n) \right] O_{Okp}(n) \left[ 1 - O_{Okp}(n) \right] W_{jk}(n) \, O_{Hjp}(n) \left[ 1 - O_{Hjp}(n) \right] I_{ip} + \alpha \, \Delta V_{ij}(n-1)          (6.7)

\Delta W_{jk}(n) = \frac{\eta}{P} \sum_{p=1}^{P} \left[ T_{kp} - O_{Okp}(n) \right] O_{Okp}(n) \left[ 1 - O_{Okp}(n) \right] O_{Hjp}(n) + \alpha \, \Delta W_{jk}(n-1)          (6.8)

where η and α are the learning rate and the momentum coefficient, respectively.
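A sketch of this batch update with momentum is given below; the function interface and the per-pattern averaging factor are assumptions made for illustration.

```python
import numpy as np

def backward_pass(I, O_H, O_O, T, V, W, dV_prev, dW_prev, eta=0.4, alpha=0.7):
    # STEP-8 (Eqs. 6.7 - 6.8): batch gradient-descent update with momentum.
    # I: (P, LI) inputs, O_H: (P, LH), O_O: (P, LO), T: (P, LO) targets.
    P = I.shape[0]
    delta_O = (T - O_O) * O_O * (1.0 - O_O)              # output-layer local gradient
    delta_H = (delta_O @ W.T) * O_H * (1.0 - O_H)        # hidden-layer local gradient
    dW = eta * (O_H.T @ delta_O) / P + alpha * dW_prev   # Eq. (6.8)
    dV = eta * (I.T @ delta_H) / P + alpha * dV_prev     # Eq. (6.7)
    return V + dV, W + dW, dV, dW                        # updated weights and increments
```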

STEP-9: Iteration

Steps 5 to 8 are repeated until the stopping criterion for learning is reached. The flowchart of the back propagation learning algorithm is shown below in Fig. 6.2.

Figure 6.2 Flowchart of BPNN training algorithm
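To complement the flowchart, a compact, self-contained sketch of the complete training loop (forward pass, error calculation, stopping check, backward pass) is given below; all data, sizes and hyperparameters are assumed for illustration only.

```python
import numpy as np

# Illustrative end-to-end BPNN training loop in batch mode.
rng = np.random.default_rng(42)
P, LI, LH, LO = 20, 3, 6, 2               # patterns and layer sizes (assumed)
eta, alpha = 0.4, 0.7                     # learning rate and momentum (assumed)

X = rng.uniform(0.1, 0.9, (P, LI))        # normalized training inputs (dummy data)
T = rng.uniform(0.1, 0.9, (P, LO))        # normalized training targets (dummy data)
V = rng.uniform(-0.9, 0.9, (LI, LH))
W = rng.uniform(-0.9, 0.9, (LH, LO))
dV, dW = np.zeros_like(V), np.zeros_like(W)
sig = lambda y: 1.0 / (1.0 + np.exp(-y))  # logistic sigmoid, Eq. (6.1) with a = 1

for n in range(2000):                     # iteration limit (assumed)
    O_H = sig(X @ V + 1.0)                # forward pass, Eq. (6.3), b_H = 1
    O_O = sig(O_H @ W + 1.0)              # forward pass, Eq. (6.4), b_O = 1
    err = T - O_O
    if np.sum(err ** 2) / P < 1e-4:       # stopping criterion: small MSE (assumed limit)
        break
    d_O = err * O_O * (1.0 - O_O)         # output-layer local gradient
    d_H = (d_O @ W.T) * O_H * (1.0 - O_H) # hidden-layer local gradient
    dW = eta * (O_H.T @ d_O) / P + alpha * dW   # Eq. (6.8)
    dV = eta * (X.T @ d_H) / P + alpha * dV     # Eq. (6.7)
    W, V = W + dW, V + dV
```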
