Chapter 2: Literature overview
2.3 Neural networks
2.3.4 Training neural networks
Training a neural network is the process in which the network's weights are adjusted using a chosen training algorithm. The algorithm makes use of certain learning rules to adjust the weights according to the training data provided. The training data usually consists of example patterns representing the specific task the network is being trained to perform. The weights can be updated in an iterative manner to improve the performance of the neural network, i.e. the network can be trained repeatedly. The ability to learn without a specified set of rules makes neural networks superior to conventional expert systems [10].
Training a neural network successfully involves understanding the environment in which it is to operate. A model of this environment can then be used to determine what information is available to the neural network; such a model is called a learning paradigm. The different learning paradigms are as follows [7], [10]:
Supervised learning – the network is supplied with example patterns of correct behaviour in the form of input-output pairs and learns to reproduce the relationship present in the data. An explicit teacher is present.
Unsupervised learning – the desired outputs are not available during training and the network learns through self-organizing behaviour using competitive learning rules or a pre-defined cost function. The network learns to extract features or patterns from input data with no feedback from the environment about desired outputs.
Reinforcement learning – a computational agent interacts with its environment and learns correct actions from a reinforcement signal grading its performance. The aim is to discover a policy for selecting actions that minimize an estimated cost function.
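The supervised paradigm can be illustrated with the simplest case in Table 2-2: a single perceptron trained with the perceptron rule on input-output pairs supplied by a teacher. The following is a minimal sketch, not taken from the cited sources; the learning-rate value, epoch count and the choice of the logical AND function as the target are illustrative assumptions.

```python
# Minimal sketch of supervised learning: a single perceptron trained with
# the perceptron rule on teacher-supplied input-output pairs (logical AND).
# Learning rate, epoch count and target task are illustrative assumptions.

def step(x):
    """Hard-limit activation: 1 if x >= 0, else 0."""
    return 1 if x >= 0 else 0

def train_perceptron(pairs, epochs=20, lr=1.0):
    w = [0.0, 0.0]   # weights
    b = 0.0          # bias
    for _ in range(epochs):
        for inputs, target in pairs:          # the "teacher" supplies targets
            error = target - step(w[0]*inputs[0] + w[1]*inputs[1] + b)
            w[0] += lr * error * inputs[0]    # perceptron rule: w <- w + e*p
            w[1] += lr * error * inputs[1]
            b += lr * error
    return w, b

and_pairs = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_pairs)
print([step(w[0]*p[0] + w[1]*p[1] + b) for p, _ in and_pairs])  # [0, 0, 0, 1]
```

Because AND is linearly separable, the perceptron convergence theorem guarantees that this rule finds a correct weight set in a finite number of updates; unsupervised and reinforcement learning differ only in where the error signal comes from, not in this iterative weight-adjustment structure.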
There are three major practical issues when learning from example patterns: capacity, sample complexity and computational complexity. First of all, the issue of capacity must be addressed. It concerns determining how many examples can be stored and what exactly a specific network is capable of – what functions it can approximate or decision boundaries it can form [10].
The second issue is sample complexity and considers the number of examples necessary to obtain a network which generalizes well. Computational complexity is the third issue and refers to the time taken to properly train the network [10]. Addressing these issues can be traced back directly to the choices in the neural network design process discussed earlier.
2.3.4.1 Training algorithms
A training algorithm or learning rule is a procedure used to adjust the weights of a network i.e. train the network. The purpose of training is to improve the performance of the network as measured by some performance or error function. This procedure can be applied to one pattern at a time or a whole batch of patterns at once. These two training styles are called incremental and batch training respectively [9].
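The distinction between the two training styles can be sketched with the Widrow-Hoff (LMS) rule on a single linear neuron: incremental training updates the weight after every pattern, while batch training accumulates the error over all patterns before making one update. The target relationship (y = 3x), learning rate and epoch count below are illustrative assumptions.

```python
# Incremental vs. batch training with the Widrow-Hoff (LMS) rule on a
# single linear neuron y = w*x. The target relationship y = 3x, the
# learning rate and the epoch count are illustrative assumptions.

patterns = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # (input, target) pairs
lr = 0.02

# Incremental training: the weight is adjusted after every single pattern.
w_inc = 0.0
for _ in range(200):
    for x, t in patterns:
        error = t - w_inc * x
        w_inc += lr * error * x

# Batch training: errors are accumulated over the whole batch of patterns,
# then one update is applied per pass.
w_bat = 0.0
for _ in range(200):
    grad = sum((t - w_bat * x) * x for x, t in patterns)
    w_bat += lr * grad

print(round(w_inc, 3), round(w_bat, 3))  # both approach the true value 3.0
```

Both styles converge to the same weight here; in general they trade off differently between update noise and computational cost per pass.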
A multitude of training algorithms exists, and an obvious problem is choosing the appropriate one for the task at hand. This variety arose because the original algorithms have been modified, extended or combined into different forms to suit the tasks they were intended for. Certain algorithms are appropriate only for certain network types, so the choice of algorithm becomes simpler once a network architecture has been chosen [9].
Table 2-2 provides a summary of network architectures and which training algorithms or learning rules are commonly used to train each of them. The summary also shows which purpose each network was initially intended for. As with the multitude of algorithms, network architectures have also been modified over time and resulted in various different forms. For this reason, the following summary has been restricted to the original algorithms and architectures.
Note that BP in the summary indicates the backpropagation algorithm – an extremely popular training algorithm. The second stage of the BP algorithm in which the weights are adjusted can be accomplished using a variety of different optimization schemes including conjugate gradient methods as well as evolutionary programming. BP in the summary refers to all of these variations as well as standard gradient descent. Hagan, Demuth and Beale [9] provided the information for this summary.
Table 2-2: Summary of neural networks
| Category | Name | Algorithm | Purpose | Behaviour | Paradigm |
|---|---|---|---|---|---|
| Feed-forward neural networks | Perceptron | Perceptron rule | Pattern recognition | Static | Supervised |
| | Linear layer | Widrow-Hoff rule | Pattern recognition and function approximation | Static | Supervised |
| | Adaline | Widrow-Hoff rule | Pattern recognition and function approximation | Static | Supervised |
| | Multi-layer perceptron | BP | Pattern recognition and function approximation | Static | Supervised |
| | Wavelet | BP | Function approximation | Static | Supervised |
| Time-delay neural networks | Linear with tapped delay | Widrow-Hoff rule | Digital signal processing | Dynamic | Supervised |
| | Focused time-delay | BP | Time-series prediction | Dynamic | Supervised |
| | Distributed time-delay | BP | Pattern recognition | Dynamic | Supervised |
| Radial basis function neural networks | Generalized regression | Orthogonal least squares / kernel methods | Function approximation | Static | Supervised |
| | Probabilistic | Orthogonal least squares / kernel methods | Pattern recognition | Static | Supervised |
| | Simple radial basis | Orthogonal least squares / kernel methods | Pattern recognition and function approximation | Static | Supervised |
| Associative neural networks | Hopfield | Hebb rule | Pattern recognition and optimization | Dynamic | Unsupervised |
| | Simple recall | Outstar learning | Pattern recall | Static | Unsupervised |
| | Simple associative | Hebbian learning | Pattern association | Static | Unsupervised |
| Recurrent neural networks | Layered-recurrent | BP | Filtering and modelling applications | Dynamic | Supervised |
| | Nonlinear autoregressive with exogenous input | BP | Modelling of nonlinear dynamic systems | Dynamic | Supervised |
| | Real-time recurrent | BP | Real-time applications | Dynamic | Supervised |
| | Elman | BP | Recognition and generation of spatial and temporal patterns | Dynamic | Supervised |
| Competitive neural networks | Kohonen self-organizing map | Kohonen (instar) learning | Pattern recognition | Static | Unsupervised |
| | Competitive layer | Kohonen learning | Pattern recognition | Dynamic | Unsupervised |
| | Hamming | Kohonen learning | Pattern recognition | Dynamic | Unsupervised |
| | Grossberg | Continuous-time instar learning | Clustering | Dynamic | Unsupervised |
| | Adaptive resonance theory | Instar / outstar learning | Pattern recognition | Dynamic | Unsupervised |
| | Learning vector quantization | Kohonen (instar) learning | Pattern recognition | Static | Hybrid |