


Numerical Data Mining Models and Financial Applications


Most of the other features of IBL are average in comparison with alternative methods (see Table 1.7). One feature of the k-nearest neighbors method, however, makes it unique: an expert can study the behavior of a real case with similar attributes and bring his/her intuition and informal knowledge to bear on the final forecast. Some features of IBL, based on [Dhar, Stein, 1997], are presented below:

Explainability                      Medium
Response speed                      Medium-High
Tolerance for noise in data         Medium
Tolerance for sparse data           Medium
Ease of use of logical relations    Low
Ease of use of numerical data       High

These features are needed to compare IBL with probabilistic ILP. Except for the feature "ease of use of numerical data", the probabilistic ILP methods have advantages over IBL. To use numerical data in a probabilistic ILP, the data should be transformed into relational form (see Chapter 4).

2.4. Neural Networks

A neural network consists of a set of units (nodes), U, and a set of weighted links, C, connecting them. The set of units is composed of three subsets,

U = I ∪ H ∪ O,

where I is a set of input units called the input layer, H is a set of internal (hidden) units called the hidden layers, and O is a set of output units called the output layer.

The set of weighted links, C, consists of two components:

C = ⟨L, W⟩,

where L is a set of links between nodes and W is a set of numeric weights attached to the links. Each link is given as an ordered pair ⟨j, i⟩, where j and i are indexes of nodes u_j and u_i. This pair indicates a directed link from u_j to u_i. Similarly, w_j,i is a number (weight) attached to the link from u_j to u_i.

The set of linear input functions, In, consists of a function in_i for each unit i:

in_i = Σ_j w_j,i a_j,

where a_j is the output of node j, which serves as input for node i, and {a_j} is the set of all such inputs to node i. Each output a_j is called an activation level. This output from node j is weighted by the weight w_j,i. The functions in_i are interpreted as the total weighted input for the unit i.

An activation level for input nodes i ∈ I is taken from external (environment) nodes, which do not belong to the neural network. Usually, a set of these input values is called an example (instance), e. Similarly, the output nodes i ∈ O deliver values into some external nodes, which also do not belong to the neural network.

The non-linear activation function, g, converts the weighted input in_i into the final value that serves as the unit's activation value, a_i = g(in_i). The step and sigmoid functions are typical activation functions:

step_t(x) = 1 if x ≥ t, and 0 otherwise,
sigmoid(x) = 1 / (1 + e^(−x)).

The parameter t in the step function, step_t, is a biologically motivated threshold. If a node is interpreted as a neuron, t represents the minimum total weighted input needed to fire the neuron.
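For concreteness, these two activation functions can be written in Python as follows (a minimal sketch; the names step and sigmoid and the default threshold are our choices):

```python
import math

def step(x, t=0.0):
    # Fires (returns 1) only when the total weighted input reaches
    # the threshold t; otherwise the unit stays inactive.
    return 1.0 if x >= t else 0.0

def sigmoid(x):
    # Smoothly squashes the weighted input into (0, 1); its
    # differentiability is what backpropagation (Section 2.4.2) relies on.
    return 1.0 / (1.0 + math.exp(-x))
```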

In these terms, a neural network can be viewed as a set of complete processing nodes, where each node consists of three components: the input function in_i, the activation function g, and the resulting activation level (output) a_i.

In this way, information processing is encapsulated in the nodes. Each node uses only local inputs from its neighbors, and therefore each node is independent of the rest of the network. This processing node can compute and recompute an activation level many times.
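Putting the pieces together, here is a minimal sketch of such a processing node; the Node class and its interface are our illustration, not the book's:

```python
import math

class Node:
    """One processing unit: incoming weights w_j,i, the input function
    in_i = sum_j w_j,i * a_j, and the activation function g."""

    def __init__(self, weights, g):
        self.weights = weights   # w_j,i for each incoming link j -> i
        self.g = g               # non-linear activation function
        self.activation = 0.0    # a_i, the unit's current output

    def compute(self, inputs):
        # inputs: activation levels a_j of the neighboring nodes.
        # Only local information is used, so the node can recompute
        # its activation level independently, as many times as needed.
        total = sum(w * a for w, a in zip(self.weights, inputs))
        self.activation = self.g(total)
        return self.activation

# Example: one node with sigmoid activation and three input links.
node = Node([0.5, -0.3, 0.8], lambda x: 1.0 / (1.0 + math.exp(-x)))
print(node.compute([1.0, 0.0, 1.0]))  # a_i for one input example
```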

Figure 2.1 presents an example of a neural network used for forecasting financial time series. This example is considered in detail in Section 2.6.

Figure 2.1. Recurrent Neural Network for financial time series forecasting

2.4.2. Steps

Each processing unit in a neural network performs a simple computation: based on the input signals from its input links, the unit computes a new activation level for its output link. The major steps of the generic neural network learning method are presented in Table 2.2.
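Table 2.2 itself is not reproduced in this excerpt; as a hedged sketch, the generic loop it refers to might look like the following, where the LinearUnit class, the learning rate, and the stopping rule are our illustrative choices:

```python
class LinearUnit:
    """A minimal one-unit 'network' so that the loop below is runnable."""
    def __init__(self, n):
        self.w = [0.0] * n
    def predict(self, inputs):
        return sum(w * x for w, x in zip(self.w, inputs))
    def adjust_weights(self, inputs, error, rate=0.1):
        # Delta rule: move each weight in proportion to its input
        # and to the output error.
        self.w = [w + rate * error * x for w, x in zip(self.w, inputs)]

def train(network, examples, epochs=500):
    # Generic learning loop (cf. Table 2.2): present each example,
    # compare the output with the target, adjust the weights (step #6).
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - network.predict(inputs)
            network.adjust_weights(inputs, error)
    return network

# Learn y = 2*x1 + 1*x2 from three examples.
net = train(LinearUnit(2), [([1, 0], 2.0), ([0, 1], 1.0), ([1, 1], 3.0)])
print(net.w)  # approaches [2.0, 1.0]
```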

Table 2.3 shows the mechanism for implementing the most important step (#6) of adjusting weights [Russell, Norvig, 1995]. This mechanism of updating a multilayer neural network is called the backpropagation method. The method has a tuning parameter called the learning rate.
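Table 2.3 is likewise not reproduced here, but the core backpropagation update for the links feeding a single sigmoid output unit can be sketched as follows (function and variable names are ours):

```python
def adjust_output_weights(w, a_hidden, target, a_out, rate=0.1):
    """One application of the weight-update rule (step #6):
        w_j,i <- w_j,i + rate * a_j * delta_i,
    where delta_i = (target - a_out) * g'(in_i) and, for the sigmoid,
    g'(in_i) = a_out * (1 - a_out). The learning rate `rate` is the
    method's tuning parameter."""
    delta = (target - a_out) * a_out * (1.0 - a_out)
    return [w_ji + rate * a_j * delta for w_ji, a_j in zip(w, a_hidden)]

# One update: hidden activations [0.2, 0.9], output 0.7, target 1.0.
print(adjust_output_weights([0.1, -0.4], [0.2, 0.9], 1.0, 0.7))
```

Hidden-layer weights receive an analogous update in which each hidden unit's delta is its share of the output deltas, propagated back through the link weights.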

2.4.3. Recurrent networks

A neural network can be designed with or without loops. Neural networks without loops are called feedforward networks. Recurrent neural networks (RNN) [Elman, 1991] are artificial neural networks with loops.

They use outputs of network units at time t as the input to other units at time t + 1. Specifically, a recurrent network can be constructed from a feedforward network by adding:

– a new unit b to the hidden layer, and
– a new input unit c(t).

The value of c(t) is defined as the value of unit b at time t − 1, i.e., c(t) = b(t − 1).

In this structure, b depends on both the original input x(t) and on the added input c(t). Therefore, it is possible for b to summarize information from earlier values of x that are arbitrarily distant in time [Mitchell, 1997].
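A minimal sketch of this construction, assuming an Elman-style recurrence in which c(t) is simply a copy of b(t − 1); all weights are illustrative constants:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def run_recurrent(xs, w_x=1.0, w_c=0.5, w_out=2.0):
    """The new hidden unit b sees the current input x(t) and the
    context unit c(t), where c(t) = b(t - 1)."""
    c = 0.0                              # no history before the first step
    outputs = []
    for x in xs:
        b = sigmoid(w_x * x + w_c * c)   # b(t) depends on x(t) and c(t)
        outputs.append(w_out * b)        # network output at time t
        c = b                            # c(t + 1) = b(t): memory of the past
    return outputs

# Because b feeds back into itself through c, its value at time t can
# reflect inputs x(1), ..., x(t) arbitrarily distant in the past.
print(run_recurrent([0.2, 0.4, 0.1, 0.8]))
```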

Mitchell also pointed out that these recurrent networks have an important interpretation in financial time series forecasting. Let y(t + 1) be the stock price on date t + 1. This stock price should be predicted using some economic indicator x(t) on the date t and values of x on previous days arbitrarily distant in time. The RNN can be used for developing such a forecasting model.

Alternative recurrent network structures can be designed to express more complex recurrent relations, e.g., adding new hidden layers between the input and unit b. Methods of training recurrent networks are described in [Mozer, 1995]. Despite the relative complexity of recurrent networks, they have unique advantages in representing distant relations in time series. In Section 2.6, we review some financial applications of recurrent neural networks.

2.4.4. Dynamically modifying network structure

In Sections 2.4.1–2.4.3, we assumed a fixed, static network structure, i.e., fixed numbers and types of network units and interconnections. Selecting the structure of a neural network is a most informal, expert-dependent task, yet a neural network's performance, generality, accuracy, and training efficiency depend heavily on this structure.

The dynamically modifying network approach tries to tune a network structure in two opposite ways [Mitchell, 1997] (a toy sketch of the first strategy follows the list):

1. begin with a network containing no hidden units (a perceptron), then grow the network by adding hidden units until the training error is reduced to some acceptable level, or

2. begin with a complex network and prune it as certain connections are found to be nonessential.
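As a hedged illustration of the first (growing) strategy, the sketch below trains one-hidden-layer networks of increasing size on a toy XOR task until the training error is acceptable; the training routine, the data, and all thresholds are our own choices, not the book's method:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)  # toy task: XOR
y = np.array([0.0, 1.0, 1.0, 0.0])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_error(n_hidden, epochs=5000, rate=0.5):
    """Train a network with n_hidden sigmoid hidden units by plain
    backpropagation; return the final mean squared training error.
    n_hidden == 0 degenerates to a single-layer perceptron."""
    if n_hidden == 0:
        w, b = rng.normal(size=2), 0.0
        for _ in range(epochs):
            out = sigmoid(X @ w + b)
            d = (y - out) * out * (1 - out)
            w += rate * X.T @ d
            b += rate * d.sum()
        return np.mean((y - sigmoid(X @ w + b)) ** 2)
    W1, b1 = rng.normal(size=(2, n_hidden)), np.zeros(n_hidden)
    w2, b2 = rng.normal(size=n_hidden), 0.0
    for _ in range(epochs):
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ w2 + b2)
        d_out = (y - out) * out * (1 - out)
        d_h = np.outer(d_out, w2) * h * (1 - h)
        w2 += rate * h.T @ d_out
        b2 += rate * d_out.sum()
        W1 += rate * X.T @ d_h
        b1 += rate * d_h.sum(axis=0)
    h = sigmoid(X @ W1 + b1)
    return np.mean((y - sigmoid(h @ w2 + b2)) ** 2)

# Growing strategy: start with no hidden units and add units one at a
# time until the training error drops below an acceptable level.
for n_hidden in range(5):
    err = train_error(n_hidden)
    print(f"{n_hidden} hidden units: error {err:.4f}")
    if err < 0.01:
        break
```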

According to Mitchell [1997], techniques for dynamically modifying network structure have, in general, met with mixed success. From our viewpoint, this mixed success can be "credited" partially to the search methods used in the space of "black box" neural networks. In this space, the majority of searched network structures can be irrelevant, and only after a network is found can it be transformed into a meaningful "IF-THEN" rule form. We consider this matter in later chapters.
