GP Evolved Intervals (GPEI)

From the document Data classification using genetic programming. (Pages 140-144)

6.2 Proposed Discretisation Methods for GP

6.2.2 GP Evolved Intervals (GPEI)

The GPEI approach allows the GP algorithm to randomly create intervals based on the arity of each node. The algorithm may create intervals of any size for a particular attribute, provided that they adhere to the following rules:

• The lower bound of the first interval has to be the same value as the minimum value for the attribute.

• The upper bound of the last interval must be equal to the maximum value of the attribute.

• The intervals should be disjoint from each other.

• There should be no gap between intervals, i.e. although the intervals are disjoint, the upper bound of each interval coincides with the lower bound of the next, so that no discontinuity exists between adjacent intervals.

Provided that the algorithm follows the above-mentioned rules, it is free to create any cut-off point for an interval. These cut-off points are selected randomly during the execution of the GP algorithm whenever a node is created, i.e. when a node is added to the decision tree. Nodes are created during initial population generation and when the mutation operator is applied. When a node is created, the intervals for that node are also created. Once a node and its intervals have been created, the interval values can be changed using the alter interval genetic operator (GO), which is described below.
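The four rules listed above can be checked mechanically. The following is a minimal sketch, not taken from the source: the name `is_valid_partition`, the representation of an interval as a `(lower, upper)` tuple, and the requirement that every interval has a strictly positive width are all assumptions made for illustration. Each interval is read as half-open, so a shared cut-off point belongs to only one of two adjacent intervals.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound), read as half-open


def is_valid_partition(intervals: List[Interval],
                       attr_min: float,
                       attr_max: float) -> bool:
    """Return True if the intervals satisfy the four GPEI rules."""
    if not intervals:
        return False
    # Rule 1: the first lower bound equals the attribute minimum.
    if intervals[0][0] != attr_min:
        return False
    # Rule 2: the last upper bound equals the attribute maximum.
    if intervals[-1][1] != attr_max:
        return False
    # Rules 3 and 4: adjacent intervals meet at exactly one cut-off point,
    # so they neither overlap nor leave a gap.
    for (_, upper), (next_lower, _) in zip(intervals, intervals[1:]):
        if upper != next_lower:
            return False
    # Assumption: an interval must have positive width.
    return all(lower < upper for lower, upper in intervals)
```

For an attribute ranging from 0 to 10, `[(0.0, 2.5), (2.5, 7.0), (7.0, 10.0)]` passes the check, whereas `[(0.0, 2.5), (3.0, 10.0)]` fails because of the gap between 2.5 and 3.0.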

An example of an interval created using GPEI is shown in figure 6.3.

Figure 6.3: Intervals created using GPEI.

The pseudocode for the GPEI approach is presented in algorithm 6.1. Two general observations about this approach are made below:

• When the GP algorithm is executed several times, the intervals for nodes representing the same attribute may be different.

• If a decision tree contains several nodes for a particular attribute, the intervals may be different for each of them.

In step 5 of the GPEI pseudocode, arity values are restricted to between 2 and 4 because a value greater than 4 would enlarge the GP program search space and impair the algorithm’s ability to find a solution. The remainder of this section presents a new GP operator called alter interval. This GO was created to allow the GP algorithm to randomly select a node within a tree and to alter its interval in an attempt to improve the accuracy of that tree.

Since the method for creating the GPEI is random in nature, it cannot guarantee that a new random interval will improve the accuracy. For this reason, hill climbing was incorporated in an attempt to improve the accuracy of the tree over ten attempts. At each attempt, a random interval is created, and if the accuracy of the tree is improved, the hill climbing halts and the tree is returned by the GO.

If the accuracy is not improved, then another new random interval is created until all ten attempts are exhausted. Preliminary trial runs were performed and revealed that hill climbing improved the performance of the algorithm. The pseudocode for the alter interval GO is presented in algorithm 6.2. The hill climbing mechanism is executed in steps 9 to 12 in algorithm 6.2.

The alter interval GO only selects a single node and alters that node’s interval.

The alter interval GO is a local search operator since the structure of the tree is not affected, and since the search is focused on a single node.

CHAPTER 6. ADAPTIVE DISCRETISATION FOR GP 121

Algorithm 6.1: Pseudocode for creating an attribute node using GPEI.

input: input_tree

output: input_tree containing a new attribute node

1 begin

2 Randomly select an attribute within the function set. Create the attribute random_attr.

3 Allocate an arity value, arity, for random_attr.

4 If the fixed arity method is used, then set the arity based on the user parameter.

5 If the varying arity method is used, then randomly select an arity between 2 and 4 inclusive.

6 Initialise empty intervals based on the arity value determined in steps 3 to 5.

7 Set the leftmost interval to have a lower bound value equal to the minimum value for random_attr.

8 Set current to the value obtained in step 7.

9 random = random real number between current and the maximum value for random_attr. random cannot be equal to the maximum value for random_attr.

10 Create an interval between current and random.

11 Set current to random.

12 Steps 9 to 11 created the first interval. Repeat these steps to create the remaining intervals.

13 Set the rightmost interval to have an upper bound value equal to the maximum value for random_attr.

14 end
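The interval-creation steps of algorithm 6.1 (steps 3 to 13) can be sketched in Python as follows. This is an illustrative reading rather than the author's implementation: the function name `create_gpei_intervals`, its parameters, and the tuple representation of an interval are assumptions, and the selection of the attribute itself (step 2) is left to the caller.

```python
import random
from typing import List, Optional, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def create_gpei_intervals(attr_min: float,
                          attr_max: float,
                          arity: Optional[int] = None,
                          rng: Optional[random.Random] = None) -> List[Interval]:
    """Split [attr_min, attr_max] into `arity` contiguous random intervals."""
    if attr_min >= attr_max:
        raise ValueError("attr_min must be smaller than attr_max")
    rng = rng or random.Random()
    if arity is None:
        # Varying arity method: pick 2, 3 or 4 (step 5).
        arity = rng.randint(2, 4)
    intervals: List[Interval] = []
    current = attr_min  # steps 7 and 8
    for _ in range(arity - 1):
        # Step 9: draw a cut-off point between current and the maximum.
        cut = rng.uniform(current, attr_max)
        # uniform() may return its upper end point; redraw in that case,
        # because the cut-off point may not equal the maximum.
        while cut >= attr_max:
            cut = rng.uniform(current, attr_max)
        intervals.append((current, cut))  # step 10
        current = cut                     # step 11
    # Step 13: the rightmost interval ends at the attribute maximum.
    intervals.append((current, attr_max))
    return intervals
```

Calling `create_gpei_intervals(0.0, 10.0, arity=3)` returns three intervals whose bounds chain together from 0.0 to 10.0. Because each cut-off point is drawn from the range that remains, later intervals tend to be narrower than earlier ones, which is a property of the pseudocode as written and not something added by the sketch.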

Algorithm 6.2: Alter interval genetic operator.

input: input_tree

output: A tree with a new random interval for an attribute node.

1 begin

2   parent ← TournamentSelection();

3   parent_copy ← CreateCopy(parent);

4   random_node ← RandomNodeFromTree(parent_copy);

5   original_interval ← CopyInterval(random_node);

6   if random_node ≠ terminal node then

7     busy ← 0;

8     initial_accuracy ← ComputeAccuracy(parent_copy);

9     while new_accuracy ≤ initial_accuracy AND busy ≠ 10 do

10      random_node ← CreateRandomInterval(random_node);

11      new_accuracy ← ComputeAccuracy(parent_copy);

12      busy ← busy + 1;

13    end

14  end

15  if new_accuracy < initial_accuracy then

16    random_node ← SetInterval(original_interval);

17  end

18 end
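A hedged Python sketch of algorithm 6.2 follows. The `Node` class, the helper `all_nodes`, and the callables `accuracy` and `new_intervals` are hypothetical stand-ins for the thesis's own tree representation, fitness evaluation and interval generator. Tournament selection (step 2) is assumed to have happened already, so the parent is passed in. The pseudocode does not say what `new_accuracy` holds before the first loop test; the sketch assumes it starts equal to `initial_accuracy`.

```python
import copy
import random
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Interval = Tuple[float, float]


@dataclass
class Node:
    """Hypothetical tree node; a node without children is a terminal."""
    intervals: List[Interval] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)

    @property
    def is_terminal(self) -> bool:
        return not self.children


def all_nodes(root: Node) -> List[Node]:
    """Collect every node of the tree in pre-order."""
    nodes = [root]
    for child in root.children:
        nodes.extend(all_nodes(child))
    return nodes


def alter_interval(parent: Node,
                   accuracy: Callable[[Node], float],
                   new_intervals: Callable[[Node], List[Interval]],
                   rng: Optional[random.Random] = None,
                   max_attempts: int = 10) -> Node:
    """Return a copy of `parent` with one node's intervals possibly altered."""
    rng = rng or random.Random()
    parent_copy = copy.deepcopy(parent)            # step 3
    node = rng.choice(all_nodes(parent_copy))      # step 4
    if node.is_terminal:                           # step 6: nothing to alter
        return parent_copy
    original = list(node.intervals)                # step 5
    initial_accuracy = accuracy(parent_copy)       # step 8
    new_accuracy = initial_accuracy                # assumption, see lead-in
    attempts = 0                                   # step 7 ("busy")
    # Steps 9 to 12: hill climbing, stopping at the first improvement.
    while new_accuracy <= initial_accuracy and attempts != max_attempts:
        node.intervals = new_intervals(node)
        new_accuracy = accuracy(parent_copy)
        attempts += 1
    if new_accuracy < initial_accuracy:            # steps 15 and 16
        node.intervals = original
    return parent_copy
```

As in the pseudocode, the original interval is restored only when the final attempt is strictly worse, so an interval that leaves the accuracy unchanged after ten attempts is kept. The returned tree is therefore never less accurate than the parent, and the parent itself is left untouched because the operator works on a copy.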

Figures 6.4 and 6.5 illustrate a node which has been modified by the alter interval GO. In the figures, attribute 3 was selected, and it has an arity of 3. Neither the lower bound of the first interval nor the upper bound of the last interval is changed. Typically, when a researcher uses GP to solve a problem, the crossover and mutation operators are used. Since the crossover operator is a local search operator, a small percentage of its application rate can be allocated to the alter interval operator, which is also a local search operator, allowing the GP algorithm to make use of all three operators.
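One way to picture the reallocation of the crossover rate is a weighted choice among the three operators. The sketch below is purely illustrative: the source gives no concrete rates, so the values in `OPERATOR_RATES` and the function name `choose_operator` are hypothetical.

```python
import random

# Hypothetical application rates; the source states no concrete values.
# Part of what would otherwise be the crossover rate goes to alter_interval.
OPERATOR_RATES = {
    "crossover": 0.70,
    "alter_interval": 0.10,
    "mutation": 0.20,
}


def choose_operator(rng: random.Random) -> str:
    """Pick the operator to apply next, in proportion to its rate."""
    names = list(OPERATOR_RATES)
    weights = list(OPERATOR_RATES.values())
    return rng.choices(names, weights=weights, k=1)[0]
```

With these assumed rates, roughly one in ten offspring would be produced by the alter interval operator, while the mutation rate is left as it was.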


Figure 6.4: Illustrating the alter interval GO. The algorithm selected attribute 3 (highlighted in grey) for modification.

Figure 6.5: Illustrating the alter interval GO. The intervals for attribute 3 were altered which resulted in three new intervals.
