

3.3. Methods

SAT forms the basis of the two approaches for reducing GP code bloat proposed in the next subsections.

Given a target semantic vector s = (s_1, s_2, ..., s_n), SAT first generates a small random tree sTree with semantics q = (q_1, q_2, ..., q_n), and then determines the scalar θ that minimizes

    f(\theta) = \sum_{i=1}^{n} (\theta \cdot q_i - s_i)^2

with respect to θ. The quadratic function f(θ) achieves its minimal value at the vertex, with θ calculated as in Equation 3.1:

    \theta = \frac{\sum_{i=1}^{n} q_i s_i}{\sum_{i=1}^{n} q_i^2}    (3.1)

After finding θ, newTree = θ·sTree is grown, and this tree is called the approximate tree of the semantic vector s.

Figure 3.1: An example of Semantic Approximation

Figure 3.1 presents an example of growing a tree of the form newTree = θ·(1 + X) using SAT. In this figure, X = {0.1, 0.2, 0.3} is the set of fitness cases of the problem, s = (0.5, 0.6, 0.7) is the target semantics, and θ ≈ 0.5.
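As a quick check of Equation 3.1 on the data of Figure 3.1, the following Python sketch computes θ and the semantics of the resulting approximate tree; the function name fit_theta and the use of NumPy are illustrative choices, not part of the thesis implementation.

import numpy as np

def fit_theta(q, s):
    """Theta minimizing f(theta) = sum_i (theta*q_i - s_i)^2 (Equation 3.1)."""
    q, s = np.asarray(q, dtype=float), np.asarray(s, dtype=float)
    return float(np.dot(q, s) / np.dot(q, q))

# Data of Figure 3.1: sTree = 1 + X evaluated on the fitness cases X.
X = np.array([0.1, 0.2, 0.3])
q = 1.0 + X                          # semantics of sTree
s = np.array([0.5, 0.6, 0.7])        # target semantics
theta = fit_theta(q, s)              # ~0.502, i.e. theta ≈ 0.5
print(theta, theta * q)              # theta and the semantics of newTree = theta*sTree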

Compared to the previous techniques [79, 93], SAT has some advantages. First, SAT is easier to implement and faster to execute than the techniques in [79, 93]; consequently, GP systems based on SAT will run faster. Second, GP operators using SAT will create populations of higher diversity than operators that only search for a subtree in other individuals [79] or in a pre-defined library [93]. Thus, SAT-based operators may achieve better performance than SSC [79] and RDO [93]. Last but not least, the size of newTree can be constrained by limiting the size of sTree, and this property is used to design two approaches for reducing GP code bloat in the next subsections.

3.3.2. Subtree Approximation

Based on SAT, we propose two techniques for reducing code bloat in GP. The first technique is called Subtree Approximation (shortened as SA). At each generation, after applying the genetic operators to generate the next population, the k% largest individuals in the population are selected for pruning. Next, for each selected individual, a random subtree is chosen and replaced by an approximate tree of smaller size. Algorithm 6 presents this technique in detail.

Algorithm 6: Subtree Approximation
  Input: Population size: N, Number of pruning: k%.
  Output: a solution of the problem.
  i ← 0;
  P0 ← InitializePopulation();
  Estimate fitness of all individuals in P0;
  repeat
    i ← i + 1;
    Pi ← GenerateNextPop(Pi−1);
    pool ← get k% of the largest individuals of Pi;
    Pi ← Pi \ pool;
    foreach I ∈ pool do
      subTree ← RandomSubtree(I);
      S ← Semantics(subTree);
      newTree ← SemanticApproximation(S);
      I ← Substitute(I, subTree, newTree);
      Pi ← Pi ∪ {I};
    Estimate fitness of all individuals in Pi;
  until Termination condition met;
  return the best-so-far individual;

In Algorithm 6, function InitializePopulation() creates an initial population using the ramped half-and-half method. Function GenerateNextPop(Pi−1) generates the template population using the genetic operators (crossover and mutation). The next step selects the k% largest individuals of the template population (Pi) and stores them in a pool.

The subsequent loop simplifies the individuals in the pool.

For each individual I in the pool, a random subtree subTree in I is selected using function RandomSubtree(I), and a small tree sTree is randomly generated¹. The semantics of subTree is calculated and assigned to S (S becomes the target semantics) with function Semantics(subTree).

A newTree = θ·sTree is grown to approximate S by using the semantic approximation technique in function SemanticApproximation(S).


Figure 3.2: (a) the original tree with the selected subtree, (b) the small generated tree, and (c) the new tree obtained by substituting a branch of tree (a) with an approximate tree grown from the small tree (b).

¹To reduce bloat, the size of sTree needs to be smaller than the size of subTree minus 2. If this condition is not satisfied, the process of selecting subTree and generating sTree is repeated. If no such pair of subTree and sTree is found after 100 trials, individual I is added to the next generation.

Finally, individual I is simplified by replacing subTree with newTree, and the simplified individual I is added to the next generation, Pi. Figure 3.2 illustrates how an individual is pruned by using SAT. The population at generation i is then evaluated using the fitness function, and the whole process is repeated until the termination condition is met.
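The pruning loop of Algorithm 6 can be sketched in Python as follows. The nested-tuple tree encoding (operators + and * over a single variable x), the helper names (evaluate, subtrees, replace_at, prune_individual, prune_largest), and the depth of the random small tree are assumptions made only for illustration; the surrounding GP machinery (initialization, crossover, mutation, selection) is omitted.

import random
import numpy as np

# Trees are nested tuples: ('+', a, b), ('*', a, b), ('x',) or ('const', value).

def evaluate(tree, X):
    """Semantics of a tree: its vector of outputs on the fitness cases X."""
    op = tree[0]
    if op == 'x':
        return np.asarray(X, dtype=float)
    if op == 'const':
        return np.full(len(X), float(tree[1]))
    a, b = evaluate(tree[1], X), evaluate(tree[2], X)
    return a + b if op == '+' else a * b

def size(tree):
    return 1 if tree[0] in ('x', 'const') else 1 + size(tree[1]) + size(tree[2])

def subtrees(tree, path=()):
    """Yield (path, subtree) pairs; a path is a tuple of child positions."""
    yield path, tree
    if tree[0] in ('+', '*'):
        yield from subtrees(tree[1], path + (1,))
        yield from subtrees(tree[2], path + (2,))

def replace_at(tree, path, new_tree):
    if not path:
        return new_tree
    node = list(tree)
    node[path[0]] = replace_at(tree[path[0]], path[1:], new_tree)
    return tuple(node)

def random_small_tree(depth=2):
    """Grow a small random tree, used as sTree."""
    if depth == 0 or random.random() < 0.4:
        return ('x',) if random.random() < 0.5 else ('const', round(random.uniform(-1, 1), 2))
    return (random.choice(['+', '*']),
            random_small_tree(depth - 1), random_small_tree(depth - 1))

def semantic_approximation(target, X, s_tree):
    """Grow newTree = theta * sTree approximating the target semantics (Equation 3.1)."""
    q = evaluate(s_tree, X)
    denom = float(np.dot(q, q))
    theta = float(np.dot(q, target)) / denom if denom > 0 else 0.0
    return ('*', ('const', theta), s_tree)

def prune_individual(ind, X, trials=100):
    """Body of the foreach loop of Algorithm 6 applied to one individual I."""
    for _ in range(trials):
        path, sub = random.choice(list(subtrees(ind)))   # RandomSubtree(I)
        s_tree = random_small_tree()
        # Footnote 1: sTree must be smaller than size(subTree) - 2, so that
        # newTree = theta * sTree is smaller than the subtree it replaces.
        if size(s_tree) < size(sub) - 2:
            S = evaluate(sub, X)                          # Semantics(subTree)
            new_tree = semantic_approximation(S, X, s_tree)
            return replace_at(ind, path, new_tree)        # Substitute(I, subTree, newTree)
    return ind                                            # no valid pair found: keep I as is

def prune_largest(population, X, k=0.2):
    """Select the k% largest individuals of the population and prune each of them."""
    population = sorted(population, key=size)
    cut = int(round(len(population) * (1 - k)))
    kept, pool = population[:cut], population[cut:]
    return kept + [prune_individual(I, X) for I in pool]

For instance, prune_largest(pop, X, k=0.2) returns a population in which the 20% largest trees have been replaced by their pruned versions, ready for the fitness-evaluation step of Algorithm 6.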

3.3.3. Desired Approximation

The second technique attempts to achieve two objectives simultaneously: to lessen GP code bloat and to enhance the ability of GP to fit the training data. This technique is called Desired Approximation (shortened as DA).

Algorithm 7: Desired Approximation
  Input: Population size: N, Number of pruning: k%.
  Output: a solution of the problem.
  i ← 0;
  P0 ← InitializePopulation();
  Estimate fitness of all individuals in P0;
  repeat
    i ← i + 1;
    Pi ← GenerateNextPop(Pi−1);
    pool ← get k% of the largest individuals of Pi;
    Pi ← Pi \ pool;
    foreach I ∈ pool do
      subTree ← RandomSubtree(I);
      D ← DesiredSemantics(subTree);
      newTree ← SemanticApproximation(D);
      I ← Substitute(I, subTree, newTree);
      Pi ← Pi ∪ {I};
    Estimate fitness of all individuals in Pi;
  until Termination condition met;
  return the best-so-far individual;

DA is similar to RDO [93] in that it first calculates the desired semantics of a randomly selected subtree in an individual using the semantic backpropagation algorithm [53, 93]. However, instead of searching for a tree in a pre-defined library, DA uses SAT to grow a small tree that semantically approximates the desired semantics. Algorithm 7 describes DA in detail.

The structure of Algorithm 7 is very similar to that of SA (Algorithm 6). The main difference is in the inner loop. First, the desired semantics of subTree is calculated by using the semantic backpropagation algorithm instead of the semantics of subTree. Second, newTree is grown to approximate the desired semantics D of subTree instead of its semantics S. By replacing subTree with a newTree that approximates the desired semantics, the resulting individual is expected to be closer to the optimal solution.
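To make the difference from SA concrete, the short sketch below backpropagates the target output through a hypothetical individual I(x) = x * (0.3 + subTree) to obtain the desired semantics D of subTree, and then fits θ as in Equation 3.1. Only invertible + and * nodes with non-zero inputs are handled here; the full semantic backpropagation algorithm of [53, 93] also treats the non-invertible cases. All names and values are illustrative.

import numpy as np

# Hypothetical individual I(x) = x * (0.3 + subTree), evaluated on the fitness
# cases X, with target output y for the whole program.
X = np.array([0.1, 0.2, 0.3])
y = np.array([0.5, 0.6, 0.7])

# DesiredSemantics(subTree): invert the operators on the path from the root to
# the selected subtree (semantic backpropagation, invertible cases only).
# Root '*': x * branch = y               =>  desired branch semantics = y / x
branch_target = y / X
# Node '+': 0.3 + subTree = branch_target =>  desired D = branch_target - 0.3
D = branch_target - 0.3

# SemanticApproximation(D): fit theta for a small random sTree (here x + 1,
# as in Figure 3.1) so that newTree = theta * sTree approximates D.
q = X + 1.0                              # semantics of sTree
theta = float(np.dot(q, D) / np.dot(q, q))
print(D, theta, theta * q)               # D, fitted theta, semantics of newTree

Replacing subTree with the resulting newTree = θ·sTree moves the output of I toward the target y to the extent that θ·sTree fits D, which is exactly the intent of DA.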

