Initial Population Generation

Once a suitable representation has been decided upon, an initial population can be created. The initial population is referred to as generation zero. Several methods exist for creating the initial population; three common ones are the full method, the grow method, and the ramped half and half method. Additional initial population generation methods for GP trees were examined by Luke and Panait [12].

In order to maintain genetic diversity, no duplicates should be created when initialising the population; this is done in order to represent as much of the program space as possible. Koza describes duplicate individuals in generation zero as “unproductive deadwood” [3].
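The duplicate check can be sketched as follows. This is a minimal illustration, not the method used in the cited works: the helper names `unique_population` and `random_tiny_tree`, the nested-tuple tree encoding, and the `max_attempts` cap are all assumptions introduced here. It only detects syntactic duplicates, that is, trees with identical structure and labels.

```python
import random

def unique_population(generate, size, max_attempts=10_000):
    """Collect `size` distinct individuals produced by the callable `generate`."""
    seen = set()
    population = []
    attempts = 0
    while len(population) < size:
        if attempts >= max_attempts:
            raise RuntimeError("could not generate enough distinct individuals")
        attempts += 1
        individual = generate()
        if individual not in seen:  # relies on individuals being hashable
            seen.add(individual)
            population.append(individual)
    return population

def random_tiny_tree(rng):
    """Hypothetical generator: one function node with two terminal children."""
    terminals = ["x", "y", "1"]
    return (rng.choice(["+", "*"]), rng.choice(terminals), rng.choice(terminals))

rng = random.Random(0)
population = unique_population(lambda: random_tiny_tree(rng), size=10)
print(population)
```

The toy generator can produce only 18 distinct trees, so the attempt cap guards against requesting more individuals than the program space contains.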

If only a small portion of the program space is being represented then the GP algorithm may converge prematurely to a local optimum. However, if a sufficient amount of the program space is represented then there is a greater chance of converging to the global optimum. Incidentally, if the program space being represented is too large, then this can hinder GP’s ability to converge towards the global optimum.

CHAPTER 2. GENETIC PROGRAMMING

Before the initial population generation is described, the term depth needs to be defined. The depth of a node is the distance from the root node to that particular node. The root node has a depth of 1. The maximum depth of a tree is the distance, in terms of the nodes, from the root of a tree to the bottom-most leaf. From figure 2.3, the tree has a maximum depth of 4. When creating the initial population for tree structures, a maximum depth must be specified in order to limit the size of the trees when they are created.

Figure 2.3: Illustrating the depth of each node within a tree.
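The depth definition can be made concrete with a short sketch. The nested-tuple encoding (a function node is a tuple whose first element is the function name, a terminal is a plain string) and the example tree are assumptions made for illustration; the example is not the tree shown in figure 2.3.

```python
def max_depth(tree):
    """Maximum depth of a tree, counting the root as depth 1."""
    if not isinstance(tree, tuple):  # a terminal (leaf)
        return 1
    return 1 + max(max_depth(child) for child in tree[1:])

def node_depths(tree, depth=1):
    """Yield (label, depth) for every node, visiting the root first."""
    if isinstance(tree, tuple):
        yield tree[0], depth
        for child in tree[1:]:
            yield from node_depths(child, depth + 1)
    else:
        yield tree, depth

# Hypothetical tree for the expression (x * (y - 1)) + z.
example = ("+", ("*", "x", ("-", "y", "1")), "z")
print(max_depth(example))
print(list(node_depths(example)))
```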

2.7.1 Full method

Each individual created by the full method has the maximum possible size, in the sense that the distance from the root node to each leaf is equal to the maximum tree depth [3]. When creating a tree using the full method, provided that the maximum depth has not been reached, an element from the function set is always selected.

The leaves are made up of elements of the terminal set. By using the full method, the distance between all of the leaves and the root is equal to the maximum depth, and consequently, all the trees in the population have the same depth. In figure 2.4, the tree on the left illustrates the result of applying the full method when creating a tree.
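A minimal sketch of the full method, under stated assumptions: the function set, the terminal set, and the nested-tuple tree encoding below are hypothetical and chosen only for illustration, and symbols are drawn uniformly at random.

```python
import random

# Hypothetical function set (name -> arity) and terminal set.
FUNCTIONS = {"+": 2, "*": 2, "neg": 1}
TERMINALS = ["x", "y", "1"]

def full(max_depth, rng, depth=1):
    """Build a tree whose leaves all sit exactly at max_depth (root has depth 1)."""
    if depth == max_depth:
        return rng.choice(TERMINALS)  # only terminals at the maximum depth
    name = rng.choice(list(FUNCTIONS))  # above it, always a function
    children = tuple(full(max_depth, rng, depth + 1) for _ in range(FUNCTIONS[name]))
    return (name, *children)

def leaf_depths(tree, depth=1):
    """Return the depth of every leaf in the tree."""
    if not isinstance(tree, tuple):
        return [depth]
    return [d for child in tree[1:] for d in leaf_depths(child, depth + 1)]

tree = full(max_depth=4, rng=random.Random(1))
print(tree)
print(leaf_depths(tree))
```

Every entry printed by `leaf_depths` equals the requested maximum depth, which is the defining property of the full method.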

Depending on the arity of the functions, the trees might not all have the same number of nodes. For instance, for the same maximum depth, a tree created by the full method using only functions of arity 2 will have fewer nodes than a tree created using only functions of arity 3. Regardless of the arity of the functions, all the leaves within all of the trees will be at the same depth.
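The effect of arity on tree size can be quantified with a short worked calculation. It assumes, for simplicity, that every function in a tree has the same arity $a \ge 2$; the symbols $N$, $a$ and $d$ are introduced here and do not appear in the original text. With the root at depth 1 and maximum depth $d$, level $i$ of a full tree holds $a^{i-1}$ nodes, so:

```latex
N(a, d) = \sum_{i=1}^{d} a^{\,i-1} = \frac{a^{d} - 1}{a - 1},
\qquad
N(2, 4) = \frac{2^{4} - 1}{2 - 1} = 15,
\qquad
N(3, 4) = \frac{3^{4} - 1}{3 - 1} = 40 .
```

Under this assumption, a full tree of maximum depth 4 has 15 nodes with arity-2 functions and 40 nodes with arity-3 functions, while in both cases every leaf sits at depth 4.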

A consequence of the full method is that many of the trees in the initial population will have a similar structure, which makes the initial population less diverse. This lack of diversity can result in the GP algorithm searching a restricted area of the program space, thus hindering the performance of the algorithm.

2.7.2 Grow method

The grow method creates trees of different shapes and sizes [3]. When creating a tree using this method, at each depth an element from the terminal set or from the function set can be randomly selected. However, at the maximum depth, only an element from the terminal set can be selected. In figure 2.4, the tree on the right illustrates a tree which was created using the grow method.
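A minimal sketch of the grow method, under stated assumptions: the function and terminal sets are hypothetical, and each symbol is drawn uniformly from the union of the two sets, which is only one of several possible selection rules. This sketch also allows a terminal at the root, so a tree may consist of a single node.

```python
import random

# Hypothetical function set (name -> arity) and terminal set.
FUNCTIONS = {"+": 2, "*": 2, "neg": 1}
TERMINALS = ["x", "y", "1"]

def grow(max_depth, rng, depth=1):
    """Build a tree of at most max_depth, choosing functions or terminals freely."""
    if depth == max_depth:
        return rng.choice(TERMINALS)  # only terminals at the maximum depth
    symbol = rng.choice(list(FUNCTIONS) + TERMINALS)  # assumed uniform choice
    if symbol not in FUNCTIONS:
        return symbol  # a terminal ends this branch early
    children = tuple(grow(max_depth, rng, depth + 1) for _ in range(FUNCTIONS[symbol]))
    return (symbol, *children)

def tree_depth(tree):
    """Maximum depth of a tree, counting the root as depth 1."""
    if not isinstance(tree, tuple):
        return 1
    return 1 + max(tree_depth(child) for child in tree[1:])

rng = random.Random(0)
depths = [tree_depth(grow(4, rng)) for _ in range(20)]
print(depths)
```

The printed depths never exceed the maximum depth of 4, but they are free to fall below it, which is where the variety in shapes and sizes comes from.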

This method benefits from the fact that trees of different sizes are created, which results in greater diversity than the full method. Although the grow method results in greater diversity, it also suffers from the randomness involved in creating the trees, as highlighted by Poli et al. [1]. Consider the 6-multiplexer problem; this problem has 6 input variables. In order to solve this problem using a tree representation, all of the variables should be used. It is possible, due to the randomness involved when creating the nodes, that the trees created using the grow method are not large enough to make use of all the variables.
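The point about tree size can be illustrated with a small check that reports which input variables a tree fails to use. The variable names (`a0`, `a1` for the two address lines and `d0` to `d3` for the four data lines), the `if` function, and the nested-tuple encoding are assumptions made for this sketch.

```python
# Assumed names for the six inputs of the 6-multiplexer.
MUX6_VARIABLES = {"a0", "a1", "d0", "d1", "d2", "d3"}

def terminals(tree):
    """Yield every terminal (leaf label) in the tree."""
    if isinstance(tree, tuple):
        for child in tree[1:]:
            yield from terminals(child)
    else:
        yield tree

def missing_variables(tree, variables=MUX6_VARIABLES):
    """Return the input variables that never appear as a leaf of the tree."""
    return set(variables) - set(terminals(tree))

# A hypothetical small tree, such as the grow method might produce.
small_tree = ("if", "a0", "d0", "d1")
print(sorted(missing_variables(small_tree)))
```

A tree with only three leaves cannot reference six variables, so it cannot represent a complete solution regardless of which functions it contains.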

Figure 2.4: A tree created using the full method (left), and a tree created using the grow method (right).

2.7.3 Ramped half and half

The ramped half and half method creates half of the initial population using the grow method, and the other half using the full method [1]. Let size denote the population size, and let d denote the maximum depth. Trees are created for each depth from 2 up to d, so at each depth a total of size/(d − 1) individuals are to be created. Of these size/(d − 1) individuals, half must be created using the grow method, and the other half using the full method. This method of initial population generation has proved successful and is commonly used [1]. The reason behind this is that genetic diversity is maintained since trees of different shapes and sizes are created [3].

For example, consider a population size of 6 and a maximum depth of 4, which is illustrated in figure 2.5. Half of the population must be created using the grow method, and the other half using the full method. Since size = 6 and d = 4, a total of 6/(4 − 1) = 2 individuals are to be created at each depth. The individuals are created as follows:

• At depth 2: one tree using the grow method, one tree using the full method.

• At depth 3: one tree using the grow method, one tree using the full method.

• At depth 4: one tree using the grow method, one tree using the full method.

In the figure, the trees on the left are those created using the grow method, and the trees on the right were created using the full method. At a maximum depth of 4, the tree created using the grow method is much smaller than the one created using the full method. This is because after the root was set to a function node, the next created node was randomly assigned to an element of the terminal set.

Figure 2.5: Illustrating an initial population created using the ramped half and half method.
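The procedure above can be sketched as follows. Several details are assumptions made for this illustration: the function and terminal sets are hypothetical, the grow variant forces a function at the root and then picks a function or a terminal with equal probability, and the population size must divide evenly so that each depth receives an even number of trees. Duplicates are not filtered in this sketch.

```python
import random

# Hypothetical function set (name -> arity) and terminal set.
FUNCTIONS = {"+": 2, "*": 2}
TERMINALS = ["x", "y", "1"]

def build(max_depth, rng, method, depth=1):
    """Create one tree with the 'full' or 'grow' method (root has depth 1)."""
    if depth == max_depth:
        return rng.choice(TERMINALS)
    # full: always a function; grow: function at the root (assumption),
    # otherwise a function or a terminal with equal probability (assumption).
    pick_function = method == "full" or depth == 1 or rng.random() < 0.5
    if not pick_function:
        return rng.choice(TERMINALS)
    name = rng.choice(list(FUNCTIONS))
    children = tuple(build(max_depth, rng, method, depth + 1) for _ in range(FUNCTIONS[name]))
    return (name, *children)

def tree_depth(tree):
    """Maximum depth of a tree, counting the root as depth 1."""
    if not isinstance(tree, tuple):
        return 1
    return 1 + max(tree_depth(child) for child in tree[1:])

def ramped_half_and_half(size, max_depth, rng):
    """Return (depth, method, tree) triples for depths 2..max_depth."""
    if max_depth < 2:
        raise ValueError("max_depth must be at least 2")
    depths = range(2, max_depth + 1)
    per_depth, remainder = divmod(size, len(depths))  # size / (d - 1)
    if remainder or per_depth % 2:
        raise ValueError("this sketch assumes size/(d - 1) is an even integer")
    population = []
    for d in depths:
        for i in range(per_depth):
            method = "grow" if i % 2 == 0 else "full"  # alternate to split evenly
            population.append((d, method, build(d, rng, method)))
    return population

population = ramped_half_and_half(size=6, max_depth=4, rng=random.Random(0))
for d, method, tree in population:
    print(d, method, tree)
```

With size = 6 and d = 4 this yields one grow tree and one full tree at each of the depths 2, 3 and 4, matching the worked example.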