4. Problems Encountered in the Implementation of GP Systems

4.2.2.6 The Introduction of Dissimilar Individuals into the Population

Keller et al. [KELL96] and Mawhinney [MAWH00] propose that premature convergence can be prevented by introducing new individuals into the population which are dissimilar to the existing elements of the population.

Keller et al. [KELL96] attribute the lack of diversity to the population not representing a sufficient number of the genotypes that make up the genospace for a particular problem, i.e. although there may not be duplicates in the population, the individuals in the population may be similar. They propose an algorithm which employs a diversity checker to determine the diversity of the population after n generations. The diversity checker ascertains how many different genotypes of the genospace are represented by the current population. It uses a structural measure for this purpose. This structural measure determines whether two individuals represent the same genotype by calculating the minimum number of edit operations, e.g. deleting a node or inserting a node, necessary to transform one individual into the other. If the diversity is below a certain threshold, the diversity of the population is restored by replacing all except one occurrence of each genotype in the population with an individual representing a genotype not represented in the population.
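The diversity check described above can be illustrated with a minimal sketch. It is not the algorithm of Keller et al. [KELL96]: it assumes that each individual is flattened to a list of node labels in prefix order and uses the ordinary sequence edit distance in place of a true tree edit distance. The names edit_distance, count_genotypes, diversity and the parameter same_within are hypothetical.

```python
from typing import List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete x
                            curr[j - 1] + 1,            # insert y
                            prev[j - 1] + (x != y)))    # substitute, or match at no cost
        prev = curr
    return prev[-1]


def count_genotypes(population: List[Sequence[str]], same_within: int = 0) -> int:
    """Greedy count of genotypes: an individual within same_within edits of an
    already seen representative is treated as the same genotype (assumption)."""
    representatives: List[Sequence[str]] = []
    for individual in population:
        if not any(edit_distance(individual, rep) <= same_within
                   for rep in representatives):
            representatives.append(individual)
    return len(representatives)


def diversity(population: List[Sequence[str]], same_within: int = 0) -> float:
    """Fraction of the population that represents a distinct genotype."""
    return count_genotypes(population, same_within) / len(population)


population = [["+", "x", "1"], ["+", "x", "1"], ["+", "x", "2"], ["*", "x", "x"]]
print(diversity(population))      # exact duplicates only are merged
print(diversity(population, 1))   # individuals one edit apart are merged as well
```

A system using such a check would compare the returned value with a threshold every n generations and trigger the replacement step when the value falls below it.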

A similar approach is taken by Mawhinney [MAWH00]. The GP system implemented by Mawhinney [MAWH00] replaces a percentage of the most similar individuals in the population by newly created individuals. The system maintains two parameters for this purpose, namely, the percentage of trees to replace and how often to perform the replacement, e.g. once every generation or once every five generations. The system uses the UNIX utility diff to determine how similar two programs are. The similarity of an individual is calculated by applying the diff utility to the individual in question and every other element of the population. The similarity index is the sum of the number of lines output by diff for each comparison. The system implemented by Mawhinney [MAWH00] was applied to two different problem domains. In both cases the system produced more successful runs. It was found that in certain domains the approach taken by Mawhinney [MAWH00] proved to be more computationally expensive than the standard genetic programming system. The high computational effort needed could possibly be attributed to the fact that replacement individuals are merely added to the population without their similarity to the rest of the population being taken into consideration, and hence a replacement individual may be more similar to the population than the original individual.
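The similarity index can be approximated as follows. This is a sketch only: it uses Python's difflib module as a stand-in for the UNIX diff utility and counts changed lines rather than the exact lines diff would print, so the numbers will differ from those of the original system. The function names and the make_individual callback are hypothetical.

```python
import difflib
from typing import Callable, List


def diff_line_count(a: str, b: str) -> int:
    """Number of lines that differ between two program texts (stand-in for diff)."""
    matcher = difflib.SequenceMatcher(a=a.splitlines(), b=b.splitlines(),
                                      autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changed += (i2 - i1) + (j2 - j1)
    return changed


def similarity_indices(population: List[str]) -> List[int]:
    """Sum of diff line counts of each individual against every other individual.
    A low value means the individual is similar to the rest of the population."""
    return [sum(diff_line_count(p, q) for j, q in enumerate(population) if j != i)
            for i, p in enumerate(population)]


def replace_most_similar(population: List[str], fraction: float,
                         make_individual: Callable[[], str]) -> List[str]:
    """Replace the given fraction of the most similar individuals with new ones."""
    n_replace = int(len(population) * fraction)
    indices = similarity_indices(population)
    order = sorted(range(len(population)), key=lambda i: indices[i])
    new_population = list(population)
    for i in order[:n_replace]:
        new_population[i] = make_individual()
    return new_population


programs = ["a\nb\nc", "a\nb\nc", "a\nb\nd", "x\ny\nz"]
print(similarity_indices(programs))
print(replace_most_similar(programs, 0.25, lambda: "p\nq\nr"))
```

Note that, as in the system described above, the sketch does not check whether a replacement is itself similar to the rest of the population.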

4.2.2.7 Niching Methods

Niching methods aim to enable evolutionary algorithms to locate all optima by identifying all peaks or depressions in the fitness landscapes of multimodal fitness functions. In doing so, niching methods also assist the evolutionary algorithm in escaping local optima. According to Mahfoud [MAHF95], niches can be developed in parallel (parallel niching) or sequentially (sequential niching). Niching methods that have been used by genetic algorithm systems include crowding, fitness sharing and sequential niching.

Crowding implements the steady-state control model, but instead of replacing the individual with the worst fitness, crowding replaces the individual which is most similar to the newly created offspring. Each offspring is compared to a subset of the existing population. The size of the subset is specified by the crowding factor (CF). The offspring replaces the element of the subset to which it is most similar. The Hamming distance is used as a similarity measure. A drawback of crowding is that the individual replaced may have a better fitness than its replacement, or may be a near-solution and contain components that are essential to the derivation of a solution. Ryan [RYAN96] and Mahfoud [MAHF95] report that crowding has not proven to be very successful at preventing premature convergence.
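The replacement step of crowding can be sketched as follows, assuming fixed-length chromosomes represented as strings of equal length. The function names are hypothetical, and the subset is drawn uniformly at random, which is an assumption rather than a detail given above.

```python
import random
from typing import List


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length chromosomes differ."""
    return sum(x != y for x, y in zip(a, b))


def crowding_insert(population: List[str], offspring: str,
                    crowding_factor: int, rng=random) -> int:
    """Replace the member of a random subset of size CF that is most similar
    to the offspring. Returns the index of the replaced individual."""
    subset = rng.sample(range(len(population)), crowding_factor)
    victim = min(subset, key=lambda i: hamming(population[i], offspring))
    population[victim] = offspring
    return victim


population = ["0000", "1111", "0011"]
replaced = crowding_insert(population, "1110", crowding_factor=3)
print(replaced, population)
```

The fitness of the replaced individual plays no part in the choice, which is the source of the drawback noted above.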

Fitness sharing algorithms penalise individuals that are phenotypically similar. The raw fitness of phenotypically similar individuals is replaced by a shared fitness. Each group of similar individuals that share a fitness value forms a niche, and each niche represents a peak or depression on the fitness landscape. Beasley et al. [BEAS93] state that fitness sharing assumes that optima are evenly spread over the fitness landscape, and this may result in solution algorithms being missed and the genetic algorithm converging to a local optimum. Furthermore, if the search space has many optima a fairly large population size is needed. According to Beasley et al. [BEAS93], if niches are too small the genetic algorithm is still susceptible to premature convergence. On the other hand, if niches are too large a solution may be missed.
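A minimal sketch of fitness sharing is given below. The text above does not state a formula, so the sketch assumes the widely used formulation in which the raw fitness is divided by a niche count computed with a triangular sharing function, for a maximised fitness and a one-dimensional real-valued phenotype. The names sharing, shared_fitness, sigma_share and alpha are assumptions.

```python
from typing import List


def sharing(distance: float, sigma_share: float, alpha: float = 1.0) -> float:
    """Sharing function: 1 at distance 0, falling to 0 at the niche boundary."""
    if distance < sigma_share:
        return 1.0 - (distance / sigma_share) ** alpha
    return 0.0


def shared_fitness(raw_fitness: List[float], phenotypes: List[float],
                   sigma_share: float, alpha: float = 1.0) -> List[float]:
    """Divide each raw fitness by its niche count (assumes maximisation)."""
    result = []
    for i, fitness in enumerate(raw_fitness):
        # The niche count includes the individual itself, so it is at least 1.
        niche_count = sum(sharing(abs(phenotypes[i] - p), sigma_share, alpha)
                          for p in phenotypes)
        result.append(fitness / niche_count)
    return result


print(shared_fitness([10.0, 10.0, 10.0], [0.0, 0.0, 5.0], sigma_share=1.0))
```

In this example the two individuals with identical phenotypes share their fitness, while the isolated individual keeps its raw value. The parameter sigma_share plays the role of the niche size whose choice is discussed above.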

Sequential niching attempts to locate all the optima of problems using multimodal fitness functions. Information from previous runs is used in the current run in order to prevent the genetic algorithm from revisiting areas of the search space. If the best individual at the end of the run is not a solution algorithm, sequential niching applies a derating function to the raw fitness of the best individual to depress (if the fitness function is being maximised) those areas of the problem space that have already been visited. Sequential niching performs multiple runs until all optima have been located. The fitness function is modified on each iteration. During the first iteration the raw fitness is used to calculate fitness measures for each individual. During successive iterations the fitness function is modified using one of two derating functions. The derating function is defined in terms of the distance between two chromosomes and a niche radius. The distance between the chromosomes is the Euclidean distance between binary vectors. Each peak or depression is represented by a niche and the niche radius determines the boundary of the niche. The niche radius is defined in terms of the dimension of the problem space and an estimate of the number of optima. One of the problems encountered with sequential niching is the choice of the size of the niche radius. If too small a value is chosen for the niche radius, the modified fitness function will converge to a local optimum. On the other hand, if too large a value is chosen a solution may be missed altogether. Other problems encountered with sequential niching include the derivation of inaccurate solutions and the halting of runs prior to the convergence of the genetic algorithm.
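The derating step can be sketched as follows. The text above does not give the formulas, so the power law and exponential forms and the niche radius expression used here are the forms usually associated with sequential niching and should be read as assumptions. The sketch also assumes a maximised fitness, parameters normalised to the range [0, 1], and points represented as tuples of floats. The parameter names alpha and m are hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, ...]


def niche_radius(k: int, p: int) -> float:
    """Radius from the dimension k of the space and the estimated number of optima p."""
    return math.sqrt(k) / (2.0 * p ** (1.0 / k))


def power_law_derating(d: float, r: float, alpha: float = 2.0) -> float:
    """0 at a previously found optimum, rising to 1 at the niche boundary."""
    return (d / r) ** alpha if d < r else 1.0


def exponential_derating(d: float, r: float, m: float = 0.01) -> float:
    """m at a previously found optimum, rising to 1 at the niche boundary."""
    return math.exp(math.log(m) * (r - d) / r) if d < r else 1.0


def modified_fitness(raw: Callable[[Point], float], x: Point,
                     found: Sequence[Point], r: float,
                     derate: Callable[[float, float], float] = power_law_derating
                     ) -> float:
    """Raw fitness multiplied by one derating factor per optimum found so far."""
    value = raw(x)
    for s in found:
        value *= derate(math.dist(x, s), r)
    return value


r = niche_radius(k=1, p=5)
found = [(0.5,)]                      # best individual of an earlier run
flat = lambda x: 1.0                  # toy raw fitness, for illustration only
print(modified_fitness(flat, (0.5,), found, r))
print(modified_fitness(flat, (0.9,), found, r))
```

The sketch makes the sensitivity to the niche radius visible: every point closer than r to an earlier optimum is depressed, so a radius that is too large can hide a neighbouring optimum.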

4.2.2.8 Weighting Fitness Cases

The system implemented by Bersano-Begey [BERS97] escapes from local optima by changing the fitness landscape. The system performs half the runs using the standard fitness function. During these runs the system ascertains which fitness cases individuals have performed poorly on. The rest of the runs are performed with a modified fitness function. The fitness function is adjusted to add a bonus to those fitness cases on which the population has performed poorly. The bonus is multiplied by the number of generations for which few or no individuals have solved the particular fitness case. When applied to the 11-multiplexor problem the system produced more successful runs than the standard genetic programming system.
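The weighting scheme can be illustrated with a small sketch. It is not the implementation of Bersano-Begey [BERS97]: the threshold few, the resetting of the counter once a case is solved by enough individuals, the base score of 1 per solved case, and all function names are assumptions made for illustration.

```python
from typing import List


def update_stale_counts(stale: List[int], solved: List[List[bool]],
                        few: float = 0.1) -> List[int]:
    """solved[i][c] is True if individual i solved fitness case c this generation.
    A case's counter grows while at most a fraction `few` of the population
    solves it, and is reset otherwise (the reset is an assumption)."""
    size = len(solved)
    updated = []
    for case, count in enumerate(stale):
        solved_fraction = sum(row[case] for row in solved) / size
        updated.append(count + 1 if solved_fraction <= few else 0)
    return updated


def weighted_fitness(solved_row: List[bool], stale: List[int],
                     bonus: float = 0.5) -> float:
    """One point per solved case, plus the bonus multiplied by the number of
    generations for which the case has gone largely unsolved."""
    return sum(1.0 + bonus * stale[case]
               for case, ok in enumerate(solved_row) if ok)


stale = [0, 0]
generation = [[True, False], [True, False]]   # nobody solves the second case
stale = update_stale_counts(stale, generation)
stale = update_stale_counts(stale, generation)
print(stale)
print(weighted_fitness([True, True], stale))
```

An individual that finally solves a long-neglected fitness case is thereby rewarded more than one that solves a case most of the population already handles.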

In problems where the error function [KOZA92] is used to calculate an individual's fitness, the fitness function is minimised and in most cases the hits ratio, i.e. the number of fitness cases for which the individual computes the correct value, is zero. Thus, in these cases a direct comparison of how the population performs on each of the fitness cases cannot be implemented. Furthermore, there may not always be a correlation between the population's performance on the fitness cases and those genotypes missing from the population that are crucial to the induction of solution algorithms.

4.2.2.9 The Races Genetic Algorithm (RGA)

The research conducted by Ryan [RYAN96] is aimed at locating all the peaks (when maximising the fitness function) or depressions (when minimising the fitness function) in a search space. Ryan [RYAN96] views the points belonging to each peak or depression as belonging to different races. The search space is divided into a number of real-valued racial perfects. The number of racial perfects is equal to twice the number of estimated peaks or depressions in the search space. The fitness function used evaluates individuals with respect to correctness as well as how close the individual is to an existing racial perfect. The closeness between an individual and a racial perfect is the difference between the racial perfect and the phenotype of the individual.

During the process of initial population generation an individual is created and added to a particular race, depending on its racial perfect. The races algorithm takes a steady-state approach to evolution. Upon randomly selecting a race, an individual is probabilistically chosen from the race. The algorithm employs the meta-strategy described in [RYAN96] to determine whether the individual should inbreed or outbreed. If the individual is to inbreed, an individual from the same race which also wishes to inbreed is selected; otherwise an individual of a different race which also wishes to outbreed is chosen. Each newly created offspring is added to the appropriate race. If a suitable race is not found the offspring is rejected. This process continues until the termination criterion is met.
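The assignment of individuals to races can be sketched as follows. The placement of the racial perfects and the rule for rejecting an offspring are not specified above, so this sketch assumes a one-dimensional real-valued phenotype, evenly spaced racial perfects, and a hypothetical tolerance beyond which no race is considered suitable. The meta-strategy for inbreeding and outbreeding is not modelled.

```python
from typing import List, Optional


def racial_perfects(low: float, high: float, estimated_optima: int) -> List[float]:
    """Twice as many racial perfects as estimated optima, evenly spaced (assumption)."""
    count = 2 * estimated_optima
    step = (high - low) / count
    return [low + step * (i + 0.5) for i in range(count)]


def assign_race(phenotype: float, perfects: List[float],
                tolerance: float) -> Optional[int]:
    """Index of the closest racial perfect, or None if none lies within the
    tolerance, in which case the individual would be rejected."""
    best = min(range(len(perfects)), key=lambda i: abs(perfects[i] - phenotype))
    if abs(perfects[best] - phenotype) <= tolerance:
        return best
    return None


perfects = racial_perfects(0.0, 1.0, estimated_optima=2)
print(perfects)
print(assign_race(0.4, perfects, tolerance=0.1))
print(assign_race(2.0, perfects, tolerance=0.1))
```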

As is the case with sequential niching, one of the drawbacks of the RGA is that the number of optima in the search space must be estimated. Ryan [RYAN96] states that if there are a large number of peaks or depressions in the problem space, or if the estimate of the number of optima is far from the actual number of peaks or depressions, the RGA will experience difficulties.

4.2.2.10 Demes

The genetic programming paradigm has been described as "panmictic" [GRAN00], i.e. evolution is not species-based and all individuals can breed with each other. The use of demes enforces breeding between individuals that belong to the same group (deme). A small percentage of interbreeding is permitted.

Demes have been described as a means of maintaining the genetic diversity of a population [LANG98b]. Demetic groupings consist of similar individuals intra-breeding in groups independently of each other. The immigration operator is used for inter-breeding between demes.
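Mate selection under demetic grouping can be sketched as follows. The sketch assumes uniform random selection within a deme and a hypothetical parameter interbreed_rate that sets the small proportion of matings across demes; a real system would normally select on fitness and may restrict which demes can exchange individuals.

```python
import random
from typing import List, Tuple


def pick_mates(demes: List[List[str]], deme_index: int,
               interbreed_rate: float = 0.05, rng=random) -> Tuple[str, str]:
    """Choose two parents. The first always comes from the given deme; the
    second comes from another deme only with probability interbreed_rate."""
    home = demes[deme_index]
    first = rng.choice(home)
    if len(demes) > 1 and rng.random() < interbreed_rate:
        other = rng.choice([i for i in range(len(demes)) if i != deme_index])
        second = rng.choice(demes[other])
    else:
        second = rng.choice(home)
    return first, second


demes = [["a1", "a2", "a3"], ["b1", "b2", "b3"]]
print(pick_mates(demes, 0))
```

Because the demes breed largely independently, each one can be evolved as a separate task, with occasional exchange of individuals between tasks.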

Manjunath et al. [MANJ97] describe the advantages of demetic grouping as follows:

• A number of areas within a search space are searched at once. This is especially useful when trying to find solutions to difficult problems.
• The use of demes reduces the speed at which the genetic programming algorithm converges and hence prevents premature convergence of the GP system.
• Demes facilitate parallelization.

Furthermore, research conducted by Langdon [LANG95a, LANG98b] has revealed that the division of the population into a number of sub-populations improves the performance of the system. Langdon [LANG95a] attributes this success to the fact that the use of demes promotes the breeding of similar individuals.

Disassortative mating is a variation of the deme approach that is also used to prevent the premature convergence of the genetic programming algorithm to a sub-optimal program [GRAN00]. Disassortative mating involves maintaining two breeding pools. One pool focuses on generating good but lengthy solutions, while the other pool develops poorer but more parsimonious solutions. Migration between pools is allowed.

From the document: An investigation into the use of genetic programming for the induction of novice procedural programming solution algorithms in intelligent programming tutors (pages 109-113)