Evolutionary Data Envelopment Analysis
8.2 The Principal Idea of a Genetic Algorithm
Here the principal idea of a genetic algorithm (GA) and alternative approaches are discussed only to the extent required by what is presented subsequently. Although evolutionary computation is a relatively new area, there exists a vast literature on this subject. Readers interested in the origins of evolutionary computation are advised to refer to Darwin (1859, 1985), Box (1957), Holland (1962, 1975), Rechenberg (1993), De Jong (1975) and Schwefel (1994).
GAs were initially developed by Holland (1975). They provide a search and optimization procedure that is motivated by the principles of natural genetics and natural selection. Some fundamental ideas of genetics are borrowed from the genetic processes of biological organisms. Darwin’s principle of natural selection (‘reproduction and survival of the fittest’) and its adaptation by GAs made this procedure very famous. Although this slogan seems to be slightly tautological in the natural environment (Dawid, 1999), it is very useful in optimization problems, where the fitness is defined as the value of a function to be optimized.
The working principles of GAs are very different from those of most traditional optimization techniques. They transform a population of individual objects, each with an associated fitness value, into a new generation of the population using the Darwinian principle and analogies of naturally occurring genetic operations, such as crossover (sexual recombination) and mutation.
Other biological phenomena, such as the duality of genes or the existence of dominant and recessive genes, have so far (2001) not been considered by evolutionary algorithms. However, the importance of these processes for evolution in nature is still not completely understood in a biological sense, and the choice of the features that are transferred to GAs is due mainly to implementation problems (Dawid, 1999).
1 4Thought, Right Information Systems Ltd, London.
A genetic algorithm is an iterative procedure that operates on a constant-sized population of individuals, each one represented by a finite string of symbols, known as a chromosome, encoding a possible solution in a given problem space. This space, referred to as the search space, comprises all possible solutions to the problem at hand. Each individual in the population represents a possible solution in this search space. The genetic algorithm attempts to find a satisfactory solution to the problem by genetically breeding the population of individuals over a series of many generations. Generally speaking, the genetic algorithm is applied to spaces which are too large to be exhaustively searched.
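The iterative procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the EDEA design: the bit-string coding, binary tournament selection, one-point crossover, bit-flip mutation and all parameter values (`pop_size`, `generations`, `p_crossover`, `p_mutation`) are assumptions chosen for brevity, and the function name `run_ga` is hypothetical.

```python
import random

def run_ga(fitness, n_genes, pop_size=30, generations=60,
           p_crossover=0.8, p_mutation=0.05, seed=0):
    """Minimal generational GA on bit strings (illustrative sketch only).

    fitness is maximized; n_genes must be at least 2.
    """
    rng = random.Random(seed)
    # Constant-sized population of finite symbol strings (chromosomes).
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        def select():
            # Binary tournament: the fitter of two random individuals wins.
            a, b = rng.choice(pop), rng.choice(pop)
            return a if fitness(a) >= fitness(b) else b
        children = []
        while len(children) < pop_size:
            p1, p2 = select()[:], select()[:]      # copy the parents
            if rng.random() < p_crossover:
                cut = rng.randint(1, n_genes - 1)  # one-point crossover
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (p1, p2):
                for i in range(n_genes):
                    if rng.random() < p_mutation:
                        child[i] = 1 - child[i]    # bit-flip mutation
                children.append(child)
        pop = children[:pop_size]                  # keep the population size constant
        best = max(pop + [best], key=fitness)      # remember the best individual seen
    return best

# Toy problem: maximize the number of ones in a 20-bit string.
best = run_ga(sum, n_genes=20)
```

The search space of the toy problem has 2^20 points, yet each generation evaluates only a few dozen of them, which is the sense in which a GA is aimed at spaces too large to search exhaustively.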
An introductory overview of evolutionary computation, including genetic algorithms, evolutionary programming, evolution strategies, genetic classifier systems and genetic programming, is provided by Bäck and Schwefel (1993) and Koza (1997). For GAs, John Holland’s pioneering book Adaptation in Natural and Artificial Systems (1975) showed how the evolutionary process can be applied to solve a wide variety of problems using a highly parallel technique. To this day, though, the main textbook reference for many GA researchers is David Goldberg’s Genetic Algorithms in Search, Optimisation, and Machine Learning (1989). Additional textbook information on GAs can be found in Koza (1992), Kinnebrock (1994), Mitchell (1996), Michalewicz (1996), Banzhaf et al. (1998), Dawid (1999) and Michalewicz and Fogel (2000).
8.2.1 GA approaches for constraint handling
Several approaches have been proposed for solving general non-linear programming problems through GAs (Powell and Skolnick, 1993; Joines and Houck, 1994; Michalewicz and Schönauer, 1996). Most of them are based on the concept of penalty functions, which penalize infeasible solutions. Although several ideas have been proposed for the design of the penalty function, this method has several drawbacks, which led to disappointing results in several experiments, as pointed out by Michalewicz and Schönauer (1996), Michalewicz (1996) and Michalewicz and Fogel (2000).
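To make the penalty idea concrete, the following sketch wraps a minimization objective with a static quadratic penalty. It is only one of many possible designs and is not taken from any of the cited papers; the function name `penalized_fitness`, the convention that a constraint `g` is satisfied when `g(x) <= 0`, and the coefficient `weight` are assumptions made for the example.

```python
def penalized_fitness(objective, constraints, weight=1000.0):
    """Wrap a minimization objective with a static penalty term.

    constraints: functions g, where g(x) <= 0 means "feasible".
    weight: hypothetical penalty coefficient that has to be tuned per problem.
    """
    def evaluate(x):
        # Sum of squared constraint violations; zero for feasible points.
        violation = sum(max(0.0, g(x)) ** 2 for g in constraints)
        return objective(x) + weight * violation
    return evaluate

# Example: minimize x0 + x1 subject to x0 * x1 >= 1.
f = penalized_fitness(lambda x: x[0] + x[1],
                      [lambda x: 1.0 - x[0] * x[1]])

feasible_value = f([1.0, 1.0])    # no violation, so only the objective counts
infeasible_value = f([0.5, 0.5])  # violation of 0.75 is squared and weighted
```

The sketch also hints at one commonly cited difficulty: the result depends on `weight`. If it is too small, infeasible points can outscore feasible ones; if it is too large, the search is pushed away from the constraint boundary, where the optimum often lies.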
The EDEA design is a revised version of Michalewicz’s GENOCOP system, a GA for solving general linear programming problems by avoiding the drawbacks of the penalty methodology. One of the distinct characteristics of GENOCOP is its floating-point representation of the chromosomes. One of the drawbacks of traditional GAs using a (binary) coding scheme is that a proper coding has to be devised for the problem to be addressed. When using GENOCOP to solve multiparameter optimization problems, this is not needed, as in this algorithm a string is composed of a set of real values. DEA parameters are numeric, so representing them directly as numbers, rather than bit-strings, seems obvious and may have advantages. Janikow and Michalewicz (1991) made a direct comparison between binary and floating-point representations, and found that the floating-point version gave faster, more consistent and more accurate results.2
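The difference between the two codings can be illustrated as follows. The helper names (`decode_binary`, `binary_resolution`, `arithmetic_crossover`) are hypothetical and the code is a sketch of the general idea, not GENOCOP's actual implementation. Arithmetic crossover is shown as one common operator for real-coded chromosomes; it forms convex combinations of the parents, so two parents that satisfy a set of linear constraints produce children that satisfy them as well.

```python
def decode_binary(bits, low, high):
    """Map a bit string to a real value in [low, high]."""
    as_int = int("".join(str(b) for b in bits), 2)
    return low + (high - low) * as_int / (2 ** len(bits) - 1)

def binary_resolution(n_bits, low, high):
    """Smallest step that a binary gene of n_bits can represent on [low, high]."""
    return (high - low) / (2 ** n_bits - 1)

def arithmetic_crossover(parent_a, parent_b, alpha):
    """Blend two real-valued chromosomes gene by gene (0 <= alpha <= 1)."""
    child_1 = [alpha * a + (1 - alpha) * b for a, b in zip(parent_a, parent_b)]
    child_2 = [(1 - alpha) * a + alpha * b for a, b in zip(parent_a, parent_b)]
    return child_1, child_2

# Binary coding: precision is fixed by the string length and the domain width.
step = binary_resolution(10, 0.0, 1023.0)

# Real coding: each gene is the parameter itself, no decoding step is needed.
children = arithmetic_crossover([0.0, 10.0], [4.0, 2.0], alpha=0.25)
```

With binary coding, widening the domain at a fixed string length coarsens the representable step, which is one way to read the remark about large domains; a real-valued gene carries the precision of the floating-point type regardless of the domain.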
With GENOCOP, Michalewicz was the first to show that the floating-point representation in a GA implementation can be faster and more consistent from run to run, and can provide higher precision, especially in large domains where binary coding is rather inefficient.
In 1992, when Michalewicz introduced his original system, he also introduced a general non-linear programming version, GENOCOP II. Later, in GENOCOP III he incorporated the original GENOCOP system for linear constraints, but extended it by maintaining two separate populations, where a development in one population influences evaluations of individuals in the other population (Michalewicz and Nazhiyath, 1995; Michalewicz and Schönauer, 1996). GENOCOP III was recently tested by Sakawa and Yauchi (1999) for multiobjective non-convex programming problems. Taplin and Qiu (1997) were the first to apply the original GENOCOP system to a tourism research application.
8.2.2 GA extensions for multiobjective problems
The Vector Evaluated Genetic Algorithm (VEGA), an early application of GAs to multiobjective optimization by Schaffer (1984, 1985), opened new possibilities of research in this field. In his work, Schaffer tried to capture all Pareto-optimal solutions of a multiobjective optimization problem. His main idea was to divide the population into equal-sized subpopulations, each subpopulation responsible for a single objective. The selection procedure was performed independently for each objective, but crossover was performed across subpopulation boundaries. Additional heuristics were developed to decrease a tendency of the system to converge towards individuals which were not best with respect to any of the objectives.
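The selection step of such a scheme might look like the sketch below. Only the two properties stated above are taken from the text: selection runs independently per objective on equal-sized shares, and the shares are then mixed so that crossover can pair parents across subpopulation boundaries. The use of fitness-proportional selection, the shift that keeps the weights positive, the assumption that objectives are maximized, and the function name `vega_mating_pool` are choices made for this illustration.

```python
import random

def vega_mating_pool(population, objectives, rng):
    """Build a VEGA-style mating pool (objectives are maximized).

    The pool is filled in equal-sized shares, one per objective; each share
    is chosen by proportional selection on that objective alone.
    """
    share = len(population) // len(objectives)
    pool = []
    for f in objectives:
        scores = [f(ind) for ind in population]
        shift = min(scores)
        weights = [s - shift + 1e-9 for s in scores]  # keep weights positive
        pool.extend(rng.choices(population, weights=weights, k=share))
    rng.shuffle(pool)  # mixing lets crossover pair parents across shares
    return pool

# Toy bi-objective problem: maximize both coordinates of a point.
rng = random.Random(1)
population = [(x, 10 - x) for x in range(10)]
pool = vega_mating_pool(population, [lambda p: p[0], lambda p: p[1]], rng)
```

Because each share rewards excellence in one objective only, compromise solutions that are good but not best in any single objective receive little selection pressure unless further heuristics are added.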
This new special research area within GA was extensively discussed and extended by many other researchers (e.g. Horn et al., 1994; Fonseca and Fleming, 1995; Srinivas and Deb, 1995; Weile et al., 1996; Cheng and Li, 1997; Zhou and Gen, 1999). Recently, Loughlin and Ranjithan (1997) presented a new multiobjective genetic algorithm called the neighbourhood constraint method (NCM) that uses a combination of a neighbourhood selection technique and location-dependent constraints. The authors demonstrate the NCM for complex, real-world problems, and their results show that NCM performs better than several other techniques including integer programming, single-objective GA and implementations of a Pareto and hybrid niched-Pareto multiobjective GA. However, neither Loughlin and Ranjithan (1997) nor any other author working in the field of multiobjective GA has so far dealt with the measurement of efficiency for a specific object in a given data set.3 This alternative way of identifying Pareto-optimal solutions will not be discussed here. Nevertheless, it must be admitted that such new research initiatives may provide very different alternative approaches for identifying comparative partners in the future.
2 In GA-digest Volume 6 number 32 (September 1992), the editor, Alan C. Schultz, lists various research using non-binary representations.