Indexed Memory - An investigation into the use of genetic programming for the induction of novi


3.1.1.1 Indexed Memory

According to Bruce [BRUC95], Teller [TELL94a] and Langdon [LANG98b], indexed memory is commonly represented as an integer array indexed over integers. Two functions used to access memory, read and write, are added to the function set. The read function takes one argument, the index into the array, and returns the element stored at that index.

The write function takes two arguments: an index and an integer value to be stored in the array at that index. It returns the value currently stored at the index and then replaces it with the integer value passed to it.
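The read and write primitives described above can be sketched as follows. This is a minimal illustration rather than any of the cited implementations; the class name `IndexedMemory` and the choice to initialise every cell to zero are assumptions made for the example.

```python
class IndexedMemory:
    """Integer array indexed over integers, with read and write primitives."""

    def __init__(self, size):
        # Assumption: every cell starts at zero.
        self.cells = [0] * size

    def read(self, index):
        # Return the element stored at the given index.
        return self.cells[index]

    def write(self, index, value):
        # Return the value currently stored at the index, then overwrite it.
        previous = self.cells[index]
        self.cells[index] = value
        return previous


memory = IndexedMemory(size=4)
first = memory.write(2, 7)   # cell 2 held 0 and now holds 7
second = memory.write(2, 9)  # cell 2 held 7 and now holds 9
```

Because write returns the old contents, it can appear anywhere in an expression tree where an integer is expected, just like read.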

Teller [TELL94a] uses indexed memory as a means of enabling the GP system to save past inputs and utilize them in processes at a later stage. In his paper "The Evolution of Mental Models", Teller [TELL94c] describes how indexed memory can be incorporated into the GP system. In that study each individual in the population consists of a tree as well as an array of elements indexed from 0 to M-1, where M is the number of memory elements. The elements of the array are integers in the range 0 to M-1. The read and write functions described above are added to the function set. The terminals are constants between 0 and M-1 and the variables representing the system inputs. All the functions in the function set are defined to return integers in the range 0 to M-1. This ensures that each value computed is a legal memory index.

Thus, a wrapper is needed in this implementation.
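One way to keep every computed value inside the range 0 to M-1 is to reduce each result modulo M. The text does not say how the functions in Teller's system enforce the range, so the modulo step below is an assumption, as are the helper names `to_range`, `add` and `sub` and the choice of M = 20 for the illustration.

```python
M = 20  # number of memory elements, chosen here only for illustration

memory = [0] * M


def to_range(value):
    # Assumption: fold any integer into 0..M-1 using modulo arithmetic.
    return value % M


def add(a, b):
    return to_range(a + b)


def sub(a, b):
    return to_range(a - b)


def read(index):
    return memory[to_range(index)]


def write(index, value):
    index = to_range(index)
    previous = memory[index]
    memory[index] = to_range(value)
    return previous


# Index: (15 + 9) % 20 = 4. Value: (3 - 5) % 20 = 18.
old = write(add(15, 9), sub(3, 5))
```

With this convention any subtree can serve as a memory index, since no function can produce a value outside the array bounds.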

In the study conducted by Langdon [LANG98b] to induce abstract data types, indexed memory consists of 63 memory elements. If an individual tries to access a memory element that is not within the range -31 to 31, the program is aborted and penalized accordingly. Such an individual is not tested any further.
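The abort-and-penalize policy can be pictured as below. This is a sketch under stated assumptions, not Langdon's implementation: the size of the penalty, the scoring scheme and all names (`BoundedMemory`, `MemoryAccessError`, `evaluate`, `ABORT_PENALTY`) are hypothetical. Only the address range -31 to 31 and the rule that testing stops after an illegal access come from the text.

```python
LOW, HIGH = -31, 31   # legal addresses: 63 cells in total
ABORT_PENALTY = 10    # hypothetical penalty; the source gives no value


class MemoryAccessError(Exception):
    """Raised when a program addresses a cell outside LOW..HIGH."""


class BoundedMemory:
    def __init__(self):
        self.cells = [0] * (HIGH - LOW + 1)

    def _slot(self, index):
        if not LOW <= index <= HIGH:
            raise MemoryAccessError(index)
        return index - LOW  # shift -31..31 onto 0..62

    def read(self, index):
        return self.cells[self._slot(index)]

    def write(self, index, value):
        slot = self._slot(index)
        previous = self.cells[slot]
        self.cells[slot] = value
        return previous


def evaluate(program, test_cases):
    """Return (score, aborted). Testing stops at the first illegal access."""
    memory = BoundedMemory()
    score = 0
    for case in test_cases:
        try:
            score += program(memory, case)
        except MemoryAccessError:
            return score - ABORT_PENALTY, True
    return score, False


def well_behaved(memory, case):
    memory.write(case, 1)
    return 1


def faulty(memory, case):
    return memory.read(case * 40)  # address 40 is out of range when case is 1


good_result = evaluate(well_behaved, [0, 1, 2])
bad_result = evaluate(faulty, [0, 1, 2])
```

Here the faulty program passes its first test case, is aborted on the second, and never sees the third.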

Experiments conducted by Teller [TELL93] indicate that systems using indexed memory performed better than systems not using indexed memory in solving the Tartarus problem¹¹. In this particular problem the number of memory elements M was chosen to be 20. The effect of choosing smaller or larger values of M on the performance of the system was also examined by Teller [TELL93]. Values less than eight or nine and values greater than 40 or 50 resulted in a degradation of system performance. Teller [TELL93] explains that with a smaller number of memory elements there is insufficient memory, while with a larger number there may be too many elements, so that memory elements which are written to may never be accessed again.

¹¹ The Tartarus problem involves an agent that is presented with the task of pushing all the boxes from the centre of the grid to the perimeter of the grid. The agent is awarded two points for every box that is pushed into a corner and one point for each box that is pushed into an edge position.

3.1.1.2 Data Structures

According to Langdon [LANG98b], data abstraction is essential to enable genetic programming to generate solutions to more complex problems. Studies conducted by Bruce [BRUC95] and Langdon [LANG98b] have revealed that genetic programming is capable of generating methods for abstract data types (ADTs). Furthermore, studies conducted by Langdon [LANG98b] have illustrated that data abstraction is more effective than indexed memory in solving certain problems.

The research conducted by Bruce [BRUC95] involves applying genetic programming to the induction of methods for integer array-based stack, queue, and priority queue abstract data types.

Essentially five methods were induced for each data structure: a constructor, a method to determine whether the data structure is empty, a method to determine whether the data structure is full, a method to add a data element to the data structure, and a method to remove an element from the data structure. Bruce [BRUC95] found that GP could generate the methods of the stack and queue individually. However, only four of the five methods were correctly induced for the priority queue.

In the strongly-typed GP system that was implemented, all the methods were induced in this experiment.

Bruce [BRUC95] is of the opinion that this indicates the advantage that a strongly-typed genetic programming system has over an untyped genetic programming system. None of the experiments conducted by Bruce [BRUC95] to simultaneously induce methods were successful. Bruce [BRUC95] attributes this failure to the fact that the induction problem was too difficult to solve given the limited population size and number of generations.
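The five methods targeted for each data structure can be pictured with a hand-written, integer array-based stack. This is a reference sketch of the kind of behaviour the induced methods are meant to reproduce, not an evolved program and not Bruce's code; the class and method names are assumptions, as is the decision to raise an error on overflow and underflow.

```python
class ArrayStack:
    """Hand-written integer array-based stack with the five ADT methods."""

    def __init__(self, capacity):
        # Constructor: allocate the array and mark the stack as empty.
        self.items = [0] * capacity
        self.top = 0  # index of the next free slot

    def is_empty(self):
        return self.top == 0

    def is_full(self):
        return self.top == len(self.items)

    def push(self, value):
        # Add a data element (assumption: overflow raises an error).
        if self.is_full():
            raise OverflowError("stack is full")
        self.items[self.top] = value
        self.top += 1

    def pop(self):
        # Remove an element (assumption: underflow raises an error).
        if self.is_empty():
            raise IndexError("stack is empty")
        self.top -= 1
        return self.items[self.top]


stack = ArrayStack(capacity=2)
stack.push(1)
stack.push(2)
full_after_two = stack.is_full()
popped = [stack.pop(), stack.pop()]
```

Note that the methods are coupled through the shared `items` array and `top` pointer, so a push that is induced in isolation is only useful if the other methods treat the pointer in the same way.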

However, the GP system implemented by Langdon [LANG98b] was able to successfully induce methods for the stack, circular queue and list abstract data types. The methods for each data structure were induced simultaneously. A multi-tree structure was used to represent each program.

Five methods each were induced for the stack and queue ADTs. Indexed memory formed the basis of the stack. Ten methods for an integer linked list were simultaneously induced. Thus, in this case each chromosome was composed of ten genes. The data structures induced by Langdon's [LANG98b] system did not cater for stack, queue or list overflow and underflow.

The data structures generated in the studies conducted by Langdon [LANG98b] were then used in the generation of solutions to the Dyck Language problem and the Reverse Polish Expression Evaluation problem. The results obtained were compared with those obtained using a GP system in which data abstraction was replaced by indexed memory. All the runs using data abstraction correctly evolved solutions to the Dyck Language problem. However, none of the runs using indexed memory generated a solution. Similar results were obtained for the Reverse Polish Expression Evaluation problem. Thus, the problems appeared to be more difficult to solve when using indexed memory than when using data abstraction.
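A hand-written recognizer shows why a stack abstraction fits the Dyck Language problem, which asks whether a string of brackets is correctly balanced and nested. The sketch below is not an evolved solution; the function name `is_dyck` and the restriction to two bracket types are assumptions made for the example.

```python
# Assumption: two bracket types; each closing symbol maps to its opener.
PAIRS = {")": "(", "]": "["}


def is_dyck(sentence):
    """Return True if every bracket is closed in the correct nested order."""
    stack = []
    for symbol in sentence:
        if symbol in PAIRS.values():
            stack.append(symbol)       # opener: push
        elif symbol in PAIRS:
            # closer: the most recent unmatched opener must be its partner
            if not stack or stack.pop() != PAIRS[symbol]:
                return False
        else:
            return False               # not a bracket at all
    return not stack                   # no opener may be left unmatched


results = [is_dyck(s) for s in ["([])[]", "([)]", "(()", ""]]
```

With push and pop available as primitives the program only has to decide when to call them. With raw indexed memory it must also discover how to maintain a stack pointer, which is consistent with the greater difficulty reported for the indexed-memory runs.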

Based on experiments conducted by Langdon [LANG96b] and Bruce [BRUC95], it is evident that genetic programming is unable to automatically evolve its own data structures as needed while simultaneously inducing the solution to a problem. Thus, abstract data types must be developed first and then form high-level members of the function set used to induce the solution to the overall problem.

The results obtained by Bruce [BRUC95] and Langdon [LANG98b] illustrate the ability of genetic programming to generate abstract data types and the effectiveness of incorporating the use of data abstraction into a GP system.

However, the appropriateness of using abstract data types is problem dependent.

3.1.1.3 Automatically Defined Storage

Instead of presetting an architecture incorporating the use of memory, the architecture of an individual's memory structure can be automatically evolved using automatically-defined storage.

Koza [KOZA99] proposes the use of automatically defined stores (ADSs) as a means of automatically determining:

• The amount of internal memory that should be used.

• The type of internal memory to be used. The different types of memory include named memory, indexed memory, n-dimensional arrays where n is greater than or equal to two, and different data structures, e.g. stacks, queues and lists.

• The dimensionality of the memory structures.

• How the memory is utilized.

The use of ADSs is incorporated into a system by adding two branches to each individual. One of the branches is a storage writing branch (SWB) while the other is a storage reading branch (SRB).

The functions of the SWB and SRB are analogous to those of the write and read commands described above. Each ADS consists of a unique name, size, type and dimensionality. Table 3.3.1.1.3.1 lists each type of memory and its corresponding dimensionality. The type of the ADS is randomly chosen from this list of preset categories.

Dimension   Types
0           Named memory, pushdown stack, queue
1           Indexed memory, list
2           Two-dimensional memory, relational memory
3           Three-dimensional array
4           Four-dimensional array

Table 3.3.1.1.3.1: Memory types and dimensions

The dimensionality of the memory structure affects the number of arguments each SWB and SRB takes. This is type-dependent. For example, if the memory type is named memory, then the SWB will have one argument and the SRB will have no arguments. Similarly, if the memory represented by an ADS is indexed memory, the SWB will have two arguments and the SRB one argument. The number of elements of each indexed memory store and array-based store is chosen randomly. Each SWB and SRB in an ADS is also given a unique name, e.g. SRB0 and SWB0. An example of an individual containing an ADS structure is illustrated in Figure 3.3.1.1.3.1.

Figure 3.3.1.1.3.1: ADS Example
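The creation of a single ADS can be sketched as follows. The type list follows Table 3.3.1.1.3.1, but several details are assumptions made for the illustration: the arity rule (SRB takes as many arguments as the dimensionality, SWB one more) generalises the two cases given in the text and may not hold for every type, the size bounds are arbitrary, and the names `ADS`, `create_ads` and `max_size` are hypothetical.

```python
import random
from dataclasses import dataclass

# Memory types grouped by dimensionality, following Table 3.3.1.1.3.1.
TYPES_BY_DIMENSION = {
    0: ["named memory", "pushdown stack", "queue"],
    1: ["indexed memory", "list"],
    2: ["two-dimensional memory", "relational memory"],
    3: ["three-dimensional array"],
    4: ["four-dimensional array"],
}


@dataclass
class ADS:
    name: str
    kind: str
    dimension: int
    size: int
    swb_name: str
    srb_name: str
    swb_arity: int
    srb_arity: int


def create_ads(number, rng, max_size=8):
    """Randomly pick a memory type and build the descriptor for one ADS."""
    choices = [(d, k) for d, kinds in TYPES_BY_DIMENSION.items() for k in kinds]
    dimension, kind = rng.choice(choices)
    # Assumption: named memory is a single cell; other sizes are random.
    size = 1 if kind == "named memory" else rng.randint(2, max_size)
    return ADS(
        name=f"ADS{number}",
        kind=kind,
        dimension=dimension,
        size=size,
        swb_name=f"SWB{number}",
        srb_name=f"SRB{number}",
        swb_arity=dimension + 1,  # assumed rule: one value plus the address
        srb_arity=dimension,      # assumed rule: the address only
    )


ads = create_ads(0, random.Random(0))
```

Under this rule named memory yields a one-argument SWB and a zero-argument SRB, and indexed memory yields a two-argument SWB and a one-argument SRB, which matches the two examples in the text.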

The function performed by the SWB and SRB branches is dependent on the type of internal memory represented by the ADS. For example, if the ADS represents named memory, SRB 1 takes no arguments and reads the contents of the first memory cell, while SWB 1 writes the result of evaluating its argument to the first cell and returns the previous contents of the cell.

However, if the ADS represents indexed memory, then the SWB takes two arguments: the subtree representing the value to be written to the cell, and the address of the cell, the contents of which are returned.
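The difference between the two kinds of branch can be sketched with closures over a private store. This is an illustration only: the factory names are hypothetical, the indexed SWB is assumed to return the previous contents of the cell by analogy with the write function, and addresses are assumed to wrap around modulo the store size, which the text does not specify.

```python
def make_named_memory():
    """Return (swb, srb) for a single named memory cell."""
    cell = [0]

    def srb():
        # No arguments: read the contents of the cell.
        return cell[0]

    def swb(value):
        # One argument: write it and return the previous contents.
        previous = cell[0]
        cell[0] = value
        return previous

    return swb, srb


def make_indexed_memory(size):
    """Return (swb, srb) for an indexed memory store with `size` cells."""
    cells = [0] * size

    def srb(address):
        return cells[address % size]  # assumption: addresses wrap around

    def swb(value, address):
        # Two arguments: the value to write and the cell address.
        address %= size
        previous = cells[address]     # assumption: old contents are returned
        cells[address] = value
        return previous

    return swb, srb


swb1, srb1 = make_named_memory()
named_old = swb1(5)

swb0, srb0 = make_indexed_memory(4)
indexed_old = swb0(9, 2)
wrapped_old = swb0(3, 6)  # address 6 wraps to cell 2
```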

ADSs can be incorporated into the population by specifying an architecture for each individual which includes ADSs during the process of initial population creation. Alternatively, architecture-altering operators can be used for this purpose. The latter option is discussed in detail in section 3.4.

Source document: An investigation into the use of genetic programming for the induction of novice procedural programming solution algorithms in intelligent programming tutors (pages 81-84).