CHAPTER 11. RESULTS AND DISCUSSION

Data set       Arithmetic-   Arithmetic-   Logical-        Logical-           Decision
               without-if    with-if       with-between    without-between    trees
Climate        26.88         28.42         33.94           37.72              35.32
Fertility      26.64         28.34         28.36           21.54              19.76
Ionosphere     35.56         37.10         45.18           41.90              25.04
Mammographic   30.32         37.38         46.40           45.26              44.64
Monk2          43.48         70.30         30.58           37.24              65.66
Parkinsons     29.12         33.12         26.68           21.04              26.00
Pima Indians   41.08         49.20         31.90           43.14              51.24
Sonar          30.28         38.48         50.36           53.30              47.60
Spectf         33.16         43.46         45.72           36.68              38.60
WDBC           25.72         32.70         30.84           30.06              28.84
Average        32.22         39.85         37.00           36.79              38.27
Table 11.9: Average size (number of nodes) of the different representations. The smallest result for each data set is highlighted in bold.
Arithmetic-with-if resulted in the largest average size, with a value of 39.85 nodes. The smallest average size, 32.22 nodes, was obtained by arithmetic-without-if. With the if statement included, the average tree size was larger on every data set than without it, with an average increase of 7.63 nodes. Although the if statement increased the overall complexity of the trees, this did not hinder the performance of the representation, since there was an average improvement in both training and testing when compared to arithmetic-without-if. Including the between statement did not increase the average size of the trees as drastically as including the if statement: adding the between statement increased the average tree size by only 0.21 nodes.
Decision trees resulted in the second largest average size, with an average of 38.27 nodes, which is 6.05 nodes larger than arithmetic-without-if. Despite the decision tree representation being larger than arithmetic-without-if, it obtained a higher average performance.
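The size differences quoted above follow directly from the column averages of Table 11.9. The short sketch below recomputes them from the table entries; the names `SIZES`, `COLUMNS` and `column_means` are illustrative only and are not part of the original experimental code.

```python
# Average tree sizes (number of nodes) transcribed from Table 11.9.
COLUMNS = (
    "arithmetic-without-if",
    "arithmetic-with-if",
    "logical-with-between",
    "logical-without-between",
    "decision-trees",
)
SIZES = {
    "Climate":      (26.88, 28.42, 33.94, 37.72, 35.32),
    "Fertility":    (26.64, 28.34, 28.36, 21.54, 19.76),
    "Ionosphere":   (35.56, 37.10, 45.18, 41.90, 25.04),
    "Mammographic": (30.32, 37.38, 46.40, 45.26, 44.64),
    "Monk2":        (43.48, 70.30, 30.58, 37.24, 65.66),
    "Parkinsons":   (29.12, 33.12, 26.68, 21.04, 26.00),
    "Pima Indians": (41.08, 49.20, 31.90, 43.14, 51.24),
    "Sonar":        (30.28, 38.48, 50.36, 53.30, 47.60),
    "Spectf":       (33.16, 43.46, 45.72, 36.68, 38.60),
    "WDBC":         (25.72, 32.70, 30.84, 30.06, 28.84),
}


def column_means(table):
    """Return {column name: mean over all data sets}."""
    n = len(table)
    return {
        name: sum(row[i] for row in table.values()) / n
        for i, name in enumerate(COLUMNS)
    }


means = column_means(SIZES)
# Effect of adding the if statement to the arithmetic representation.
if_increase = means["arithmetic-with-if"] - means["arithmetic-without-if"]
# Effect of adding the between statement to the logical representation.
between_increase = means["logical-with-between"] - means["logical-without-between"]
# Decision trees compared with the smallest representation.
dt_increase = means["decision-trees"] - means["arithmetic-without-if"]
# True when the with-if trees are larger on every single data set.
if_always_larger = all(row[1] > row[0] for row in SIZES.values())

print(f"if: +{if_increase:.2f} nodes")
print(f"between: +{between_increase:.2f} nodes")
print(f"decision trees vs arithmetic-without-if: +{dt_increase:.2f} nodes")
print(f"with-if larger on every data set: {if_always_larger}")
```

Because the inputs are the rounded table entries, the recomputed differences can deviate from the reported ones in the last decimal place.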
Data sets       Standard GP without   GP encapsulation   Selective
                encapsulation         method             encapsulation
Climate         96.86                 96.88 †            96.71 †
Ecoli           89.63                 90.67 **           90.26 **
Fertility       95.58                 95.33 **           94.76 **
Glass           78.31                 81.22 **           79.58 **
Ionosphere      96.42                 98.17 **           97.37 **
Iris            99.21                 99.57 **           99.61 **
Pima Indians    81.92                 83.35 **           82.15 **
Sonar           94.70                 96.45 **           94.99 **
Spectf          91.15                 93.20 **           91.91 **
WDBC            97.52                 97.93 **           97.64 †
Wine            99.63                 99.63 †            99.46 †
Yeast           59.03                 61.55 **           61.01 **
Average         90.00 ± 11.83         91.16 ± 11.10      90.45 ± 11.28
Table 11.10: Training accuracy (%) results for standard GP and the two proposed encapsulation methods. For each data set, the best result is highlighted in bold, and the results for the encapsulation methods were statistically tested against the results obtained by standard GP. A “**” indicates that the result is statistically significant when compared to the result obtained by standard GP without encapsulation. A “†” indicates a statistically insignificant result compared to standard GP without encapsulation.
In terms of training, the standard GP algorithm obtained the lowest overall accuracy and the highest standard deviation, but obtained the best result on 2 data sets, namely Fertility and Wine. On the Fertility data set, the standard GP result was statistically significant when compared to the encapsulation methods; on the Wine data set, however, the best result was statistically insignificant in comparison to the other approaches. The GP encapsulation method obtained an overall average of 91.16% across all the training data sets, an improvement of 1.16 percentage points over standard GP, and it also obtained the lowest standard deviation on the training data. Furthermore, this method obtained statistically significant results on 10 data sets when compared to standard GP, 8 of which were the best result. On the remaining 2 data sets, Climate and Wine, it also achieved the best results, but these were statistically insignificant.
Data sets       Standard GP without   GP encapsulation   Selective
                encapsulation         method             encapsulation
Climate         90.78                 90.70 †            90.78 †
Ecoli           82.83                 82.48 †            82.55 †
Fertility       81.00                 79.60 **           80.80 †
Glass           63.17                 64.92 †            64.17 †
Ionosphere      90.93                 90.42 †            90.32 †
Iris            95.07                 94.67 †            94.67 †
Pima Indians    73.62                 73.02 †            72.78 **
Sonar           73.00                 72.41 †            72.41 †
Spectf          76.58                 75.98 †            77.11 †
WDBC            93.99                 94.06 †            94.16 †
Wine            89.41                 90.94 †            90.09 †
Yeast           55.69                 56.12 †            56.62 **
Average         80.51 ± 12.54         80.44 ± 12.36      80.54 ± 12.28
Table 11.11: Test accuracy (%) results for standard GP and GP with the two proposed encapsulation methods. For each data set, the best result is highlighted in bold, and the results for the encapsulation methods were statistically tested against the results obtained by standard GP. A “**” indicates that the result is statistically significant when compared to the result obtained by standard GP without encapsulation. A “†” indicates a statistically insignificant result compared to standard GP without encapsulation.
Selective encapsulation made use of the maintained list, and this approach obtained an overall training average of 90.45%, which is 0.45 percentage points better than standard GP. Selective encapsulation obtained the best result on the Iris data set, and this result was statistically significant when compared to standard GP. Although selective encapsulation did not obtain the best result on as many data sets as the GP encapsulation method, it obtained a statistically significant result on a total of 9 data sets when compared to standard GP. The findings reveal that both of the GP encapsulation methods obtain a better average training accuracy than standard GP.
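The averages and spreads discussed for the training data can be recomputed from the entries of Table 11.10. The sketch below does so with Python's standard library. It assumes that the ± value reported in the table is the sample standard deviation over the 12 per-data-set averages; the text does not state this, so it is an assumption, as are the illustrative names `TRAIN`, `METHODS` and `summarise`.

```python
from statistics import mean, stdev

# Training accuracy (%) per data set, transcribed from Table 11.10.
# Tuple order follows METHODS.
METHODS = ("standard GP", "GP encapsulation", "selective encapsulation")
TRAIN = {
    "Climate":      (96.86, 96.88, 96.71),
    "Ecoli":        (89.63, 90.67, 90.26),
    "Fertility":    (95.58, 95.33, 94.76),
    "Glass":        (78.31, 81.22, 79.58),
    "Ionosphere":   (96.42, 98.17, 97.37),
    "Iris":         (99.21, 99.57, 99.61),
    "Pima Indians": (81.92, 83.35, 82.15),
    "Sonar":        (94.70, 96.45, 94.99),
    "Spectf":       (91.15, 93.20, 91.91),
    "WDBC":         (97.52, 97.93, 97.64),
    "Wine":         (99.63, 99.63, 99.46),
    "Yeast":        (59.03, 61.55, 61.01),
}


def summarise(table):
    """Return {method: (mean, sample standard deviation)} over all data sets."""
    summary = {}
    for i, method in enumerate(METHODS):
        column = [row[i] for row in table.values()]
        # stdev() is the sample (n - 1) standard deviation: an assumption here.
        summary[method] = (mean(column), stdev(column))
    return summary


summary = summarise(TRAIN)
baseline_mean = summary["standard GP"][0]
# Gain of each encapsulation method over standard GP, in percentage points.
gains = {m: summary[m][0] - baseline_mean for m in METHODS[1:]}

for method, (avg, sd) in summary.items():
    print(f"{method}: {avg:.2f} +/- {sd:.2f}")
for method, gain in gains.items():
    print(f"{method} gain over standard GP: {gain:.2f} points")
```

Under that assumption the recomputed values land close to the reported averages and ± values, and the gains agree with the 1.16 and 0.45 points quoted in the text to within the rounding of the table entries.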
In terms of the test results, the selective encapsulation approach obtained the highest overall accuracy, with a value of 80.54%, and obtained the best result on 3 data sets; however, this was only statistically significant on the Yeast data set. Selective encapsulation outperformed standard GP on 5 data sets and, furthermore, obtained the lowest standard deviation. The GP encapsulation method obtained the best result on 2 data sets and outperformed standard GP on 4 data sets; these differences were, however, not statistically significant. The GP encapsulation method obtained the lowest overall test accuracy. Standard GP obtained the best result on 7 test data sets and achieved an overall average of 80.51%. This method ranked second best but obtained the highest standard deviation.
Based on the test data, the three methods obtained similar performance; however, the selective encapsulation method obtained the highest average accuracy along with the lowest standard deviation.
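The counts of data sets on which each encapsulation method outperformed standard GP can be checked against Table 11.11 with a small tally. The sketch below is illustrative only: the names `TEST` and `wins_over_baseline` are hypothetical, and the significance flags are simply the “**” and “†” markers transcribed from the table, not the output of any statistical test run here.

```python
# Test accuracy (%) and significance markers transcribed from Table 11.11.
# Row layout: (standard GP,
#              GP encapsulation, marked "**"?,
#              selective encapsulation, marked "**"?)
TEST = {
    "Climate":      (90.78, 90.70, False, 90.78, False),
    "Ecoli":        (82.83, 82.48, False, 82.55, False),
    "Fertility":    (81.00, 79.60, True,  80.80, False),
    "Glass":        (63.17, 64.92, False, 64.17, False),
    "Ionosphere":   (90.93, 90.42, False, 90.32, False),
    "Iris":         (95.07, 94.67, False, 94.67, False),
    "Pima Indians": (73.62, 73.02, False, 72.78, True),
    "Sonar":        (73.00, 72.41, False, 72.41, False),
    "Spectf":       (76.58, 75.98, False, 77.11, False),
    "WDBC":         (93.99, 94.06, False, 94.16, False),
    "Wine":         (89.41, 90.94, False, 90.09, False),
    "Yeast":        (55.69, 56.12, False, 56.62, True),
}


def wins_over_baseline(table, value_idx, flag_idx):
    """List data sets where a method strictly beats standard GP.

    Returns (all wins, wins that are also marked statistically significant).
    """
    wins = [name for name, row in table.items() if row[value_idx] > row[0]]
    significant = [name for name in wins if table[name][flag_idx]]
    return wins, significant


gp_wins, gp_significant = wins_over_baseline(TEST, 1, 2)
sel_wins, sel_significant = wins_over_baseline(TEST, 3, 4)

print("GP encapsulation beats standard GP on:", gp_wins)
print("  of which significant:", gp_significant)
print("Selective encapsulation beats standard GP on:", sel_wins)
print("  of which significant:", sel_significant)
```

Ties, such as Climate for selective encapsulation, are not counted as wins because the comparison is strict.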
After each GP run, the number of encapsulated terminals present in the best GP tree was recorded. Table 11.12 presents the average number of encapsulated terminals found in the best GP individuals for each of the data sets. Selective encapsulation made use of the maintained list, whereas the GP encapsulation method did not. When the maintained list was not used, the average number of encapsulated terminals in the best GP tree was 48.46. There was a considerable reduction when the maintained list was used, with an average of 28.64 encapsulated terminals, which is 19.82 fewer. Furthermore, these results were statistically significant for every data set, thus confirming that the selective encapsulation method reduces the number of encapsulated terminals in the trees. Reducing the number of encapsulated terminals implies that the complexity of the tree is also reduced.
Data set        Without Maintained List   With Maintained List
Climate         45.74                     27.60 **
Ecoli           49.98                     30.84 **
Fertility       32.22                     17.88 **
Glass           52.04                     32.52 **
Ionosphere      45.56                     24.64 **
Iris            24.14                     11.92 **
Pima Indians    72.36                     41.18 **
Sonar           55.44                     35.68 **
Spectf          61.24                     32.50 **
WDBC            40.72                     23.18 **
Wine            28.12                     18.52 **
Yeast           73.98                     47.27 **
Average         48.46                     28.64
Table 11.12: Comparison of the number of encapsulated terminals present in the best GP individuals for the two encapsulation methods. All the results obtained by encapsulation with the maintained list were statistically significant compared to when the list was not used; this is denoted by “**”.
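To put the reduction in Table 11.12 into relative terms, the sketch below computes the absolute and relative drop in encapsulated terminals per data set and on average. The names `TERMINALS` and `reductions` are illustrative; the relative reduction is a derived quantity that is not reported in the original table.

```python
# Average number of encapsulated terminals in the best GP tree (Table 11.12).
# Tuple order: (without maintained list, with maintained list).
TERMINALS = {
    "Climate":      (45.74, 27.60),
    "Ecoli":        (49.98, 30.84),
    "Fertility":    (32.22, 17.88),
    "Glass":        (52.04, 32.52),
    "Ionosphere":   (45.56, 24.64),
    "Iris":         (24.14, 11.92),
    "Pima Indians": (72.36, 41.18),
    "Sonar":        (55.44, 35.68),
    "Spectf":       (61.24, 32.50),
    "WDBC":         (40.72, 23.18),
    "Wine":         (28.12, 18.52),
    "Yeast":        (73.98, 47.27),
}


def reductions(table):
    """Return {data set: (absolute reduction, relative reduction)}."""
    return {
        name: (without - with_list, (without - with_list) / without)
        for name, (without, with_list) in table.items()
    }


per_set = reductions(TERMINALS)
n = len(TERMINALS)
avg_without = sum(without for without, _ in TERMINALS.values()) / n
avg_with = sum(with_list for _, with_list in TERMINALS.values()) / n
# Relative reduction of the average terminal count.
avg_relative = (avg_without - avg_with) / avg_without

for name, (absolute, relative) in per_set.items():
    print(f"{name}: -{absolute:.2f} terminals ({relative:.1%})")
print(f"Average: {avg_without:.2f} -> {avg_with:.2f} ({avg_relative:.1%} fewer)")
```

On these figures the maintained list removes roughly 41% of the encapsulated terminals on average, and the count drops on every data set.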