
2.5 EVALUATING PERFORMANCE - AI VS. ENGINEERING

A good engineering design is often capable of executing a job to perfection, while a good cognitive science design will work towards making the agent a situated entity, embedded into the niche of the ecosystem.

FIGURE 2.24 ECOBOT III, made at the Bristol Robotics Laboratory; it is energetically autonomous and can digest organic matter to obtain its operational energy. Image courtesy wikipedia.org, CC-by-SA 3.0 license.


FIGURE 2.25 ARBIB, Autonomous Robot Based on Inspirations from Biology, shown here in the Zilog Z180 avatar. Image from Damper et al. [83], with permission from Elsevier.

FIGURE 2.26 Three realms of intelligence. (1) The confluence of animal and human intelligence is probably the most common, and is seen at work in domesticated animals and in harnessing animal power, such as riding a horse. (2) ANIMAT research has been an overlap of animal and artificial intelligence. (3) Human intelligence has worked hand-in-hand with artificial intelligence in chatbots, humanoid robots and in ethical agency, which is discussed in Chapter 8. (4) The overlap of all three realms has been targeted by neuromorphic and brain-based robotics. The diagram is adapted from Yampolskiy [370].

FIGURE 2.27 The Christmas present (Toby, 2 of 4). Dad brings Toby to the Walker household. Cartoon made as a part of the Toby series (2 of 4) by the author, CC-by-SA 4.0 license.

Pfeifer illustrates this point by considering the task of collecting ping-pong balls. An engineering solution would be a high-powered vacuum cleaner, while a cognitive science solution would be a mobile robot going around, picking up objects and matching them against the known image of a ping-pong ball. The engineering solution will be quicker, but it will also collect objects other than the desired one; the cognitive science solution will be slower, but it will collect only ping-pong balls. An evaluation criterion of time alone is therefore not enough to select the better of the two methods, since a vacuum cleaner would never show the adaptivity and flexibility that the mobile robot does. A similar comparison can be made for the task of mapping an unknown terrain, as shown in Figure 2.28. An engineering solution would invoke surveying methods, while a cognitive science solution would be Simultaneous Localisation and Mapping (SLAM), in which the robot goes around the given terrain a number of times and incrementally develops a map. The engineering approach would be fast but requires knowing a few angles and distances, while the cognitive science approach would be slower but has no such requirements. Pfeifer points out that the mere quickness of a particular method does not fully capture its merits.

For example, SLAM can be used for terrains which cannot be accessed by human beings, viz. planetary bodies and other hazardous and difficult terrains which cannot be addressed by surveying methods. Conversely, while surveying will work on land, water and air as long as some of the angles and distances can be obtained, it will be difficult for SLAM to have this flexibility. Thus, the basis of 'how quickly' a solution is achieved would not truly represent the privileges attached to the method.
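To make incremental map building concrete, the following minimal sketch (in Python) folds individual range readings into an occupancy grid. The grid size, the log-odds increments and the integrate_ray routine are illustrative assumptions; a full SLAM system would additionally estimate the robot's pose as it moves.

    import numpy as np

    # Minimal occupancy-grid update: the mapping half of SLAM, for illustration.
    # Pose estimation, scan matching and loop closure are deliberately omitted.
    GRID = 100                    # 100 x 100 cells
    CELL = 0.1                    # each cell covers 0.1 m x 0.1 m
    L_OCC, L_FREE = 0.85, -0.4    # assumed log-odds increments per observation

    log_odds = np.zeros((GRID, GRID))   # 0 means unknown (probability 0.5)

    def integrate_ray(x, y, theta, rng):
        """Fold one range reading into the map: cells along the beam are
        more likely free; the cell at the end is more likely occupied."""
        steps = int(rng / CELL)
        for k in range(steps + 1):
            cx = int((x + k * CELL * np.cos(theta)) / CELL)
            cy = int((y + k * CELL * np.sin(theta)) / CELL)
            if 0 <= cx < GRID and 0 <= cy < GRID:
                log_odds[cy, cx] += L_OCC if k == steps else L_FREE

    # Each pass over the terrain adds readings, so the map improves incrementally.
    integrate_ray(5.0, 5.0, 0.0, 2.3)             # e.g. one laser return 2.3 m ahead
    prob = 1.0 - 1.0 / (1.0 + np.exp(log_odds))   # recover occupancy probabilities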

Various routes have been taken by researchers to evaluate the performance of an autonomous AI agent; however, there is a lack of consensus on a particular approach. Since AI agents are situated, they are significantly different from conventional control systems manipulated using feedback. In conventional systems, performance is usually quantified by the average deviation of the control variables from their predicted values, the sensitivity of the controller to noise, the stability of the controller dynamics and the repeatability within the realms of acceptable error. In situated agents, however, the desired robot behaviour is obtained as an emergent property of the agent-environment interactions, so methods are needed which differ significantly from the conventional types.
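For reference, the conventional metrics are easy to make concrete. The minimal sketch below, with made-up signal values, computes the average deviation from predicted values and the run-to-run repeatability of a feedback-controlled variable; situated-agent evaluation, discussed next, admits no such ready formulae.

    import numpy as np

    def conventional_metrics(predicted, measured_runs):
        """Illustrative conventional evaluation: average deviation of the control
        variable from its predicted values, and run-to-run repeatability."""
        measured_runs = np.asarray(measured_runs)
        deviation = np.mean(np.abs(measured_runs - predicted))   # tracking error
        repeatability = np.mean(np.std(measured_runs, axis=0))   # spread across runs
        return deviation, repeatability

    # Hypothetical data: a setpoint of 1.0 tracked over 5 runs of 100 samples.
    predicted = np.ones(100)
    runs = predicted + 0.05 * np.random.randn(5, 100)
    dev, rep = conventional_metrics(predicted, runs)
    print(f"average deviation = {dev:.3f}, repeatability = {rep:.3f}")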

1. Extrapolating conventional ideas: make the robot repeat a given task a large number of times, and the percentage of success will quantify performance (a minimal sketch of this bookkeeping is given after this list). As an illustration, an automated robotic waiter can be evaluated on the number of times it can serve correctly without fumbling; an appreciably high percentage will confirm the consistency of performance. The obvious shortcoming is that this robotic waiter, when performing some task other than serving, will need another set of benchmarks to suit that task. A further demerit is that this sort of approach fails to cover all types of tasks, viz. unknown terrain, dynamic tasks, effective human-robot and robot-robot interactions and faulty hardware. However, these methods, due to their simplicity, remain favourites among researchers, and most research papers resort to such evaluations.

2. Correlate actual performance to simulation [163, 371]: though this paradigm is an oxymoron, as it is meant to relate a situated phenomenon to a non-situated simulated process, this approach remains another favourite in the research community (a correlation sketch is given after this list). Present-day software simulation methods are very sophisticated: they can mimic a real environment, usually have a physics engine and yield nearly real-time performance. However, simulation still has its pitfalls, as it fails to provide realistic physics for friction, magnetic interactions, wear and tear, fracture, effects of moisture, second-order effects etc. Further, it brings into play the benchmarking, performance, computational capacity and hardware aspects of the machine on which the simulation is run, viz. RAM, data rate, CPU power, clock etc. Also, as with the previous method, this approach fails to account for unknown terrain and dynamic tasks.

FIGURE 2.28 Engineering solution vs. cognitive science solution. To make a map of a given terrain, an engineering approach (shown above) would survey heights and distances using triangulation, while a cognitive science approach (shown below) would employ a mobile agent performing simultaneous localisation and mapping (SLAM) to make the map.
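For method (1), the bookkeeping is simple. A minimal sketch follows, assuming a hypothetical serve_once() routine standing in for one trial of the robotic waiter; the confidence interval guards against reading too much into a small number of trials.

    import math
    import random

    def success_rate(trials, run_trial):
        """Method (1): repeat the task `trials` times; report the success fraction
        and a 95% normal-approximation confidence interval."""
        successes = sum(1 for _ in range(trials) if run_trial())
        p = successes / trials
        half = 1.96 * math.sqrt(p * (1 - p) / trials)
        return p, (max(0.0, p - half), min(1.0, p + half))

    def serve_once():
        # Hypothetical stand-in for one serving attempt by the robotic waiter;
        # here it simply succeeds 90% of the time.
        return random.random() < 0.9

    rate, ci = success_rate(1000, serve_once)
    print(f"success: {100 * rate:.1f}% (95% CI {100 * ci[0]:.1f}%-{100 * ci[1]:.1f}%)")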
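For method (2), one plausible measure of agreement, not prescribed by the references above, is to correlate a performance metric across paired simulated and real trials; the sketch below uses made-up completion times.

    import numpy as np

    def sim_real_correlation(sim_scores, real_scores):
        """Method (2): Pearson correlation between a metric measured in simulation
        and the same metric measured on the physical robot, over paired trials."""
        return np.corrcoef(sim_scores, real_scores)[0, 1]

    # Hypothetical task-completion times (seconds) for six matched trial setups.
    sim  = np.array([12.1, 15.4, 11.0, 18.9, 14.2, 16.7])
    real = np.array([13.0, 16.9, 12.2, 21.5, 15.0, 18.1])
    print(f"sim-to-real correlation r = {sim_real_correlation(sim, real):.2f}")
    # r near 1.0 suggests the simulation is predictive of real performance.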

Evaluation is easier for social robots and for multiple-robot groups and swarms, as agency is tied to a job at hand, is strongly context related and is not arbitrary. Methods of evaluation therefore focus on the quality of human-robot interaction and of group behaviour, respectively.