
CHAPTER 8


8.2. Tasks for Disembodied Agents

8.2.1. Prediction

If an agent can predict its future observations (insofar as uncertainty allows), then it has learned a good model of the world. The prediction problem can be formulated in many variants, differing in the prediction horizon, the quality required of the prediction, and the way performance is measured. These three aspects are independent of one another; even for the simple problem of prediction, one can find more than a dozen variants, each requiring different skills from the agent.

Regarding the prediction horizon, it is useful to distinguish at least between three qualitatively different cases: “instantaneous”, “short” horizon, and “long” horizon:

“instantaneous” prediction, in the sense of predicting the derivative of the observations or a similar instantaneous quantity, is the easiest, and can usually be done using a reduced model of the dynamics, such as a linearized model;

“short” horizon prediction refers to predicting the future observations based on the current observations only, without accessing previous memory. Formally, this is the case where y_t is a sufficient statistic for predicting y_{t+Δ};

“long” horizon prediction requires accessing memories of previous observations.
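To make the distinction concrete, here is a minimal sketch (in Python, with hypothetical names; observations are assumed to be real-valued vectors sampled every dt seconds) of what a predictor at each horizon consumes:

```python
import numpy as np

def predict_instantaneous(y_t, y_prev, dt):
    """Instantaneous: extrapolate with a finite-difference derivative."""
    dy = (y_t - y_prev) / dt          # estimate of the observation derivative
    return y_t + dt * dy              # linearized one-step-ahead prediction

def predict_short(y_t, model):
    """Short horizon: y_t alone is a sufficient statistic."""
    return model(y_t)                 # learned map from y_t to y_{t+dt}

def predict_long(history, model):
    """Long horizon: the predictor needs memories of past observations."""
    return model(history)             # e.g., a recurrent model over y_0..y_t
```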

It is also useful to define relaxed prediction problems, for the cases where an approximate prediction is sufficient. The following are three levels on a smooth qualitative–quantitative spectrum:

Qualitative. The agent must predict which sensels will change in y_{t+Δ} compared with y_t.

Sign-prediction. The agent must predict which sensels will change, and whether their values increase or decrease. This assumes that it is possible to find an order for the sensel values (compare Assumption 2 (Larger (or smaller) values are more salient)).

Quantitative. The agent must predict the future observations y_{t+Δ} exactly.
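As an illustration of the three levels, the following sketch (hypothetical names; changes are detected sensel-wise up to a tolerance eps) scores a prediction at each level:

```python
import numpy as np

def qualitative_error(y_pred, y_true, y_t, eps=1e-6):
    """Fraction of sensels whose change/no-change status is mispredicted."""
    changed_pred = np.abs(y_pred - y_t) > eps
    changed_true = np.abs(y_true - y_t) > eps
    return np.mean(changed_pred != changed_true)

def sign_error(y_pred, y_true, y_t):
    """Fraction of sensels whose direction of change is mispredicted."""
    return np.mean(np.sign(y_pred - y_t) != np.sign(y_true - y_t))

def quantitative_error(y_pred, y_true):
    """Exact prediction error, here measured in the Euclidean norm."""
    return np.linalg.norm(y_pred - y_true)
```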

Finally, one must decide how to measure performance. Here the previous discussion on error functions is relevant (compare Section 5.6).

8.2.2. Skills built on top of prediction

If the agent is able to predict the next observations, then it is possible to implement more complicated tasks on top of this ability (Figure 8.1).

Define an anomaly as a mismatch between predictions and observations. Anomaly detection is an important skill for bootstrapping agents, as it makes the agent aware that its model of the world is incomplete.
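Under this definition, the anomaly signal is directly computable once a predictor is available. A minimal sketch, assuming vector-valued observations and hypothetical names:

```python
import numpy as np

def anomaly_signal(y_pred, y_obs):
    """Per-sensel anomaly: mismatch between prediction and observation."""
    return np.abs(y_pred - y_obs)

def is_anomalous(y_pred, y_obs, threshold):
    """Flag the observation as anomalous if the mean mismatch is large."""
    return anomaly_signal(y_pred, y_obs).mean() > threshold
```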

If the agent is embodied in a robotic body, most of the changes in the observations are due to self-motion, in a way which can be learned. Occasionally some of the changes are due to other agents moving in space, or other unmodeled effects in the dynamics.

By looking for coherent temporal traces of the anomaly signal, it is possible to detect another agent moving in the same environment.

If the same sensel appears to be consistently anomalous with respect to a learned model, it is likely that it is faulty. Fault detection can be realized by averaging the anomaly signal [64].
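One simple way to realize such averaging (a sketch under stated assumptions, not necessarily the scheme of [64]; all names are hypothetical) is an exponentially weighted running mean of the per-sensel anomaly:

```python
import numpy as np

class FaultDetector:
    """Flags sensels whose anomaly signal stays high over time."""

    def __init__(self, n_sensels, alpha=0.01, threshold=0.5):
        self.avg = np.zeros(n_sensels)   # running per-sensel anomaly average
        self.alpha = alpha               # forgetting factor
        self.threshold = threshold

    def update(self, anomaly):
        # Exponentially weighted moving average of the anomaly signal.
        self.avg = (1 - self.alpha) * self.avg + self.alpha * anomaly
        return self.avg > self.threshold  # True for likely-faulty sensels
```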


[Figure 8.1 (diagram). A hierarchy of tasks for bootstrapping agents. The tasks shown are: prediction (predict the next observations); anomaly detection (compare the predictions with the observations); agents detection (find coherent anomaly tracks); chasing (maximize the anomaly signal); escaping (minimize the anomaly signal); memory (cluster/compress the observations); servoing (move towards desired observations); localization (recognize the current place from memory); navigation (move across the environment); mapping (establish relationships among different places); exploration (experience all possible stimuli).]

8.3. Tasks for Embodied Agents

Several specific tasks can be defined for agents embodied in a robotic body. To define these tasks, it is assumed that the observations represent the output of a sensor attached to the robot.

The concept of “place” can be defined directly in the observation space, as a neighborhood of a given observation vector ȳ:

Neighbors(ȳ) = { y ∈ Y | d_Y(y, ȳ) < α }.

This is a generalization of the idea of working in “image space” in applications such as visual servoing. However, care must be taken in the way one defines the metric space, because committing to a certain distance d_Y might carry over certain semantic assumptions. An alternative definition of “place” is given by Kuipers and colleagues as part of the spatial semantic hierarchy [65, 66]: if one already has a policy, then a “place” can be defined as a subset of the states in which the policy has a predictable result.
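The first definition translates directly into code. A minimal sketch, assuming a Euclidean metric for d_Y (which, as just noted, is itself a semantic commitment):

```python
import numpy as np

def d_Y(y1, y2):
    """Distance on the observation space; Euclidean chosen for illustration."""
    return np.linalg.norm(y1 - y2)

def in_place(y, y_ref, alpha):
    """Membership test for Neighbors(y_ref) = {y : d_Y(y, y_ref) < alpha}."""
    return d_Y(y, y_ref) < alpha
```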


8.3.1. Servoing

Going to a designated place, described by some observation vector y̌, is a task that will be used often in Part 2. The difficulty of the task essentially depends on the distance between the current state and the goal state. Just as for prediction, it is useful to distinguish between “short” and “long” horizon instances of the problem.
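A crude greedy sketch of short-horizon servoing (hypothetical interfaces: a one-step predictive model and a finite set of candidate commands), choosing at each step the command whose predicted outcome is closest to the goal observation y̌:

```python
def servo_step(y_t, y_goal, model, commands, d_Y):
    """Greedy servoing: pick the command whose predicted next observation
    is closest to the goal observation y_goal."""
    return min(commands, key=lambda u: d_Y(model(y_t, u), y_goal))
```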

8.3.2. Chasing and escaping

Several tasks can be defined on top of servoing. If the agent can detect moving objects, then one can define chasing or escaping as servoing of the detection signal: to escape a target, the detection signal should be minimized (i.e., the agent gets farther), while to chase a target, it should be maximized (i.e., the agent gets closer).
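In the same greedy style (a sketch reusing the hypothetical interfaces above), chasing and escaping differ only in the sign of the objective:

```python
def chase_or_escape_step(y_t, model, commands, detection, chase=True):
    """Chasing maximizes the predicted detection signal; escaping minimizes it.
    `detection` maps an observation to a scalar (e.g., averaged anomaly)."""
    sign = -1.0 if chase else 1.0     # minimizing the negated value maximizes
    return min(commands, key=lambda u: sign * detection(model(y_t, u)))
```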

8.3.3. Localization and mapping

Other spatial abilities can be defined on top of the concept of place and basic tasks such as servoing. Localization (which place is this?) and metrical/topological mapping (which sequence of actions brings the agent from one place to another?) based on minimal semantics have been demonstrated by Milford [67] in a biologically plausible setting.
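A minimal nearest-neighbor localization sketch on top of the place definition above (hypothetical names; this is not Milford's method [67]):

```python
def localize(y, places, d_Y):
    """Return the index of the stored place observation closest to y."""
    return min(range(len(places)), key=lambda i: d_Y(y, places[i]))
```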

All these tasks form a hierarchy that goes from basic skills to complex behaviors (Fig- ure 8.1).

Part 2

Learning Models of Robotic Sensorimotor Cascades