7.1 Introduction

7.1.3 Background on Domain Independent Planning

The inference engine fuses information from all the different sensors – cameras, laser range finder, tactile, force/torque sensors – to infer the state of the world. It utilizes numerous specialized algorithms such as computer vision, Kalman filtering, etc., to perform map building and self-localization [85], pose estimation [93], and object segmentation and identification [139]. It can also access the state of the controller; e.g., a force-closure criterion can indicate the grasp status of objects [105]. To estimate the status of the world, it may additionally run a predictive engine over the world model and use Bayesian algorithms to obtain good estimates along with an idea of its own estimation error [84].
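As an illustration of the kind of Bayesian fusion mentioned above, the following is a minimal sketch (not the actual inference engine) of a one-dimensional Kalman filter step that fuses a noisy range measurement with a motion prediction; all numbers are invented for the example.

```python
def kalman_step(mean, var, motion, motion_var, meas, meas_var):
    """One predict-update cycle for a scalar state estimate."""
    # Predict: propagate the estimate through an additive motion model.
    mean, var = mean + motion, var + motion_var
    # Update: fuse the measurement, weighted by the Kalman gain.
    gain = var / (var + meas_var)
    mean = mean + gain * (meas - mean)
    var = (1.0 - gain) * var
    return mean, var

# Start fully uncertain at 0, move forward 1.0, then observe 1.2.
mean, var = kalman_step(0.0, 1.0, motion=1.0, motion_var=0.5,
                        meas=1.2, meas_var=0.4)
```

Note that the posterior variance is smaller than both the predicted and the measurement variance: fusing the two sources always sharpens the estimate, which is exactly the "idea of its own estimation error" the engine maintains.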

While the world itself can be quite complex, the planner works from a model of the world. The model comprises a state transition system with a function γ that determines how the state evolves.

It is worth reiterating the difference between the sequencer and the high level planner in Figure 7.2. Planning is significantly harder when it is a hybrid problem, i.e., when the planner must simultaneously search over sequences of discrete actions and also search for continuous paths within actions, especially since the continuous controllers affect the outcome at the discrete level as well.

Although there is now a large body of work on controlling hybrid systems, and a recent push towards integrating the discrete planning problem with continuous controllers [78], these systems are usually confined to low state space dimensions. For manipulation robots, the configuration space [39] can unfortunately be very high dimensional [70], as the number of movable objects increases.

This thesis focuses exclusively on planning over finite discrete choices. The problems arising from continuous low-level controllers are assumed to be captured by introducing uncertainty into the model. This is in addition to the uncertainty due to actual actuator noise and unexpected world behavior. Moreover, partial observability at the highest task- or action-planning level captures the fact that the inference engine is usually imperfect. During dexterous tasks, manipulated objects may be misclassified, and it may be difficult to detect completion of some tasks, such as the successful disassembly of small parts from a fixture. POMDPs offer a powerful way to model such robotic tasks: they are a general and expressive model class and are now studied extensively in the field of domain independent planning [120]. Therefore, a brief historical review of domain independent planning is provided in the next section.
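The partially observable setting described above can be made concrete with the Bayesian belief update at the heart of any POMDP. The sketch below is illustrative only – the states, action, and probabilities are invented for the example – and models the difficulty of detecting whether a small part has been detached from a fixture.

```python
# States: whether the part is still attached to the fixture.
states = ["attached", "detached"]

# T[s][s2] = P(s2 | s, a) for a single "pry" action: prying detaches the
# part with probability 0.8 and otherwise leaves the state unchanged.
T = {"attached": {"attached": 0.2, "detached": 0.8},
     "detached": {"attached": 0.0, "detached": 1.0}}

# O[s2][o] = P(o | s2): the sensor reports "loose"/"stuck" imperfectly,
# modeling the imperfect inference engine discussed in the text.
O = {"attached": {"loose": 0.1, "stuck": 0.9},
     "detached": {"loose": 0.7, "stuck": 0.3}}

def belief_update(b, obs):
    """b'(s2) proportional to O(obs|s2) * sum_s T(s2|s) * b(s)."""
    b_new = {}
    for s2 in states:
        pred = sum(T[s][s2] * b[s] for s in states)
        b_new[s2] = O[s2][obs] * pred
    z = sum(b_new.values())          # normalizing constant
    return {s: p / z for s, p in b_new.items()}

b = {"attached": 1.0, "detached": 0.0}
b = belief_update(b, "loose")   # belief after prying once and seeing "loose"
```

Even after a "loose" reading, some probability mass remains on "attached" – precisely the residual uncertainty about task completion that motivates planning in belief space.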

Figure 7.2: A Hybrid Robot Architecture. (Diagram components: Planner, Sequencer, Controller, Actuators, Sensors, World, Inference, World Model; labeled signals: Plan, Sequencer Status, Actions, Behaviors, Behavior Status, Observations.)

properties of the object; e.g., grasped_left_hand(tool) says that some object represented by tool is grasped in the left hand. When the variables appearing in an atomic proposition are all instantiated by object or constant literals, the proposition is called grounded. The opposite of grounding is called lifting. For example, the proposition grasped_left_hand(tool) describes the state of an abstract entity tool. Similarly, the operator GRASP(left_hand, tool) may define the effect of the grasping action on the abstract entity. Lifting allows a compact representation of the world and its dynamics, independent of the instantiation of abstract entities by actual objects, which may differ situationally.

Let the set of grounded atomic propositions be given by L. Then the state space of the world model is given by all possible truth assignments, 2^L. Next, there is a finite set of operators, each with two main components:

1. Preconditions: A list of atomic propositions that must be true or false for the operator to be applicable.

2. Effects: A list of atomic propositions that become true or false as a result of applying the operator. There has historically been ambiguity about what happens to propositions not listed in the effects; most implementations assume that the remaining propositions are unchanged. In newer representations such as PDDL [99] these effects may be conditional.

An example of an operator is place(tool, table):

preconditions: grasped(tool)

effects: grasped(tool) = False, on(table, tool) = True.

The exact format, syntax, and semantics vary across representations [44, 47, 110]. This operator is also lifted, i.e., it can be used for any instantiation of tool. When all propositions appearing in an operator are grounded, the operator is termed grounded and is called an action.
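The semantics of applying a grounded action like the one above can be sketched with states represented as sets of true propositions, under the standard assumption that propositions not mentioned in the effects are unchanged (this is an illustrative sketch, not any particular planner's implementation).

```python
def applicable(state, pre_true, pre_false):
    """An action applies if its positive preconditions hold and its
    negative preconditions do not."""
    return pre_true <= state and not (pre_false & state)

def apply_op(state, add, delete):
    """Effects: delete the propositions made false, add those made true;
    everything else stays as it was (the frame assumption)."""
    return (state - delete) | add

# Grounded place(tool, table): requires grasped(tool); makes it false
# and makes on(table, tool) true.
state = {"grasped(tool)"}
if applicable(state, pre_true={"grasped(tool)"}, pre_false=set()):
    state = apply_op(state, add={"on(table, tool)"},
                     delete={"grasped(tool)"})
```

After application the state contains only on(table, tool): grasped(tool) was deleted, and no other propositions were touched.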

In classical domain independent planning, the goals are reachability goals: the planner simply needs to find a path (a sequence of actions, and hence a sequence of intermediate points in the state space) that takes the initial state of the world to a given final state. The goal is also specified as a list of atomic propositions that must be true or false, and any state that agrees with the goal is a valid end state. Classical problems do not deal with finding "optimal" paths.
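Since any state agreeing with the goal is acceptable and no optimality criterion is imposed, even an uninformed search suffices in principle. The following minimal breadth-first search over grounded actions (a sketch, using the toy place/grasp domain from the running example) illustrates reachability planning:

```python
from collections import deque

def bfs_plan(init, goal, actions):
    """actions: list of (name, preconds, add, delete) over frozensets.
    Returns any action sequence reaching a state where the goal holds."""
    frontier = deque([(frozenset(init), [])])
    seen = {frozenset(init)}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:          # any state agreeing with the goal is valid
            return plan
        for name, pre, add, dele in actions:
            if pre <= state:       # preconditions satisfied
                nxt = (state - dele) | add
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [name]))
    return None                    # goal unreachable

acts = [("grasp(tool)", frozenset(), frozenset({"grasped(tool)"}),
         frozenset()),
        ("place(tool, table)", frozenset({"grasped(tool)"}),
         frozenset({"on(table, tool)"}), frozenset({"grasped(tool)"}))]
plan = bfs_plan(set(), frozenset({"on(table, tool)"}), acts)
```

The `seen` set over truth assignments is what makes the search finite; its worst-case size is exactly the 2^L state space mentioned earlier, which is why domain independent planners rely on heuristics rather than blind search in practice.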

Since the first formalization of planning problems, several new problem domains have been added.

For example, probabilistic effects, as in Markov decision processes, and partial observability, the topic of this thesis, are now important subjects in the AI planning and robotics communities [107, 120]. Another addition is that of durative actions, in which time is explicitly represented.

Planning under concurrency is another important area. The goals of planning problems have also become more complex: optimality criteria based on rewards/costs, constraint satisfaction problems, and temporal goals are all now studied under the umbrella of planning.

In order to incorporate these new classes of problems, as defined by their mathematical models and goal criteria, the representation of planning problems has also evolved. Over the past decade, the AI planning community has been standardizing the representation of increasingly rich classes of planning problems in a consistent fashion using the Planning Domain Definition Language (PDDL) [99] and, more recently, the Relational Dynamic Influence Diagram Language (RDDL) [129], which incorporates lifted representations of probabilistic concurrent systems.

This thesis is concerned with probabilistic domains: specifically Partially Observable Markov Decision Processes (POMDPs), powerful models that allow uncertain disturbances to be incorporated into the model. They are widely popular, and despite their computational complexity, solvers for the optimal POMDP policy problem have been shown to handle increasingly large state spaces in seconds [79]. In fact, infinite state space models have also been introduced [41].

In large engineering applications, such as a concurrent city-wide traffic management system, several thousand actions can be taken simultaneously. It quickly becomes intractable even to represent all possible concurrent actions if they are explicitly enumerated. Similarly, many action effects depend only on a subset of the state variables, as do the rewards or costs. For such domains a compact representation is crucial if they are to be applied in reality. This has led to an important line of work in (PO)MDP planning where the domain is factored [20, 52]. In factored representations, the transition probabilities are conditioned on the valuations of a subset of the state variables. To capture these new developments in stochastic planning, there has recently been a push to standardize the representation of problems of this nature using RDDL [129]. RDDL uses Dynamic Bayesian Networks (DBNs), Algebraic Decision Diagrams (ADDs), and lifted representations, which allow compact descriptions of large domains. DBNs represent how variables at one time step affect each other at the next time step [102]. ADDs are a generalization of Binary Decision Diagrams (BDDs) that allow efficient representation of functions and implementation of algorithms such as matrix multiplication, Gaussian elimination, etc. [7].
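The factoring idea can be sketched as follows: instead of one joint transition table over all states, the probability P(s'|s,a) is a product of small per-variable conditional probability tables, each depending only on its parent variables, as in a DBN. The variables and probabilities below are invented for illustration.

```python
def transition_prob(state, action, next_state):
    """Factored P(next_state | state, action) as a product of per-variable
    CPTs. Each factor reads only the variables it depends on."""
    p = 1.0
    # CPT for light': depends only on (light, action), never on door.
    if action == "toggle_light":
        p *= 0.95 if next_state["light"] != state["light"] else 0.05
    else:
        p *= 1.0 if next_state["light"] == state["light"] else 0.0
    # CPT for door': depends only on door (it persists unchanged).
    p *= 1.0 if next_state["door"] == state["door"] else 0.0
    return p

p = transition_prob({"light": False, "door": True}, "toggle_light",
                    {"light": True, "door": True})
```

With n boolean variables the flat transition matrix has 2^n x 2^n entries per action, while the factored form stores one small table per variable; this is the compactness that RDDL's DBN-based representation exploits.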

7.2 The DARPA Autonomous Robotic Manipulation Soft-