2.12 Summary
2.12.1 Limitations and future work
The analysis performed and the method proposed are a step in the right direction, in the sense that, although not always precise, the method provides information about slip at a distance, rather than driving blind, and is capable of alerting the rover to potentially dangerous areas. Thus it will be a valuable tool to have.
The proposed method of slip prediction from visual information does not undermine the merit of mechanical modeling of terrain. Instead, it exploits sensory information, e.g., vision, which is unavailable or not yet utilized by mechanical models, and can thus be complementary to them. For example, the proposed slip prediction is limited by its macro-scale level of processing, i.e., it assumes a uniform terrain type and a constant slope under the rover footprints. A more detailed mechanical model [65] can consider smaller scale events, such as terrain undulations under the rover footprints, the presence of rocks, or the presence of different terrain types under different rover wheels, and can provide more precise estimates of slip in these cases. The latter, however, cannot be applied at a distant location the rover has not yet traversed. An important area of research would be how to combine these two different levels of information (visual input and more detailed mechanical modeling).
Some extensions of the algorithm include deriving confidence intervals for the terrain classification algorithm and, respectively, for the final slip prediction, so that a planner can also take advantage of the certainty of the predictions. This can be achieved by formulating the terrain classifier in a probabilistic framework [122].
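As a rough illustration (not part of the proposed system), the following Python sketch shows how a probabilistic terrain classifier could expose such certainty to a planner: per-class posterior probabilities are computed for a patch under assumed diagonal-Gaussian class models, and the posterior of the winning class is returned as a confidence value. All names and the suggested threshold are hypothetical.

import numpy as np

def classify_with_confidence(x, means, variances, priors):
    # Illustrative sketch: posterior class probabilities for one terrain
    # patch feature vector x (length D) under K diagonal-Gaussian class
    # models (means and variances are K x D; priors has length K).
    log_lik = -0.5 * (((x - means) ** 2 / variances).sum(axis=1)
                      + np.log(variances).sum(axis=1))
    log_post = log_lik + np.log(priors)
    log_post -= log_post.max()          # for numerical stability
    post = np.exp(log_post)
    post /= post.sum()                  # normalize to posterior probabilities
    k = int(np.argmax(post))
    return k, float(post[k])            # predicted class and its confidence

A planner could, for example, treat map cells whose confidence falls below a chosen threshold (say 0.6, hypothetical) as uncertain and plan around them.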
Furthermore, visual information might not be sufficient to distinguish various terrain types and properties, especially considering Mars terrains. It can be complemented with multispectral imaging or other sensors to resolve some inherent visual ambiguities and improve the classification results. A more advanced algorithm which imposes spatial and temporal continuity constraints on the classification over neighboring patches, or which depends on the terrain geometry, is another important extension of the algorithm.
The terrain classification algorithm itself is also limited in its scope. That is, the proposed computational method looks at small terrain areas and classifies them based on similarities to other observed patches. Since there is a lot of variability or ambiguity in the appearance, looking beyond a single map cell at a larger-scale context can provide more insight into the soil type. For example, an area which is part of a dune-like or ripple-like formation is very likely to contain finer material, which has less friction and is therefore likely to cause more slippage. To be able to do that, a more advanced understanding of the scene and more intelligent inference are needed.
Another orthogonal future direction for the terrain classifier is improving the efficiency of the processing. This is important because the algorithm needs to run in real time onboard a rover. A method which utilizes the available features more efficiently and also takes advantage of misclassification costs to speed up the classification is proposed in Chapter 5.
Finally, the most important future direction for autonomous robots is to fully automate the learning so that the robot can achieve maximum autonomy. This is important because in planetary exploration there is a significant communication delay, and there is not sufficient time or resources to downlink images and label them manually. In Chapters 3 and 4 we propose learning algorithms which can use automatically obtained signals as supervision and do not require manual labeling.
Chapter 3
Learning from automatic supervision
In this chapter we consider the problem of learning rover slip from visual information in a fully automatic fashion, using the robot’s onboard sensors as supervision. In particular, we use the robot’s slip measurements as supervision to the vision-based learning (Figure 3.1). The intuition is that two visually similar terrains, which might not normally be discriminated in the visual space, will be discriminated after introducing the supervision. The motivation for using the slip measurements as supervision, instead of manual labeling of terrain types, is that they are obtained automatically and effortlessly by the robot’s sensors.
A probabilistic framework for learning from automatic supervision is proposed [8].
In this framework, visual information and the mechanical supervision interact to learn to recognize terrain types. The method is designed to be able to work with supervision signals which are ambiguous or noisy, as is the case with rover slip.
We have tested the algorithm on data collected in the field by the LAGR robot while driving on soil, gravel, and asphalt. Our experiments show that using mechanical measurements as automatic supervision significantly improves upon vision-based classification alone and approaches the results of learning with manual supervision.
This work enables the rover not only to recognize terrain types and predict their slip characteristics, but also to learn autonomously about terrains with different mobility limitations, or slippage, using only its onboard sensors.
3.1 Introduction
In Chapter 2 we proposed a method for learning and prediction of rover slippage, in which the learning of terrain classification and of slip models is done independently in an offline fashion, using human supervision to provide the ground truth for the terrain type. In this chapter we remove the requirement for human supervision, using automatically obtained onboard sensor signals as supervision instead.
Although most navigation systems are targeted towards full vehicle autonomy, they rely mainly on offline training and use heuristics or human supervision to determine the traversability properties of a terrain type [77]. However, the ultimate goal in autonomous navigation is to have a robot which is able to learn autonomously about different terrains and its mobility restrictions on them. For example, it is not practical to stop the exploration of a planetary rover in order to downlink and label training data. Moreover, providing ground truth manually is prohibitive because of the huge volume of data available, and because using expert knowledge is expensive and might be unreliable. For example, a human operator might not have the best knowledge about soil characteristics and their influence on rover mobility. To automate the training process we use the vehicle’s low-level mechanical sensors, which measure its slip behavior, to provide supervision for the learning of terrain type and its mobility characteristics from visual information.
We consider the case in which the supervision is based on slip measurements taken by the robot, which can be noisy and ambiguous. To solve the problem in this setup, we propose a probabilistic framework in which the slip supervision provided by the robot is used to help learn the classification of terrain types in the visual space automatically. We call this scenario learning from automatic supervision. We show that learning with this weaker form of supervision is more useful than ignoring the supervision and that it can bridge the gap to the performance achieved by manual supervision.
Figure 3.1: An autonomous vehicle uses stereo vision to obtain information about terrain ahead (top left). Measurements about the mechanical vehicle-terrain interaction, or slip, can be collected at different traversed locations by the vehicle’s onboard sensors (top right and bottom left). These measurements can be used as supervision to infer that the terrain types are different (bottom right) and to also learn their inherent slip properties for the purposes of future slip prediction.

In the proposed probabilistic framework the visual and the mechanical sensory information interact, classification in the input space is learned, and the parameters of both the terrain classification and the mechanical slip behavior are estimated. The problem is formulated as a maximum likelihood estimation in which the Expectation Maximization (EM) algorithm [32] is used to learn the unknown parameters.
The method is an extension of unsupervised clustering, e.g., Mixture of Gaussians (MoG) [24], but allows for noisy or ambiguous mechanical measurements to act as supervision to the visual information. This is, to the best of our knowledge, the first work which considers the problem of learning about terrain types when the automatic supervision is in fact a sensory measurement from the vehicle, which can be ambiguous and noisy.
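To make this concrete, the following Python sketch shows the kind of EM procedure described above: a Gaussian mixture over visual features in which each training point also carries a scalar slip measurement, and both the visual and the slip likelihoods enter the E-step responsibilities. It is an illustrative simplification under assumed diagonal covariances and a per-class Gaussian slip model, not the exact model of [8]; all names are ours.

import numpy as np

def em_slip_supervised_mog(X, s, K, n_iters=50, seed=0):
    # Illustrative EM sketch: Gaussian mixture over visual features X (N x D)
    # where per-point slip measurements s (length N) act as noisy automatic
    # supervision. Each latent terrain class k has a diagonal-covariance
    # visual Gaussian (mu[k], var[k]) and a scalar slip model (m[k], s2[k]).
    rng = np.random.default_rng(seed)
    N, D = X.shape
    pi = np.full(K, 1.0 / K)                      # class priors
    mu = X[rng.choice(N, K, replace=False)]       # visual means (K x D)
    var = np.tile(X.var(axis=0), (K, 1)) + 1e-6   # visual variances (K x D)
    m = np.linspace(s.min(), s.max(), K)          # slip means per class
    s2 = np.full(K, s.var() + 1e-6)               # slip variances per class

    for _ in range(n_iters):
        # E-step: responsibilities combine visual and slip log-likelihoods
        log_r = np.log(pi)[None, :]
        log_r = log_r - 0.5 * ((((X[:, None, :] - mu[None]) ** 2) / var[None]).sum(-1)
                               + np.log(var).sum(-1)[None, :])
        log_r = log_r - 0.5 * (((s[:, None] - m[None, :]) ** 2) / s2[None, :]
                               + np.log(s2)[None, :])
        log_r = log_r - log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: responsibility-weighted updates of all parameters
        Nk = r.sum(axis=0) + 1e-9
        pi = Nk / N
        mu = (r.T @ X) / Nk[:, None]
        var = (r.T @ X ** 2) / Nk[:, None] - mu ** 2 + 1e-6
        m = (r.T @ s) / Nk
        s2 = (r.T @ s ** 2) / Nk - m ** 2 + 1e-6

    return pi, mu, var, m, s2, r

Because the slip term enters the responsibilities, points with different slip readings tend to be pulled into different classes even when their visual features overlap, which is the effect the automatic supervision is intended to have.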
The proposed approach is also applied to remote prediction of potential rover slip on forthcoming terrain. The significance of the approach is that fully automatic learning and recognition of terrain types can be performed without using human supervision for data labeling.