
2.9 Learning slip behavior on a fixed terrain

2.9.1 Learning algorithm

We consider the problem of learning slip behavior as a nonlinear function approximation. That is, the slip $Z_T(y)$, i.e., $f_T(Z \mid G = y)$, is approximated by a nonlinear function of the terrain geometry $G$. Previous experimental evidence shows that slip behavior is a highly nonlinear function of terrain slopes [81]. To model this highly nonlinear dependence, we use a type of Receptive Field Regression algorithm [103, 124].

The main idea is to split the input domain into subregions, called receptive fields, and apply locally linear fits to the data to approximate a globally nonlinear function.

While there are many algorithms which can be applied to this learning task, such as Neural Networks, Support Vector Regression, etc., our choice is mainly motivated by our goal of eventually allowing fast online updates. The Receptive Field Regression approach gives a good tradeoff between memory-based nonlinear regression methods [51] and global function approximation methods, such as Neural Networks.

Slip $Z$ (we have dropped the subscript $T$ for simplicity) can be written in the following form:

$$
\hat{Z}(y) \;=\; \sum_{c=1}^{C} K(y, y_c)\Bigl(b_{c0} + \sum_{r=1}^{R} b_{cr}\langle d_{cr}, y\rangle\Bigr), \qquad (2.4)
$$

where $y$ are the input slopes, $y = (y_{\mathrm{pitch}}, y_{\mathrm{roll}})$, $K(y, y_c)$ is a weighting function, or kernel, $K(u, v) = \exp(-\|u - v\|^2/\lambda)$, $y_c$ is a training example which serves as a receptive field center, $d_{cr}$ are several local projections in each receptive field $c$, $b_{cr}$ are the corresponding regression coefficients, $R$ is the number of linear projections (here $R \le 2$), and $\lambda$ is a parameter which controls the receptive field size ($\lambda > 0$). In other words, the slip $Z$ corresponding to a query point $y$ is computed as a linear combination of $C$ linear functions (one per receptive field), where the weights are computed according to the distance from $y$ to the centers of the receptive fields. As the weighting functions $K(y, y_c)$ depend on the distance from the query example $y$ to the receptive field centers $y_c$, the final function approximation is nonlinear.
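
To make the form of (2.4) concrete, the following minimal NumPy sketch evaluates the prediction at a single query point. The array names and storage layout (centers, b0, B, D) are our own choices, and the sketch computes the plain kernel-weighted sum exactly as written in (2.4); many LWPR-style implementations additionally normalize by the sum of the kernel weights.

```python
import numpy as np

def predict_slip(y, centers, b0, B, D, lam):
    """Evaluate Eq. (2.4) at a query point y = (pitch, roll).

    centers : (C, 2) receptive field centers y_c
    b0      : (C,)   offsets b_c0
    B       : (C, R) regression coefficients b_cr
    D       : (C, R, 2) local projection directions d_cr
    lam     : receptive field size parameter (lambda > 0)
    """
    # Kernel weights K(y, y_c) = exp(-||y - y_c||^2 / lambda)
    w = np.exp(-np.sum((centers - y) ** 2, axis=1) / lam)
    # Local linear model of each field: b_c0 + sum_r b_cr * <d_cr, y>
    local = b0 + np.einsum('cr,crk,k->c', B, D, y)
    # Kernel-weighted combination of the C local predictions
    return float(np.sum(w * local))

# Illustrative call: one receptive field centered at (10, 0) degrees, R = 1
z_hat = predict_slip(np.array([12.0, 1.0]),
                     centers=np.array([[10.0, 0.0]]),
                     b0=np.array([0.1]),
                     B=np.array([[0.02]]),
                     D=np.array([[[1.0, 0.0]]]),
                     lam=25.0)
```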

Now, given the training data $D_y = \{y_i, z_i\}_{i=1}^{N}$, where the vectors $y_i$ contain the estimated slopes from range imagery, $z_i$ are the corresponding measurements of slip at this particular location, and $N$ is the number of training examples, the learning procedure's task is to estimate the unknown parameters so that they fit the training data $D_y$ well. The parameters to be learned are the receptive field centers $y_c$, $1 \le c \le C$; the linear regression parameters $b_{c0}$, $b_{cr}$, $d_{cr}$, $1 \le r \le R$, $1 \le c \le C$; and the parameter $\lambda$, which determines the receptive field size.

For a given $\lambda$, the receptive fields are distributed to cover the input space so that all training data belong to at least one receptive field. This is done by allocating a new receptive field in the input space whenever an incoming training example is not covered by the existing receptive fields, setting the center $y_c$ to be the new example [103]. To estimate the parameters $b_{cr}$, $d_{cr}$ in each receptive field, a Partial Least Squares (PLS) linear fit [129, 51] is performed, in which the training points are weighted according to their distance to the receptive field center [124]. In our case of only 2-dimensional inputs, one could also use Weighted Linear Regression [103] or some other locally linear projection. However, by using PLS, the algorithm can be easily extended to work with higher-dimensional inputs, because of the dimensionality reduction capabilities of PLS [124]. As our method uses PLS regression, it is closest to the Locally Weighted Projection Regression (LWPR) method of [124]. We parameterize the receptive field size by a single parameter $\lambda$ (which implies symmetric kernels).
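
A small sketch of this allocation rule, under the same assumptions as above: a new receptive field is created whenever an example's strongest kernel activation over the existing centers falls below a threshold. The threshold name w_gen and its value are our assumptions, not taken from the text.

```python
import numpy as np

def allocate_receptive_fields(Y, lam, w_gen=0.2):
    """Cover the training inputs Y (N x 2 slopes) with receptive fields.

    A new field, centered at the incoming example, is created whenever the
    example activates no existing field above w_gen (assumed threshold).
    Returns the (C, 2) array of centers y_c.
    """
    centers = []
    for y in Y:
        if centers:
            act = np.exp(-np.sum((np.asarray(centers) - y) ** 2, axis=1) / lam)
            if act.max() >= w_gen:
                continue  # example is already covered by an existing field
        centers.append(np.asarray(y, dtype=float))  # new center y_c
    return np.asarray(centers)
```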

More advanced structured kernels could be applied as in [124], but they introduce additional parameters to be learned, which would require a larger sample size. We select the parameter $\lambda$ using a validation set, in order to avoid overfitting.7 For example, the best selected $\lambda$ for the soil dataset, collected by the LAGR robot, renders a kernel of local activity within about 4° in pitch and roll angles.

7 The purpose of the validation set is to test the generalization of a learned model independently of its training and to select the best possible model for the data. For example, in this particular case, selecting an infinitely small receptive field size would allow for one receptive field per training example and therefore a perfect approximation of the function on the training set. However, this would result in very poor performance on examples outside of the training data.
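
The selection of $\lambda$ can be carried out as a simple grid search scored on the held-out validation set. The sketch below is generic: fit_fn and predict_fn are hypothetical placeholders for the training and prediction routines of the regression model, and validation RMSE is used as the selection criterion (an assumption on our part; the text only states that a validation set is used).

```python
import numpy as np

def select_lambda(fit_fn, predict_fn, Y_tr, z_tr, Y_val, z_val, lambdas):
    """Pick the receptive field size lambda by validation error.

    fit_fn(Y, z, lam) -> model   and   predict_fn(model, y) -> slip
    are placeholders for the learner sketched in this section.
    """
    best_lam, best_err = None, np.inf
    for lam in lambdas:
        model = fit_fn(Y_tr, z_tr, lam)
        pred = np.array([predict_fn(model, y) for y in Y_val])
        err = np.sqrt(np.mean((pred - z_val) ** 2))  # validation RMSE
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam
```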

An important aspect of this algorithm is that, when a new example arrives during training, only the parameters of the receptive fields in the vicinity of this example need to be re-evaluated. This allows for fast updates in online learning. It is the result of constructing the final cost function in such a way that competition among receptive fields is promoted, i.e., a receptive field is encouraged to fully approximate the required value of the function rather than splitting the responsibility among many receptive fields [64]. The cost function, implicit in Receptive Field Regression algorithms, is the following:

$$
\min_{b_{c0},\, b_{cr},\, d_{cr}} \;\sum_{c=1}^{C}\sum_{i=1}^{N} K(y_i, y_c)\Bigl(z_i - \bigl(b_{c0} + \sum_{r=1}^{R} b_{cr}\langle d_{cr}, y_i\rangle\bigr)\Bigr)^{2}. \qquad (2.5)
$$

In other words, the function is required to approximate well the observed output of an arbitrary data point by an individual receptive field, rather than approximating the data point output using multiple receptive fields. As a result of optimizing this cost function, the updates to the parameters of one receptive field are done independently of the parameters of the other receptive fields. This is an important point, because when new data arrive in a subregion of the input domain, only a subset of the parameters of the function needs to be adjusted, rather than re-evaluating all the parameters of the function, as is done with Neural Networks, for example. This property also prevents the 'catastrophic forgetting' typical of global approximation methods, and the algorithm does not need to store training examples in memory.
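
To illustrate the independence of the per-field fits, the sketch below solves the inner part of (2.5) for one fixed receptive field $c$ by weighted least squares: every training pair $(y_i, z_i)$ enters the fit with weight $K(y_i, y_c)$. The thesis uses a weighted PLS fit; plain weighted linear regression is shown here as a simpler stand-in, which, as noted above, is adequate for the 2-dimensional slope inputs.

```python
import numpy as np

def fit_receptive_field(center, Y, z, lam):
    """Weighted least-squares fit of one receptive field's local linear model.

    Minimizes sum_i K(y_i, y_c) * (z_i - (b_c0 + b^T y_i))^2, i.e., the
    contribution of a single field c to the cost (2.5), with the local
    projections replaced by a plain linear fit (a simplification of PLS).
    """
    w = np.exp(-np.sum((Y - center) ** 2, axis=1) / lam)  # K(y_i, y_c)
    X = np.hstack([np.ones((len(Y), 1)), Y])              # [1, pitch, roll]
    sw = np.sqrt(w)[:, None]
    # Solve the weighted problem as an ordinary LS problem on scaled rows
    beta, *_ = np.linalg.lstsq(sw * X, np.sqrt(w) * z, rcond=None)
    return beta  # beta[0] = b_c0, beta[1:] = local linear coefficients
```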

For now, the training is done in batch mode, but the LWPR algorithm has been selected with a view toward future online training onboard the rover. In particular, the properties of the receptive field regression approach that we find valuable are: the concept of a receptive field, which makes keeping a huge amount of data in memory unnecessary; the adaptability of creating and removing receptive fields as needed; and the possibility of easily extending the approach to online learning. We should note that modeling with local nonlinear regression imposes very little restriction on the functional dependency: it allows the dependency to be nonlinear but does not assume any particular model; instead, the model is learned from the data. An experimental comparison with a Neural Network algorithm is given in Section 2.9.3.3.