• The paper being reviewed was authored by Zuluaga, Krause, Sergent, and Püschel of ETH Zurich, and presented at ICML 2013.
• It proposes a solution to the problem of multi-objective optimization, which is frequently encountered in many fields, including science and engineering.
• The problem is to identify those designs from a given set that simultaneously optimize multiple objectives.
• Usually there is no single optimal design but an entire set of Pareto-optimal ones, each representing a different trade-off among the objectives.
• Often the evaluation of each design is so expensive that it is infeasible to evaluate every design to arrive at the Pareto-optimal set.
• The authors propose an active-learning-based algorithm to approximately identify the Pareto-optimal set while minimizing sampling cost.
• Active learning is a form of semi-supervised learning in which the algorithm intelligently and iteratively chooses the data points to be labeled. The aim is to minimize the sampling cost while still achieving a target accuracy.
INTRODUCTION
MAIN CONTRIBUTIONS
• All the f_i are non-negative (if the minimum of an objective is known, this property can be established by a suitable shift).
• Each objective function f_i is modeled as a sample from an independent Gaussian process (GP) distribution. The GP generalizes the Gaussian distribution from random variables to functions: it specifies a distribution over a function space. It is non-parametric and is specified by a mean function m over E and a covariance function k over E × E.
• The noise in the evaluations of the objective functions at the points chosen by PAL is assumed to be i.i.d. Gaussian.
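The GP model underlying these assumptions can be made concrete with a minimal sketch of posterior inference, here with a squared-exponential kernel (the kernel choice, hyperparameter values, and function names are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def sq_exp_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance k(x, x') evaluated over two point sets."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return signal_var * np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior(X_train, y_train, X_test, noise_var=1e-3):
    """Posterior mean and variance of f at X_test, given noisy evaluations
    y_train = f(X_train) + Gaussian noise (the i.i.d. noise assumption)."""
    K = sq_exp_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_s = sq_exp_kernel(X_train, X_test)
    K_ss = sq_exp_kernel(X_test, X_test)
    # Cholesky-based solve for numerical stability.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - (v * v).sum(axis=0)
    return mean, var
```

One such model would be maintained per objective f_i; the predictive variance is what drives PAL's sampling decisions.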
ASSUMPTIONS
THEORETICAL ANALYSIS
• Experiments were carried out on three real-world data sets.
• In most cases PAL requires about 33% fewer evaluations to achieve a given accuracy than ParEGO, a state-of-the-art multi-objective optimization method based on evolutionary algorithms.
CONCLUSION
In this paper the authors addressed the challenging problem of predicting the set of Pareto-optimal solutions in a design space from evaluations of only a subset of the designs. They use Gaussian processes to model the objective functions and to guide the sampling process, improving the prediction of the Pareto-optimal set.
The main contributions of the authors include:
• The PAL (Pareto Active Learning) algorithm, which efficiently (i.e., with few evaluations) identifies the set of Pareto-optimal designs in a multi-objective scenario with expensive evaluations, and which gives the user control over accuracy and sampling cost
• The analysis of PAL that provides theoretical bounds on the algorithm’s sampling cost to achieve a desired target accuracy
• An experimental evaluation to demonstrate PAL’s effectiveness on three real-world multi-objective optimization problems.
Active Learning for Multi-Objective Optimization
Sidharth Gupta
Dept. of Computer Science and Engineering, IIT Kanpur
BACKGROUND
The notation used throughout this poster:
• E is the design space (a finite subset of R^d)
• f1, f2, …, fn are the n objective functions
• f is the vector of all objective functions
• f(E) is the objective space (a subset of R^n)
• P is the set of Pareto-optimal points (a subset of E)
• P' is the set of Pareto-optimal points predicted by PAL
• The hypervolume error between f(P) and f(P') is used as the measure of prediction quality
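To illustrate the notation, here is a naive sketch of Pareto-set extraction and the 2-D hypervolume (maximization, reference point at the origin). The function names and the O(n²) dominance check are my own simplifications, not the paper's; the hypervolume error would be the difference between this quantity for f(P) and for f(P'):

```python
import numpy as np

def pareto_set(F):
    """Indices of the Pareto-optimal rows of F (maximization): a row is kept
    unless some other row is >= in every objective and > in at least one."""
    n = len(F)
    keep = []
    for i in range(n):
        dominated = any(
            np.all(F[j] >= F[i]) and np.any(F[j] > F[i])
            for j in range(n) if j != i)
        if not dominated:
            keep.append(i)
    return keep

def hypervolume_2d(front, ref=(0.0, 0.0)):
    """Area dominated by a 2-D maximization front relative to ref.
    Assumes the points are mutually non-dominated, so sorting by f1
    ascending makes f2 descend; the area is a sum of disjoint strips."""
    pts = sorted(map(tuple, front))
    hv, prev_x = 0.0, ref[0]
    for x, y in pts:
        hv += (x - prev_x) * (y - ref[1])
        prev_x = x
    return hv
```

For example, the front {(1,3), (2,2), (3,1)} dominates an area of 6 with respect to the origin.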
PAL ALGORITHM
• The algorithm is iterative.
• In each iteration, we first train the GP model on the subset of E at which the objective functions have already been evaluated. This yields a posterior distribution for f, given the prior distribution (specified initially) and the observed values of f at those points.
• Using this GP model, we predict f(x) at every point x in E that has not yet been evaluated, obtaining a multivariate Gaussian distribution for each. The predictive uncertainty in f(x) reflects the remaining uncertainty in f, for which the GP specifies only a distribution over the function space.
• This uncertainty information is used to guide the sampling and to make probabilistic statements about the Pareto-optimality of every point x.
• Once a point has been classified as Pareto-optimal or non-Pareto-optimal, it is never reclassified. The algorithm terminates when all points have been classified.
EXPERIMENTAL RESULTS
REFERENCES
• Knowles, J. ParEGO: A Hybrid Algorithm with On-Line Landscape Approximation for Expensive Multiobjective Optimization Problems. IEEE Transactions on Evolutionary Computation, 10(1):50–66, 2006.
• Rasmussen, C. E. and Williams, C. K. I. Gaussian Processes for Machine Learning. MIT Press, 2006.