II.2 Surrogate Modeling Techniques
II.2.3 Analysis Frameworks
In their review of the response surface modeling literature, Razavi et al. (2012a) conclude that single hidden layer ANNs are the most popular form for water resources applications. ANNs are capable of handling large amounts of training data, and it is generally believed that more input data results in a better-generalized model; however, large amounts of data can require additional computational time for training and may trap the training process at a local (rather than global) solution (Zou et al., 2007). Finally, it should be noted that some references, including the MATLAB® Neural Network Toolbox, consider RBFs a type of feedforward ANN (Razavi et al., 2012a).
This one-stage training point selection method can provide a globally stronger surrogate model initially, but the model may not accurately represent the original model in regions of interest, which could lead to failure in both search and sampling applications (Razavi et al., 2012a). To avoid poor performance in regions near optimal conditions, Bliznyuk et al. (2008) narrowed the search region by applying optimization techniques directly to the original model, and then fit a surrogate model only in the local optimal region. While this may be beneficial for off-line problems where global accuracy is not required, applying optimization procedures to original simulation models may not be computationally feasible.
Adaptive-Recursive Framework
The adaptive-recursive framework is similar to a basic sequential framework, with the addition of surrogate refinement using a two-stage point selection process. This framework also follows a three-step process (sketched in code after the list):
1. Develop a design of experiments in which a predetermined number of samples are taken throughout the feasible space and, in the case of a search analysis, objective function values at each location are evaluated by the original simulation model.
2. Build a surrogate model and tune its parameters.
3. Identify regions of interest using a search or sampling algorithm, sample additional points in these regions using the original simulation model, and repeat Steps 2 and 3 until convergence is reached.
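A minimal sketch of this loop is given below, using a radial basis function surrogate from SciPy as a stand-in for any surrogate form; the test objective, sample counts, bounds, and iteration budget are illustrative assumptions rather than part of the framework as published.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator
from scipy.optimize import differential_evolution

def expensive_model(x):
    # Stand-in for the original simulation model (illustrative assumption).
    return np.sum((x - 0.3) ** 2, axis=-1)

rng = np.random.default_rng(0)
bounds = [(0.0, 1.0), (0.0, 1.0)]

# Step 1: design of experiments -- here, simple random sampling of the
# feasible space, with each sample evaluated by the original model.
X = rng.uniform(0.0, 1.0, size=(20, 2))
y = expensive_model(X)

for iteration in range(15):
    # Step 2: build (refit) the surrogate on all points evaluated so far; the
    # small smoothing term guards against near-duplicate training points.
    surrogate = RBFInterpolator(X, y, smoothing=1e-9)

    # Step 3: search the surrogate for a region of interest...
    result = differential_evolution(
        lambda x: surrogate(x.reshape(1, -1))[0], bounds, seed=iteration
    )

    # ...then sample that point with the original model and add it to the
    # training set before refitting in the next pass.
    X = np.vstack([X, result.x])
    y = np.append(y, expensive_model(result.x))

# The best point found during the process is taken as the final solution.
print("best point:", X[np.argmin(y)], "value:", y.min())
```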
When used for optimization searching, the best point found during the framework process is generally considered the final optimal solution (Razavi et al., 2012a). Zou et al. (2007) employed an adaptive strategy for ANN-enabled optimization of a water quality modeling problem, citing previous linked ANN optimization studies which failed to perform well under off-line sampling. While the adaptive-recursive framework seeks to address the drawbacks of the off-line method, there are cases where it may fail to find solutions in the true function's optimal region (Jones, 2001). Such failures may manifest as new sampling points being added in close proximity to preexisting training points (thereby adding no additional knowledge for response surface training) or as convergence to locally, rather than globally, optimal solutions.
Metamodel-Embedded Evolution Framework
The metamodel-embedded evolution framework is similar to the adaptive-recursive framework but is designed for use with evolutionary optimization procedures. With this method, an initial sampling plan stemming from a formal design of experiments is not required. Rather, a population-based optimization algorithm such as a GA is first run for several generations, computing function values from the original simulation model; these data points are used to fit a surrogate model. In all subsequent generations, individuals are evaluated by either the surrogate or the original model using a pre-defined process, which has been termed evolution control by Jin et al. (2002b). Jin explains that this can be performed in two ways: either by designating a certain number of individuals (called controlled individuals) within each generation to be evaluated using the original fitness function, or by introducing controlled generations in which all individuals in that generation are evaluated by the original fitness function. All other individuals are evaluated by the surrogate model. Depending on the approach taken, modelers must decide either the number of controlled individuals or the number of controlled generations; the process can be made more complex by adaptively changing these parameters as the optimization algorithm progresses. The surrogate model is refitted occasionally as training points are added to the set. For an optimization process to find global optima under this framework, the evolutionary algorithm chosen must be a global optimizer, and any individual in any generation should have some probability of being evaluated through the original simulation model.
Otherwise, failure modes similar to those occurring in an adaptive-recursive framework are possible (Razavi et al., 2012a). It is also important that the initial collection of individuals is well-distributed and approximates the response surface well, as all following generations are conditioned on this set of individuals. If this is not fulfilled, the evolutionary optimization algorithm may fail to find a global solution (Broad et al., 2005).
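The individual-based variant of evolution control can be sketched as follows. The GA operators are reduced to a crude placeholder, and the population size, number of controlled individuals, refit schedule, and test fitness function are illustrative assumptions.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def original_fitness(x):
    # Stand-in for the expensive original simulation model (assumption).
    return np.sum(x ** 2, axis=-1)

rng = np.random.default_rng(1)
pop_size, n_dim, n_controlled = 30, 2, 5

# The initial generations run entirely on the original model; those
# evaluations become the first surrogate training set.
archive_X = rng.uniform(-1.0, 1.0, size=(3 * pop_size, n_dim))
archive_y = original_fitness(archive_X)

population = rng.uniform(-1.0, 1.0, size=(pop_size, n_dim))
for generation in range(20):
    surrogate = RBFInterpolator(archive_X, archive_y, smoothing=1e-9)

    # Evolution control: a fixed number of controlled individuals per
    # generation use the original fitness function; the rest use the surrogate.
    controlled = rng.choice(pop_size, size=n_controlled, replace=False)
    fitness = surrogate(population)
    fitness[controlled] = original_fitness(population[controlled])

    # Controlled evaluations enter the training set; the surrogate is refit
    # at the top of the next generation.
    archive_X = np.vstack([archive_X, population[controlled]])
    archive_y = np.append(archive_y, fitness[controlled])

    # Selection/crossover/mutation would update the population here (elided);
    # this crude placeholder keeps the best half and perturbs it.
    survivors = population[np.argsort(fitness)[: pop_size // 2]]
    population = np.vstack([survivors, survivors + rng.normal(0, 0.1, survivors.shape)])
```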
Approximation Uncertainty-Based Framework
The approximation uncertainty-based framework relies on the basic shell of the adaptive-recursive framework while incorporating surrogate model uncertainty in the sampling decision process. This method has been used extensively in the structural (Bichon et al., 2013; Sóbester et al., 2005), aerospace (Basudhar et al., 2012; Queipo et al., 2005), manufacturing (Boukouvala and Ierapetritou, 2013; Chen et al., 2012; Huang et al., 2006), and petroleum engineering (Horowitz et al., 2010; Queipo et al., 2002) fields, but with the exception of the work of Mugunthan and Shoemaker (2006) and di Pierro et al. (2009) it has not been well-employed in the water resources arena. While the adaptive-recursive framework assumes surrogate approximate values to be true, this may not be so in many regions of the design space, including in globally optimal regions. This technique relies on an approximation uncertainty quantity, which is readily available in certain surrogate forms, including kriging and Gaussian RBF models. The three steps involved in this framework are:
1. Develop a design of experiments in which a predetermined number of samples are taken throughout the feasible space and, in the case of a search analysis, objective function values at each location are evaluated by the original simulation model.
2. Build a surrogate model and tune its parameters.
3. Optimize a new surface function that balances the desire to minimize model uncertainty with the search for globally optimal results.
The third step aims to balance exploration and exploitation (Razavi et al., 2012a). Different methods have been developed to perform this step, but the maximization of an expected improvement function (EIF) can be considered the most advanced approach. An EIF can be used to select additional training data for the surrogate model by calculating the “expectation that any point in the search space will provide a better solution than the current best solution based on the expected values and variances predicted” by the current surrogate model (Bichon et al., 2013).
The EIF at any location x for a kriging metamodel prediction can be expressed as
$$
EI(\mathbf{x}) = \left(f(\mathbf{x}^*) - \mu_{\hat{f}}(\mathbf{x})\right)\Phi\!\left(\frac{f(\mathbf{x}^*) - \mu_{\hat{f}}(\mathbf{x})}{\sigma_{\hat{f}}(\mathbf{x})}\right) + \sigma_{\hat{f}}(\mathbf{x})\,\phi\!\left(\frac{f(\mathbf{x}^*) - \mu_{\hat{f}}(\mathbf{x})}{\sigma_{\hat{f}}(\mathbf{x})}\right) \tag{II.19}
$$

where $f(\mathbf{x}^*)$ is the current best function value, located at $\mathbf{x}^*$, found by the optimization routine; $\mu_{\hat{f}}(\mathbf{x})$ is the mean of the kriging prediction at $\mathbf{x}$; $\sigma_{\hat{f}}(\mathbf{x})$ is the standard deviation of the kriging prediction at $\mathbf{x}$; and $\Phi$ and $\phi$ are the standard normal cumulative distribution and probability density functions, respectively.
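Equation II.19 transcribes directly into code. The sketch below assumes the kriging mean and standard deviation at each candidate point are already available from whatever kriging implementation is in use, and guards against the zero-variance case that arises at existing training points.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best):
    """Eq. II.19: EI at points with kriging predictive mean `mu` and standard
    deviation `sigma`, given the current best (minimum) observed value."""
    mu, sigma = np.asarray(mu, dtype=float), np.asarray(sigma, dtype=float)
    improvement = f_best - mu
    with np.errstate(divide="ignore", invalid="ignore"):
        z = improvement / sigma
        ei = improvement * norm.cdf(z) + sigma * norm.pdf(z)
    # At training points sigma -> 0, and the expected improvement is zero.
    return np.where(sigma > 0, ei, 0.0)
```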
A global optimization routine must be used to determine the maximum of the EIF; the branch-and-bound algorithm (Jones et al., 1998), the DIRECT method (Bichon et al., 2013), and GAs (di Pierro et al., 2009) have been used successfully for this application.
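As a stand-in for those purpose-built optimizers, a sketch can use any general-purpose global routine; here SciPy's differential evolution (an evolutionary method, in the spirit of the GA option) maximizes the expected_improvement function defined above. The kriging_predict interface, returning the predictive mean and standard deviation at a single point, is a hypothetical placeholder.

```python
from scipy.optimize import differential_evolution

def next_sample(kriging_predict, f_best, bounds):
    # `kriging_predict(x)` is a hypothetical interface returning the kriging
    # mean and standard deviation at a single point x.
    def negative_ei(x):
        mu, sigma = kriging_predict(x)
        # Minimizing -EI is equivalent to maximizing the EIF.
        return -float(expected_improvement(mu, sigma, f_best))
    return differential_evolution(negative_ei, bounds).x
```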
Developed by Jones et al. (1998), the efficient global optimization (EGO) algorithm is a commonly-used optimizer which utilizes an EIF for sampling point search. EGO works well when the function shape and smoothness are generally well-estimated from an initial collection of training points; however, if these are badly approximated due to poorly distributed design sites, the process may converge slowly or prematurely stall (Jones, 2001; Razavi et al., 2012a). EGO will not attempt to add training points identical to those already in the set, but as the optimizer converges there is potential to create an ill-conditioned correlation matrix in the kriging model due to newly-added points being located near previously sampled points in the training set. This can be overcome by using an uncertainty ratio to remove points that are deemed “too close” to other points or by employing a “layering” method which “uses separate kriging models for short and long correlation lengths” (Bichon et al., 2013).
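The first of those remedies amounts to rejecting candidate training points that fall within some tolerance of existing design sites before they enter the kriging correlation matrix. In the sketch below, the fixed min_dist threshold is an illustrative simplification; Bichon et al. (2013) derive their criterion from an uncertainty ratio instead.

```python
import numpy as np

def far_enough(x_new, X_train, min_dist=1e-3):
    # Reject a candidate that sits too close to existing design sites, which
    # would make the kriging correlation matrix ill-conditioned. The fixed
    # `min_dist` threshold is an illustrative simplification.
    distances = np.linalg.norm(X_train - np.asarray(x_new), axis=1)
    return bool(np.all(distances >= min_dist))
```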
The EGO algorithm’s initial formulation is intended for single objective optimization, but it has been extended to perform multiobjective optimization as well. ParEGO (Knowles, 2006) does this by applying weighting factors to aggregate all objectives into a single function, SMS-EGO (Pon- weiser et al., 2008) incorporates multiple surrogates to simulate multiple objectives, and Shinkyu and Obayashi’s multi-EGO procedure embeds a multiobjective GA into an EGO-based framework (Shinkyu and Obayashi, 2005).