II.2 Surrogate Modeling Techniques

II.2.4 Response Surface Surrogate Usage in Water Resources

used optimizer which utilizes an EIF for sampling point search. EGO works well when the function shape and smoothness are generally well-estimated from an initial collection of training points; however, if this is badly approximated due to poorly distributed design sites, the process may converge slowly or prematurely stall (Jones, 2001; Razavi et al., 2012a). EGO will not attempt to add training points identical to those already in the set, but as the optimizer converges there is potential to create an ill-conditioned correlation matrix in the kriging model due to newly-added points being located near previously sampled points in the training set. This can be overcome by using an uncertainty ratio to remove points that are deemed “too close” to other points or employing a “layering” method which “uses separate kriging models for short and long correlation lengths” (Bichon et al., 2013).
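The sampling criterion mentioned above can be illustrated with a minimal sketch. It assumes that EIF refers to the standard expected improvement function for a minimization problem and that the kriging model supplies a predicted mean and standard deviation at each candidate point. The function name and arguments are illustrative and are not taken from any of the cited works.

```python
import math


def expected_improvement(mu, sigma, f_min):
    """Expected improvement of a candidate point (minimization).

    mu, sigma: kriging mean and standard deviation predicted at the candidate.
    f_min: lowest objective value among the current training points.
    """
    if sigma <= 0.0:
        # No predictive uncertainty, e.g. at an already-sampled point.
        return max(f_min - mu, 0.0)
    z = (f_min - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (f_min - mu) * cdf + sigma * pdf
```

At a previously sampled point the kriging variance is zero and the prediction cannot beat the best observed value, so the criterion returns zero there. This is consistent with the statement that EGO does not re-select existing training points.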

The EGO algorithm’s initial formulation is intended for single objective optimization, but it has been extended to perform multiobjective optimization as well. ParEGO (Knowles, 2006) does this by applying weighting factors to aggregate all objectives into a single function, SMS-EGO (Ponweiser et al., 2008) incorporates multiple surrogates to simulate multiple objectives, and Shinkyu and Obayashi’s multi-EGO procedure embeds a multiobjective GA into an EGO-based framework (Shinkyu and Obayashi, 2005).
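As an illustration of the weighted aggregation used by ParEGO, the sketch below implements an augmented Tchebycheff scalarization, which is the form commonly associated with Knowles (2006). The text above does not give the formula, so it is an assumption here, as are the default value of `rho` and the requirement that objectives are already normalized to [0, 1] and are to be minimized.

```python
def augmented_tchebycheff(objectives, weights, rho=0.05):
    """Collapse several normalized objective values into one scalar cost.

    objectives: objective values scaled to [0, 1] (lower is better).
    weights: non-negative weights that sum to 1.
    rho: small constant weighting the additive term (assumed value).
    """
    weighted = [w * f for w, f in zip(weights, objectives)]
    return max(weighted) + rho * sum(weighted)


# Two objectives with equal weights.
cost = augmented_tchebycheff([0.2, 0.8], [0.5, 0.5])
```

Drawing a different weight vector at each iteration lets a single-objective EGO loop explore different regions of the trade-off surface.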

Surrogates in Automatic Calibration Procedures

Most water resources publications that use metamodels to aid automatic calibration routines address watershed models. Liong et al. (2001), Khu and Werner (2003), and Khu et al. (2004) used ANN metamodels in automatic calibration procedures to find optimal parameter values for the rainfall-runoff models HydroWorks, the Storm Water Management Model, and MIKE 11/NAM, respectively. These procedures use feedforward ANNs to estimate the response of the catchment model, allowing for faster search by GA of the parameter space. In both Liong et al. (2001) and Khu and Werner (2003), the ANN metamodel is not fit over a set of uniform training points found from a formal DoE; instead, initial optimization trials are conducted on the original simulation and the points evaluated during this process are used for fitting. Liong et al. (2001) found that a network with three hidden layers, trained on data from six storm events, accurately reproduces the original HydroWorks model in all regions of the parameter space; however, in regions near closely spaced training points a linear interpolation approach performs just as well. Khu and Werner (2003) and Khu et al. (2004) both use a single hidden layer. To avoid overfitting the ANN model, Khu and Werner (2003) employ the early stopping approach; while this procedure saves 80% of full evaluations, it can limit the number of unique design sites available for training, testing, and validation sets.

Additional studies have developed automatic calibration procedures for the SWAT watershed model using various surrogate model forms. Shoemaker et al. (2007) incorporate RBF models within an evolutionary framework, screening offspring by the fitness predicted by the RBF model and then confirming optimal values with the computationally expensive SWAT model. In comparing the results of the evolutionary algorithm combined with RBF approximation to other calibration methods, they conclude that it is “the most effective algorithm when there was a severe limitation on the number of simulations that can be performed” and methods with model approximation “should be seriously considered as alternatives to widely used methods such as SCE [Shuffled Complex Evolution] and evolutionary algorithms without function approximation when the complexity of the simulation model limits the number of simulations that can feasibly be done.” Zhang et al. (2009) approximated the SWAT model by one-hidden-layer ANN and SVM, tested both methods on two watersheds in the eastern United States, and determined that the SVM form resulted in better generalized models than those constructed using ANNs. Razavi et al. (2012b) compared the behavior of two SWAT metamodel-enabled calibration optimizers, kriging-GA and Multistart Local Metric Stochastic RBF, with two optimizers without metamodeling, dynamically dimensioned search and GA. They concluded that kriging-GA and dynamically dimensioned search performed similarly in all computational budget settings, with kriging-GA performing slightly better when a harsh limit is placed on the number of allowable function evaluations.
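The surrogate screening step attributed to Shoemaker et al. (2007) above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than their implementation: the surrogate is passed in as an arbitrary callable standing in for a fitted RBF model, and `screen_offspring`, `toy_expensive`, and `toy_surrogate` are hypothetical names used only for the example.

```python
def screen_offspring(offspring, surrogate, expensive_model, n_confirm):
    """Rank offspring by surrogate fitness, then confirm only the best few.

    offspring: candidate parameter sets produced by the evolutionary operators.
    surrogate: cheap callable estimating calibration error (lower is better).
    expensive_model: costly callable returning the true calibration error.
    n_confirm: number of top-ranked offspring sent to the expensive model.
    """
    ranked = sorted(offspring, key=surrogate)
    return [(x, expensive_model(x)) for x in ranked[:n_confirm]]


# Hypothetical toy problem: the "expensive" error is a quadratic bowl and the
# surrogate is a deliberately imperfect approximation of it.
calls = []


def toy_expensive(x):
    calls.append(x)  # record each costly evaluation
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


def toy_surrogate(x):
    return abs(x[0] - 1.0) + abs(x[1] + 2.0)


offspring = [(0.0, 0.0), (1.0, -2.0), (3.0, 1.0), (0.9, -1.8)]
confirmed = screen_offspring(offspring, toy_surrogate, toy_expensive, n_confirm=2)
```

Only two of the four candidates reach the expensive model in this example, which is the source of the savings when each simulation is costly.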

Computationally expensive groundwater models can also be calibrated via surrogate-enabled procedures. Rizzo and Dougherty (1994) used a neural kriging network, which consists of both training and spatial interpolation phases, to estimate hydraulic conductivity fields in both two- and three-dimensional aquifer models using limited field data. Johnson and Rogers (2000) tested the accuracy of using linear regression and ANN models for automatic calibration of the 2D finite-difference groundwater model SUTRA, using simulated annealing techniques to search the parameter space. The authors included linear approximator tests, which failed to reproduce the high-fidelity model, in their study to avoid “the pitfall of addressing a problem with an unnecessarily complex method,” but acknowledged that from the outset they had not expected the linear approximators to perform well. Mugunthan et al. (2005) tested two RBF-based function approximation methods (Regis and Shoemaker, 2004; Gutmann, 2001) within various optimization algorithms for autocalibration of chlorinated ethene biodegradation in an aquifer. The original simulation model, DECHLOR, is a multispecies reactive transport model that uses the finite difference model MODFLOW for flow computations and the reactive transport model RT3D for contaminant transport computations. For their field case study, the original model requires 2.5 hours to complete a single simulation, making it very poorly suited for use directly within an optimization routine. The calibration routine computes objective function values at each evaluation point through the original groundwater model and then fits an RBF surface to aid in optimization search. Both function approximation models performed well, with the model developed by Regis and Shoemaker (2004) performing best for minimizing overall errors in the final calibrated model form.

Automatic calibration routines that incorporate surrogate model forms have also been developed for surface water body models. Zou et al. (2007) demonstrated how an adaptive ANN-GA approach can determine values for 19 calibration parameters which minimize errors in relation to measured values for a eutrophication model (WASP5/EUTRO) linked to a previously calibrated CE-QUAL-W2 hydrodynamic model. The 19 calibration parameters were first determined through a sensitivity analysis, and various ANN models were created to emulate the eutrophication model. The authors determined that an adaptive ANN-GA procedure (which starts with a limited training set and adds further training data during optimization) converges closer to the global optimal solution than a one-step ANN-GA process (which starts with a robust training set but adds no further training data during optimization). The total computational time from training data generation through optimization for this method is about 6.5 days of continuous computation, which largely consists of training data generation and ANN training time. Huang and Liu (2010) performed a similar analysis for calibration of a CE-QUAL-W2 hydrodynamic and WQM, in which 26 calibration parameters were determined by sensitivity analysis in terms of their ability to predict 6 hydrodynamic and water quality outputs (including vertical profile measurements). They also concluded that an adaptive procedure performs better than a one-step procedure and that the largest computational expense comes from generation of training data through runs of the original high-fidelity model.

Ostfeld and Salomons (2005) also demonstrated a routine for autocalibration of a CE-QUAL-W2 model using a k-nearest neighbors algorithm (kNN) for approximating the error resulting from various parameter combinations. A GA was used for searching. Two application locations were used: a hypothetical reservoir was used to tune the GA-kNN parameters, while a model of the Lower Columbia Slough water body was used to demonstrate autocalibration for temperature and DO prediction. The coupled GA-kNN algorithm produced results similar to those of a pure GA (without model reduction), while reducing computational expense.
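A kNN error approximation of the kind described above can be sketched as follows. The text does not state how the neighbors are weighted or which distance metric is used, so the unweighted mean and Euclidean distance below are assumptions. The archive values are made up for illustration.

```python
import math


def knn_error(archive, params, k=3):
    """Estimate calibration error as the mean error of the k nearest archived runs.

    archive: list of (parameter_tuple, error) pairs from completed model runs.
    params: parameter tuple whose error is to be estimated without a model run.
    """
    nearest = sorted(archive, key=lambda item: math.dist(item[0], params))[:k]
    return sum(err for _, err in nearest) / len(nearest)


# Hypothetical archive of four evaluated parameter sets.
archive = [
    ((0.0, 0.0), 1.0),
    ((1.0, 0.0), 2.0),
    ((0.0, 1.0), 3.0),
    ((5.0, 5.0), 10.0),
]
estimate = knn_error(archive, (0.1, 0.1), k=3)
```

In a GA-kNN coupling, an estimate like this would replace some of the full simulation runs when ranking candidate parameter sets.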

Surrogates in Operations and Design Optimization

One of the earliest examples of surrogate-enabled optimization in water resources to minimize computational expense can be found in the work of Alley (1986), which expanded on the work of Gorelick et al. (1984) by creating response functions of computationally expensive contaminant transport models using polynomial regression. These regressions are functions of pumping-recharge rates at several wells, which form the decision variables of an optimization problem that minimizes groundwater contamination concentration, and are generated from the results of multiple transport simulation model runs. Lefkoff and Gorelick (1990) expanded on Alley’s work by using regression to predict salt mass, rather than concentration, in an irrigated stream-aquifer system in the Arkansas Valley in southeastern Colorado. Although this study did not employ optimization in the formal sense, the salt transport surrogate results were incorporated into a larger economic-hydrologic-agronomic model which serves as a tool for analyzing the relationship between crop mixing and profit in farming. This linked model system could be further formalized within an optimization routine to determine optimal trade-off points. Cooper et al. (1998) also developed a simulation/regression/optimization model for optimization of the oil recovery process from groundwater, expanding to a non-steady state problem. Response functions for residual oil and free oil were created using outputs from multiple runs of the ARMOS 2D finite element flow simulator, and verification of the surrogate-enabled optimization results by ARMOS simulation shows small error levels.

Noting a need to expand these ideas to surface water applications, Ejaz and Peralta (1995) incorporated water quality processes from the QUAL2E simulation model within a simulation-optimization model via simplified regression equations. From the results of numerous systematic QUAL2E simulations, regression equations with a traditional mass balance form best fit all constituent response data with the exception of DO, which required a more detailed equation as a function of mass flow rates of BOD5, total nitrogen, and chlorophyll a. A verification step was included following nonlinear optimization to confirm that the regression equations' predictions were acceptably close to those of QUAL2E. Saad et al. (1996) employed RBF ANNs to decompose the optimal operating policies obtained through dynamic programming for a reservoir system, which were combined to form one equivalent reservoir of equal potential energy. Using historical flow records, 500 equally likely deterministic inflow sequences were generated as inputs, and a year’s optimal operations and corresponding potential energy were found for each on a monthly timestep. This formed the data set used for ANN training, and a fuzzy clustering approach was used to compute RBF parameters. Neelakantan and Pundarikanthan (1999) also used an ANN for simulation of a reservoir system’s operation as a substitute for a conventional simulation model, with the goal of maximizing drinking water supply. The monthly conventional mass-balance simulation model inputs and results were used to train a three-layer feedforward ANN, which was then embedded within a nonlinear optimization algorithm. Training each ANN required 8 hours of computational time, but the ANN model was reported to run 300 times faster than the conventional model. Solving the optimization problem took as long as 15 days of continuous computations using the conventional model, but only a few hours with the ANN model.

Castelletti et al. (2010) used response surface methods to optimize the number and location of water quality rehabilitation devices (i.e., mixers) in order to improve overall water quality in the Googong Reservoir in Australia. The 3-D coupled hydrodynamic-ecological model ELCOM-CAEDYM was used to compute training data for linear interpolators, RBF ANNs, and inverse distance weighting; the authors termed this step the “learning phase.” Then, during the “planning phase,” an approximate solution to the design problem is found. The learning and planning phases are performed iteratively to improve performance near the optimal solution(s), and at each iteration the response surface form with the smallest errors was chosen. Their results showed that significant improvements were possible by simply moving the currently installed mixers and that an additional pair of mixers would further improve destratification. To solve this design optimization problem using what-if analysis would “require about 5.5 years of computation with a modern computer” according to the authors.
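Two ideas from the paragraph above are inverse distance weighting and choosing the response surface form with the smallest errors. The sketch below illustrates both on made-up data. The error measure (RMSE on held-out points), the power parameter, and the nearest-neighbor alternative are assumptions introduced for the example and are not details reported for Castelletti et al. (2010).

```python
import math


def idw_predict(train, query, power=2.0):
    """Inverse distance weighting: average of outputs weighted by 1 / d**power."""
    num = den = 0.0
    for x, y in train:
        d = math.dist(x, query)
        if d == 0.0:
            return y  # query coincides with a training point
        w = 1.0 / d ** power
        num += w * y
        den += w
    return num / den


def nearest_predict(train, query):
    """Hypothetical alternative surface: output of the closest training point."""
    return min(train, key=lambda item: math.dist(item[0], query))[1]


def pick_surface(surfaces, train, holdout):
    """Return the name of the surface with the lowest RMSE on held-out points."""
    def rmse(predict):
        sq = [(predict(train, x) - y) ** 2 for x, y in holdout]
        return math.sqrt(sum(sq) / len(sq))
    return min(surfaces, key=lambda name: rmse(surfaces[name]))


# Made-up one-dimensional data: two training runs and one held-out run.
train = [((0.0,), 1.0), ((2.0,), 3.0)]
holdout = [((1.0,), 2.0)]
surfaces = {"idw": idw_predict, "nearest": nearest_predict}
best = pick_surface(surfaces, train, holdout)
```

Repeating the selection after each batch of new simulation runs mirrors the iterative learning and planning structure described above, in which the preferred surface form may change as training data accumulate.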