
6.3.2 Calibration analysis: nominal case

The “nominal case” that is discussed here is defined as the calibration analysis that considers only the data for the Q1 scenario and response 1, such that n = 1. The first step is to use the 300 observed simulator runs to construct a Gaussian process approximation that relates the calibration parameters, θ, to response 1. The resulting maximum likelihood estimates of the normalized correlation lengths indicate that the simulation is probably most sensitive to inputs 6, 11, 2, and 8 (in that order).
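To make the surrogate-construction step concrete, the following sketch fits an anisotropic Gaussian process to a set of simulator runs and ranks the inputs by their fitted correlation lengths (shorter lengths indicating greater sensitivity). It is a minimal illustration rather than the analysis code used here: the data are synthetic placeholders, the number of inputs and all variable names are assumptions, and scikit-learn's GaussianProcessRegressor stands in for whatever GP implementation was actually used.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Placeholder data standing in for the 300 simulator runs: d calibration
# inputs scaled to [0, 1] and a synthetic response (both purely illustrative).
rng = np.random.default_rng(0)
d = 12                                   # assumed number of calibration inputs
X = rng.uniform(size=(300, d))
y = np.sin(3.0 * X[:, 5]) + 0.5 * X[:, 10] + 0.1 * rng.normal(size=300)

# Anisotropic RBF kernel: one correlation length per calibration input.
kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(d),
                                   length_scale_bounds=(1e-2, 1e3))

# Maximum-likelihood estimation of the correlation lengths happens inside fit().
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True,
                              n_restarts_optimizer=10)
gp.fit(X, y)

# Shorter fitted length scales indicate inputs the response is more sensitive
# to; sorting them gives a sensitivity ordering analogous to the one above.
length_scales = gp.kernel_.k2.length_scale
sensitivity_order = np.argsort(length_scales) + 1   # 1-based input labels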

To illustrate the form of the response, several input/output plots based on the Gaussian process model are constructed. These plots display response 1 versus inputs 6 and 11 (the two most important inputs) and response 1 versus inputs 2 and 8 (the next two most important inputs). In each case, the values of the nine other inputs are held constant.² Figure 6.18 plots response 1 as a function of inputs 6 and 11 as both a mesh plot and a contour plot. The contour plot helps to illustrate the particular region of these inputs that matches well with the experimental observation (which is 0.41 for this case). A mesh/contour plot of σGP is also given to illustrate how the response surface approximation uncertainty varies in this domain. The corresponding three plots are also given for inputs 2 and 8 in Figure 6.19.
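The mesh/contour data behind plots such as Figures 6.18 and 6.19 can be generated by evaluating the GP mean (µGP) and standard deviation (σGP) over a grid in two inputs while holding the remaining inputs fixed. Continuing the sketch above (the 0-based column indices for inputs 6 and 11 are assumptions, and the sample mean is used as a stand-in for the estimated joint mode described in footnote 2):

```python
import numpy as np

# Assumed 0-based column indices for inputs 6 and 11.
i, j = 5, 10

# Stand-in for the estimated joint mode at which the other inputs are held fixed.
fixed = X.mean(axis=0)

g1 = np.linspace(X[:, i].min(), X[:, i].max(), 50)
g2 = np.linspace(X[:, j].min(), X[:, j].max(), 50)
G1, G2 = np.meshgrid(g1, g2)

grid = np.tile(fixed, (G1.size, 1))
grid[:, i] = G1.ravel()
grid[:, j] = G2.ravel()

# GP mean and standard deviation surfaces for the mesh/contour plots.
mu_gp, sigma_gp = gp.predict(grid, return_std=True)
mu_gp = mu_gp.reshape(G1.shape)
sigma_gp = sigma_gp.reshape(G1.shape)
```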

After specifying the GP response surface model, 25,000 MCMC samples are used to construct the posterior distribution of the calibration inputs. The marginal posterior distributions for the two most important inputs (6 and 11) are shown in Figures 6.20 and 6.21.³ The marginal posteriors for inputs 6 and 11 both suggest that the upper bounds for these variables should possibly be increased in the future. The remaining ten inputs show less deviation from their marginal prior distributions, and five of the inputs show almost no change from their priors (which means that marginally, all values within the respective ranges are equally effective at yielding a response consistent with the observation).
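The updating step can be sketched with a simple random-walk Metropolis sampler. The likelihood below treats the GP prediction as the model output and inflates the observation variance by the surrogate variance, which is one common way of folding response-surface uncertainty into the calibration (the precise form of Eq. (5.14) may differ). The observation 0.41 is taken from the text, while the experimental standard deviation, proposal scale, and starting point are illustrative assumptions.

```python
import numpy as np

y_obs = 0.41        # experimental observation for response 1 (from the text)
sigma_exp = 0.05    # assumed experimental standard deviation (illustrative)
n_samples = 25000
rng = np.random.default_rng(1)

def log_post(theta):
    """Log posterior: uniform prior on [0, 1]^d, Gaussian likelihood whose
    variance is the experimental variance plus the GP predictive variance."""
    if np.any(theta < 0.0) or np.any(theta > 1.0):
        return -np.inf
    mu, sd = gp.predict(theta.reshape(1, -1), return_std=True)
    var = sigma_exp**2 + sd[0]**2
    return -0.5 * (y_obs - mu[0])**2 / var - 0.5 * np.log(var)

theta = np.full(d, 0.5)          # arbitrary starting point
current = log_post(theta)
chain = np.empty((n_samples, d))
for t in range(n_samples):
    proposal = theta + 0.05 * rng.normal(size=d)   # random-walk proposal
    candidate = log_post(proposal)
    if np.log(rng.uniform()) < candidate - current:
        theta, current = proposal, candidate
    chain[t] = theta
```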

Two of the largest correlations between the updated inputs are between inputs 2 and 6, and between inputs 5 and 11. Multivariate kernel density estimation (Section 4.2.2) is used to display contour plots of the two bivariate marginal densities in Figures 6.22 and 6.23. Using Spearman's ρ, a non-parametric correlation measure, the correlations are −0.26 and −0.29, which seem fairly mild. However, as more variables are considered together, the correlation structure can only become more complicated, further reinforcing the fact that it is dangerous to assume the updated probability distributions to be independent of each other (as discussed in Section 6.3.4).
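The correlation and density summaries reported above can be reproduced from the MCMC samples with standard tools. The sketch below (continuing the previous ones, with assumed 0-based column indices for inputs 2 and 6) computes Spearman's ρ and a bivariate kernel density estimate suitable for contour plotting.

```python
import numpy as np
from scipy.stats import spearmanr, gaussian_kde

# Assumed 0-based columns: input 2 -> column 1, input 6 -> column 5.
a, b = chain[:, 1], chain[:, 5]

rho, _ = spearmanr(a, b)                 # non-parametric (rank) correlation

# Bivariate kernel density estimate of the joint marginal, evaluated on a
# grid; the resulting 'density' array is what a contour plot would display.
kde = gaussian_kde(np.vstack([a, b]))
xx, yy = np.meshgrid(np.linspace(a.min(), a.max(), 100),
                     np.linspace(b.min(), b.max(), 100))
density = kde(np.vstack([xx.ravel(), yy.ravel()])).reshape(xx.shape)
```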

² The particular values of the non-varying inputs are chosen based on the results of the calibration analysis itself. The values used are the estimated joint mode of the inputs, based on the observation of response 1. This means that the response plots will be relevant to the posterior distributions of the inputs.

³ To display the marginal posteriors, a beta distribution is fit to the posterior MCMC samples. The beta distribution is suitable because the variables have an upper and lower bound, and because the distributions are unimodal.

Figure 6.18: Gaussian process approximation to response 1 based on inputs 6 and 11. (a) Response based on Gaussian process model (µGP); (b) contour plot of response based on Gaussian process model (µGP); (c) uncertainty (σGP) associated with Gaussian process model.

Figure 6.19: Gaussian process approximation to response 1 based on inputs 2 and 8. (a) Response based on Gaussian process model (µGP); (b) contour plot of response based on Gaussian process model (µGP); (c) uncertainty (σGP) associated with Gaussian process model.

Figure 6.20: Marginal prior and posterior distributions for input 6 based on nominal calibration analysis (probability density versus input 6).

Figure 6.21: Marginal prior and posterior distributions for input 11 based on nominal calibration analysis (probability density versus input 11).


Figure 6.22: Estimated joint density of inputs 2 and 6.

As a check on the calibration analysis, the posterior predictive distribution is compared to the experimental observation. The posterior predictive distribution is simply obtained by propagating the posterior distribution of θ through the Gaussian process approximation to the simulator (recall that access to the simulator itself is not available). This comparison is illustrated below in Figure 6.24. The experimental observation is represented via a normal distribution with variance σ².⁴

Figure 6.23: Estimated joint density of inputs 5 and 11.

In addition, Figure 6.24 also shows the empirical probability density of the response values corresponding to the original 300 model runs. This distribution is included as a point of reference, and can be used to gauge the improvement associated with the calibrated parameter estimates. It is clear that the calibration analysis has resulted in model predictions (albeit predictions based on the response surface approximation model) that agree well with the observation, particularly in comparison to the original simulator runs.
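The posterior predictive comparison in Figure 6.24 amounts to pushing the posterior samples of θ back through the GP surrogate and comparing the resulting density with the experimental uncertainty distribution and with the spread of the original model runs. A minimal sketch, continuing the ones above (the observation and experimental standard deviation are as assumed earlier):

```python
import numpy as np
from scipy.stats import norm, gaussian_kde

# Propagate the posterior samples through the GP surrogate to obtain the
# posterior predictive distribution of response 1.
post_pred = gp.predict(chain)

# The three curves of Figure 6.24: original model runs, experimental
# uncertainty (normal with variance sigma^2), and the posterior predictive.
r = np.linspace(min(y.min(), post_pred.min()),
                max(y.max(), post_pred.max()), 200)
runs_density = gaussian_kde(y)(r)
exp_density = norm.pdf(r, loc=y_obs, scale=sigma_exp)
post_density = gaussian_kde(post_pred)(r)
```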

Since an independent uniform prior has been used for θ, it is expected that the posterior predictive distribution will be proportional to the experimental uncertainty distribution (assuming the calibration analysis is successful in matching the predictions with the observation). This would be the case, except for the fact that the uncertainty in the response surface approximation is included in the Bayesian updating (as in Eq. (5.14)). The additional uncertainty added by the response surface approximation causes the variance of the posterior to be greater than the variance/uncertainty of the experiments. In addition, the posterior is “pulled” very slightly away from the datum towards the area where there is less response surface approximation uncertainty. Since most of the original model runs correspond to values of the response that are less than the observation, the posterior shifts slightly away from the datum in this direction.

⁴ Recall from Section 5.3 that σ² can represent both error/uncertainty in the experimental observations and error/uncertainty in the model output. To simplify the visualization here, σ² is considered to be uncertainty associated with the experiment.

Figure 6.24: Distribution of original model runs, experimental uncertainty, and posterior predictive distribution for the nominal calibration analysis (probability density versus response 1).
