4.3 Model Calibration
4.3.8 Model Calibration under Uncertainty
The calibration problem discussed in Sections 4.3.1 to 4.3.5 considered cases in which point-valued, paired input-output data are available for calibration. Consider the basic problem with the model y = G(x;θ); this basic problem can be extended to include additional features, as explained below. These features cover situations where different sources of uncertainty may be present in the model and/or the data. The uncertainty in the model parameters is expected to increase with the presence of additional sources of uncertainty, and the goal is to quantify this uncertainty. From here on, the concept of likelihood and the Bayesian approach are pursued, since the least squares method can neither rigorously account for the various sources of uncertainty nor calculate the PDF of the model parameters.
4.3.8.1 Additional Sources of Uncertainty
The model prediction may also depend on other quantities α that are known to be uncertain and cannot be measured while collecting calibration data. Further, such quantities α are not calibrated because it may not be physically meaningful to calibrate them.
Hence, the model is represented as y = G(x;θ,α), and the uncertainty in α is described by its PDF f(α). Similar to Section 4.3.1, point-valued input-output data (xi vs. yi; i = 1 to n) are assumed to be available to calibrate θ; the difference now is that the model prediction at input xi is uncertain. Further, the model prediction at xi is not statistically independent of that at xj (i ≠ j) because the same PDF f(α) is used in both cases, even though the measurements (xi vs. yi) are independent of each other.
The likelihood function of θ can be constructed to include the uncertainty in α, as:
\[ L(\theta) = \int L(\theta,\alpha)\, f(\alpha)\, d\alpha \tag{4.15} \]
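The likelihood L(θ) is used in Bayesian inference to compute the posterior PDF of θ. In Eq. 4.15, the likelihood function L(θ,α) is calculated as:
\[ L(\theta,\alpha) \propto \prod_{i=1}^{n} \left[ \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{\left(y_i - G(x_i,\theta,\alpha)\right)^2}{2\sigma^2} \right) \right] \tag{4.16} \]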
In Eq. 4.16, the likelihood is calculated only for a particular value of α, and hence the independence between the measurements can be used to multiply the individual likelihoods.
Note that the calculation of the likelihood in Eq. 4.15 is a multi-dimensional integration, where the number of dimensions is equal to the number of uncertain quantities in α. When this likelihood is substituted into Bayes' theorem, the calculation of the posterior involves a further multi-dimensional integration, where the number of dimensions is equal to the number of calibration parameters in θ. Hence, a nested multi-dimensional integration is required. The presence of additional sources of uncertainty is discussed in further detail in Chapter IX.
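The nested integration described above can be illustrated with a minimal numerical sketch. It is not the implementation used in this dissertation: the linear form of `model`, the scalar θ and α, the uniform prior on a grid of θ values, and all function names are assumptions made only for illustration. The inner integral of Eq. 4.15 is approximated by averaging Eq. 4.16 over Monte Carlo samples of α, and the outer normalization of the posterior is carried out by a simple sum over the θ grid.

```python
import numpy as np

def model(x, theta, alpha):
    # Hypothetical stand-in for G(x; theta, alpha): a line with slope theta
    # and an uncertain, uncalibrated offset alpha.
    return theta * x + alpha

def likelihood_given_alpha(theta, alpha, x, y, sigma):
    # Eq. 4.16 evaluated for each alpha value: product of Gaussian densities
    # over the n independent measurements (rows index alpha, columns index data).
    alpha = np.atleast_1d(alpha)[:, None]
    resid = y[None, :] - model(x[None, :], theta, alpha)
    dens = np.exp(-0.5 * (resid / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return dens.prod(axis=1)

def marginal_likelihood(theta, alpha_samples, x, y, sigma):
    # Eq. 4.15 by Monte Carlo: average L(theta, alpha) over samples of f(alpha).
    return likelihood_given_alpha(theta, alpha_samples, x, y, sigma).mean()

def grid_posterior(theta_grid, alpha_samples, x, y, sigma):
    # Posterior PDF of theta on an evenly spaced grid, assuming a uniform prior;
    # the sum over the grid plays the role of the outer integration.
    like = np.array([marginal_likelihood(t, alpha_samples, x, y, sigma)
                     for t in theta_grid])
    dtheta = theta_grid[1] - theta_grid[0]
    return like / (like.sum() * dtheta)

# Illustrative usage with synthetic data (all values are made up).
rng = np.random.default_rng(0)
x_data = np.linspace(0.0, 1.0, 5)
y_data = 2.0 * x_data + 0.1
alpha_samples = rng.normal(0.0, 0.2, size=2000)   # samples of f(alpha)
theta_grid = np.linspace(1.0, 3.0, 201)
posterior = grid_posterior(theta_grid, alpha_samples, x_data, y_data, 0.1)
```

Reusing the same α samples for every value of θ keeps the estimate of L(θ) smooth across the grid. The cost still grows with the product of the inner sample size and the outer grid size, which is the nested integration noted above.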
4.3.8.2 Interval Data for Calibration
Consider the calibration problem with the model y = G(x;θ). Sometimes, the data for calibration are available in the form of intervals. For the sake of illustration, consider m intervals, [ai, bi] at the input level and the corresponding [ci, di] at the output level. How should the likelihood be constructed for this case? Censored data, often available in reliability analysis [79], are a special case of interval data. Suppose that the number of cycles to failure is measured in reliability testing; if the specimen does not fail until N cycles, then the number of cycles to failure is a censored interval, i.e., (N, ∞).
The likelihood-based approach for representation of interval data (developed earlier in Section 3.4) cannot be applied here because, if all the intervals were represented using a combined PDF, then the “orderedness” or “correspondence” between the input and output pairs would be lost. Hence, each interval has to be treated separately.
Each input interval is represented using a uniform distribution on the interval [ai, bi], and the corresponding PDF is denoted as f(χi) (i = 1 to m; ai ≤ χi ≤ bi). Note that ai, bi, and χi are vectors; each member of these vectors corresponds to a member of the input vector xi.
These PDFs can be used to construct the likelihood function for θ, in terms of the individual likelihoods, as:
\[ L(\theta) \propto \prod_{i=1}^{m} L_i(\theta) \tag{4.17} \]
where the individual likelihood Li(θ) can be calculated by including the PDF f(χi) as:
\[ L_i(\theta) = \int L_i(\chi_i,\theta)\, f(\chi_i)\, d\chi_i \tag{4.18} \]
The likelihood Li(χi,θ) in Eq. 4.18 is calculated for one realization of the input χi, as:
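\[ L_i(\chi_i,\theta) \propto \int_{y=c_i}^{y=d_i} \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{\left(y - G(\chi_i,\theta)\right)^2}{2\sigma^2} \right) dy \tag{4.19} \]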
Note that Eq. 4.19 integrates the PDF over the output interval, i.e., it uses CDF values to account for the interval data, in contrast to the PDF value used in Eq. 4.6. This is similar to the treatment of interval data in Section 3.4.1.
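Equations 4.17 to 4.19 can be sketched numerically as follows. This is only an illustrative sketch under stated assumptions: the input is a scalar, `model` is a hypothetical stand-in for G(χ;θ), the measurement standard deviation σ is taken as known, and the integral over the uniform input PDF in Eq. 4.18 is approximated with a midpoint rule. The function names are not from the dissertation.

```python
import math
import numpy as np

def normal_cdf(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def model(chi, theta):
    # Hypothetical stand-in for G(chi; theta).
    return theta * chi

def output_interval_prob(chi, theta, c, d, sigma):
    # Eq. 4.19: Gaussian probability mass on [c, d] for one input realization chi,
    # written as a difference of two CDF values.
    mu = model(chi, theta)
    return normal_cdf((d - mu) / sigma) - normal_cdf((c - mu) / sigma)

def interval_likelihood(theta, a, b, c, d, sigma, n_nodes=200):
    # Eq. 4.18 with f(chi) uniform on [a, b]: the integral reduces to the
    # average of Eq. 4.19 over the input interval (midpoint rule).
    h = (b - a) / n_nodes
    nodes = a + h * (np.arange(n_nodes) + 0.5)
    return float(np.mean([output_interval_prob(chi, theta, c, d, sigma)
                          for chi in nodes]))

def likelihood(theta, intervals, sigma):
    # Eq. 4.17: product over the m paired (input interval, output interval) data.
    out = 1.0
    for a, b, c, d in intervals:
        out *= interval_likelihood(theta, a, b, c, d, sigma)
    return out

# Illustrative usage: each tuple is (a_i, b_i, c_i, d_i); values are made up.
data = [(1.0, 1.2, 1.9, 2.5), (2.0, 2.2, 3.9, 4.6)]
L_good = likelihood(2.0, data, sigma=0.1)
L_poor = likelihood(3.0, data, sigma=0.1)
```

A right-censored output such as (N, ∞) fits the same pattern by passing `d=math.inf`, since the upper CDF value is then exactly 1.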
4.3.8.3 Partially Characterized Data for Calibration
Consider the calibration problem with the model y = G(x;θ). Typically, in an experiment, the value of the independent variable (input) is selected, the experiment is performed, and the corresponding measurement of the dependent variable (output) is used for calibration; such measurements are well-characterized. Sometimes, it may not be possible to conduct experiments in such a way that the input and the output measurements have one-to-one correspondence. In other words, the input measurements are conducted independently of the output measurements; such measurements are referred to as “partially characterized” or “uncharacterized” in this dissertation. Further, each of the measurements (input and/or output) may be point-valued or an interval. How should the likelihood be constructed for this case?
Consider m point data xi (i = 1 to m) and n intervals [ai, bi] (i = 1 to n), available for a particular input x; note that the vector of inputs is not considered here. Since there is no one-to-one correspondence between the input and output measurements, all of the input measurements can be aggregated.
From a frequentist point of view, one possible approach is to construct a composite PDF, as:
\[ f_X(x) = \frac{1}{m+n} \left( \sum_{i=1}^{m} \delta(x - x_i) + \sum_{i=1}^{n} U_X(a_i, b_i) \right) \tag{4.20} \]
In Eq. 4.20, δ(·) refers to the Dirac delta function, and UX(ai, bi) refers to the PDF of a uniform distribution defined on the interval [ai, bi], as shown in Eq. 4.21.
\[ U_X(a_i, b_i) = \begin{cases} \dfrac{1}{b_i - a_i} & \text{if } a_i \le x \le b_i \\ 0 & \text{otherwise} \end{cases} \tag{4.21} \]
Thus, each point datum is represented as a Dirac delta function, and each interval is represented using a uniform distribution. The input PDF fX(x) is expressed as a weighted sum of all these distributions, where each weight is equal to 1/(m+n), assuming that each datum (point or interval) is weighted equally.
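Because Eq. 4.20 is an equally weighted mixture, it is straightforward to draw samples from it, which is often all that a sampling-based propagation method needs. The sketch below is one possible way to do this for a scalar input; the function name `sample_composite` and the example data are hypothetical. Each sample first picks one of the m + n data with equal probability, then returns the point value itself (Dirac delta component) or a uniform draw from the chosen interval.

```python
import numpy as np

def sample_composite(points, intervals, size, rng):
    # Draw samples from the composite PDF of Eq. 4.20 for a scalar input x.
    points = np.asarray(points, dtype=float)
    intervals = np.asarray(intervals, dtype=float).reshape(-1, 2)
    m, n = len(points), len(intervals)

    # Each of the m + n data carries the same weight 1 / (m + n).
    idx = rng.integers(0, m + n, size=size)
    is_point = idx < m

    samples = np.empty(size)
    # Dirac delta components: the sample equals the point datum itself.
    samples[is_point] = points[idx[is_point]]
    # Uniform components (Eq. 4.21): draw uniformly within the chosen interval.
    k = idx[~is_point] - m
    lo, hi = intervals[k, 0], intervals[k, 1]
    samples[~is_point] = lo + (hi - lo) * rng.random(k.size)
    return samples

# Illustrative usage with made-up data: two point data and one interval.
rng = np.random.default_rng(1)
x_samples = sample_composite([1.0, 2.0], [(3.0, 5.0)], size=200_000, rng=rng)
```

For this example the mean of the composite PDF is (1 + 2 + 4)/3, so the sample mean should be close to 2.33.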
Alternatively, from a subjectivist point of view, the methods for data uncertainty quantification developed in Chapter III can be used to construct the PDF fX(x) for the input x; the parametric methods in Sections 3.4 and 3.6 or the non-parametric method in Section 3.7 can be used for this purpose.
The above procedure for the calculation of fX(x) is repeated for every uncharacterized input variable, and the joint PDF of the inputs is again denoted as fX(x), with x now read as the vector of inputs.
This PDF can be used to compute the model prediction as a function of the parameter θ, using the uncertainty propagation methods discussed in Section 2.5. Let fY(y|θ) denote the PDF of the corresponding model prediction; note that this is computed as a function of θ, in order to facilitate the construction of the likelihood function L(θ). This likelihood is constructed using the available output data.
At the output level, consider p point data yi (i = 1 to p) and q intervals [ci, di] (i = 1 to q). Similar to the previous sections, the likelihood is calculated using the PDF value for point data and CDF values for interval data as:
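\[ L(\theta) \propto \left[ \prod_{i=1}^{p} \int f(z = y_i \mid y)\, f_Y(y \mid \theta)\, dy \right] \times \left[ \prod_{j=1}^{q} \int \left( \int_{z=c_j}^{z=d_j} f(z \mid y)\, dz \right) f_Y(y \mid \theta)\, dy \right] \tag{4.22} \]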
In Eq. 4.22, z is simply used as a dummy variable, and f(z|y) is calculated similar to Eq. 4.6, as:
\[ f(z \mid y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(z - y)^2}{2\sigma^2} \right) \tag{4.23} \]
As in the previous sections, the likelihood function can be used in Bayesian inference in order to compute the posterior PDF of θ.
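The likelihood in Eq. 4.22 can be approximated by replacing each integral over fY(y|θ) with an average over propagated samples. The following is a rough sketch under assumptions not taken from the dissertation: `model` is a hypothetical stand-in for G(x;θ), simple Monte Carlo sampling stands in for the propagation methods of Section 2.5, the input samples come from a uniform distribution chosen only for the example, and σ is treated as known.

```python
import math
import numpy as np

# Vectorized error function, so that the Gaussian CDF accepts sample arrays.
_erf = np.vectorize(math.erf)

def normal_pdf(z, mean, sigma):
    # Eq. 4.23: Gaussian density f(z | y) with y passed as `mean`.
    return np.exp(-0.5 * ((z - mean) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def normal_cdf(z):
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))

def model(x, theta):
    # Hypothetical stand-in for G(x; theta).
    return theta * x

def likelihood(theta, x_samples, y_points, y_intervals, sigma):
    # Samples of the model prediction, i.e. samples from f_Y(y | theta).
    y = model(np.asarray(x_samples, dtype=float), theta)
    value = 1.0
    # First bracket of Eq. 4.22: PDF value at each output point datum.
    for y_i in y_points:
        value *= np.mean(normal_pdf(y_i, y, sigma))
    # Second bracket of Eq. 4.22: CDF difference over each output interval.
    for c_j, d_j in y_intervals:
        value *= np.mean(normal_cdf((d_j - y) / sigma) - normal_cdf((c_j - y) / sigma))
    return float(value)

# Illustrative usage with made-up, unpaired input and output data.
rng = np.random.default_rng(2)
x_samples = rng.uniform(0.9, 1.1, size=5000)      # samples of the input PDF
y_points = [2.0, 2.1]
y_intervals = [(1.8, 2.2)]
L_a = likelihood(2.0, x_samples, y_points, y_intervals, sigma=0.1)
L_b = likelihood(3.0, x_samples, y_points, y_intervals, sigma=0.1)
```

The same input samples are reused for every θ, so only the model evaluations need to be repeated when the likelihood is evaluated at a new parameter value.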
4.3.8.4 Calibration under Uncertainty: Synopsis
Conventionally, model calibration has considered paired input-output measurements for calibration. In this dissertation, several scenarios for model calibration are considered:
1. Additional sources of uncertainty
2. Interval data for calibration
3. Uncharacterized input-output data
The methods proposed to address the above situations can be extended to situations where unpaired data, interval data, and other sources of uncertainty are all simultaneously present. A numerical example is presented in Section 4.3.10 to illustrate the proposed methods. Further, the topic of model calibration under uncertainty is revisited in Chapter IX.