Kinetics and Ideal Reactor Models
3.5 NOTES ON METHODOLOGY FOR PARAMETER ESTIMATION
Generally, the primary objective of parameter estimation is to generate estimates of rate parameters that accurately predict the experimental data. Therefore, once estimates of the parameters are obtained, it is essential that these parameters be used to predict (recalculate) the experimental data. Comparison of the predicted and experimental data (whether in graphical or tabular form) allows the “goodness of fit” to be assessed. Furthermore, it is a general premise that differences between predicted and experimental concentrations be randomly distributed. If the differences do not appear to be random, this suggests that the assumed rate law is incorrect, or that some other feature of the system has been overlooked.
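At this stage, we consider a reaction of the form of (A) in Section 3.1.2:

$$|\nu_A|\,\mathrm{A} + |\nu_B|\,\mathrm{B} + |\nu_C|\,\mathrm{C} \rightarrow \text{products} \tag{A}$$

and assume that the rate law is of the form of equations 3.1-2 and 3.1-8 combined:

$$(-r_A) = k_A c_A^{\alpha} c_B^{\beta} \cdots = A \exp(-E_A/RT)\, c_A^{\alpha} c_B^{\beta} \cdots \tag{3.4-17}$$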
(In subsequent chapters, we may have to consider forms other than this straightforward power-law form; the effects of T and composition may not be separable, and, for complex systems, two or more rate laws are simultaneously involved. Nevertheless, the same general approaches described here apply.)
Equation 3.4-17 includes three (or more) rate parameters in the first part ($k_A, \alpha, \beta, \ldots$) and four (or more) in the second part ($A, E_A, \alpha, \beta, \ldots$). The former applies to data obtained at one T, and the latter to data obtained at more than one T. We assume that none of these parameters is known a priori.
In general, parameter estimation by statistical methods from experimental data in which the number of measurements exceeds the number of parameters falls into one of two categories, depending on whether the function to be fitted to the data is linear or nonlinear with respect to the parameters. A function is linear with respect to the parameters if, for example, doubling the values of all the parameters doubles the value of the function; otherwise, it is nonlinear. The right side of equation 3.4-17 is nonlinear. We can put it into linear form by taking logarithms of both sides, as in equation 3.4-4:
$$\ln(-r_A) = \ln A - (E_A/RT) + \alpha \ln c_A + \beta \ln c_B + \cdots \tag{3.4-18}$$
The function is now $\ln(-r_A)$, and the parameters are $\ln A, E_A, \alpha, \beta, \ldots$.
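As an illustration of linear regression on the logarithmic form 3.4-18, the following is a minimal sketch for a single reactant. It rests on several assumptions that are not in the text: the "true" values of $A$, $E_A$, and $\alpha$, the temperatures, and the concentrations are hypothetical, and the rates are noise-free synthetic values generated from equation 3.4-17. The helper `solve_linear` is written here for self-containment; it is not a library routine.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1

# Hypothetical "true" parameters, used only to generate noise-free synthetic rates.
A_TRUE, EA_TRUE, ALPHA_TRUE = 2.0e6, 50_000.0, 1.5
temps = [300.0, 320.0, 340.0]   # K (assumed)
concs = [0.2, 0.5, 1.0]         # c_A, arbitrary units (assumed)

def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

# Regressors [1, 1000/T, ln c_A]; response ln(-r_A).
# 1000/T is used instead of 1/T only to keep the columns similar in magnitude.
X, y = [], []
for T in temps:
    for c in concs:
        rate = A_TRUE * math.exp(-EA_TRUE / (R * T)) * c ** ALPHA_TRUE
        X.append([1.0, 1000.0 / T, math.log(c)])
        y.append(math.log(rate))

# Normal equations: (X^T X) b = X^T y
p = 3
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
b0, b1, b2 = solve_linear(XtX, Xty)

A_fit = math.exp(b0)          # intercept is ln A
Ea_fit = -1000.0 * R * b1     # slope on 1000/T is -E_A/(1000 R)
alpha_fit = b2                # slope on ln c_A is the order alpha
print(A_fit, Ea_fit, alpha_fit)
```

Because the synthetic rates contain no noise, the fit should return the assumed parameters to within rounding error; with real data, the estimates would also reflect how the logarithmic transformation reshapes the measurement errors.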
Statistical methods can be applied to obtain values of parameters in both linear and nonlinear forms (i.e., by linear and nonlinear regression, respectively). Linearity with respect to the parameters should be distinguished from, and need not necessarily be associated with, linearity with respect to the variables:
(1) In equation 3.4-17, the right side is nonlinear with respect to both the parameters ($A, E_A, \alpha, \beta, \ldots$) and the variables ($T, c_A, c_B, \ldots$).
(2) In equation 3.4-18, the right side is linear with respect to both the parameters and the variables, if the variables are interpreted as $1/T$, $\ln c_A$, $\ln c_B$, .... However, the transformation of the function from a nonlinear to a linear form may result in a poorer fit. For example, in the Arrhenius equation, it is usually better to estimate $A$ and $E_A$ by nonlinear regression applied to $k = A \exp(-E_A/RT)$, equation 3.1-8, than by linear regression applied to $\ln k = \ln A - E_A/RT$, equation 3.1-7. This is because the linearization is statistically valid only if the experimental data are subject to constant relative errors (i.e., measurements are subject to fixed percentage errors); if, as is more often the case, constant absolute errors are observed, linearization misrepresents the error distribution, and leads to incorrect parameter estimates.
(3) The function $y = a + bx + cx^2 + dx^3$ is linear with respect to the parameters a, b, c, d (which may be determined by linear regression), but not with respect to the variable x.
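The error argument in item (2) can be made explicit with a first-order propagation-of-error sketch. The symbols $k_{\mathrm{obs}}$, $\varepsilon$, and $\sigma$ are introduced here for illustration only and are not part of the original notation; the measurement error $\varepsilon$ is assumed small relative to $k$.

```latex
\begin{align*}
k_{\mathrm{obs}} &= k + \varepsilon, \qquad \operatorname{Var}(\varepsilon) = \sigma^2 \\
\ln k_{\mathrm{obs}} &= \ln k + \ln\!\left(1 + \frac{\varepsilon}{k}\right) \approx \ln k + \frac{\varepsilon}{k} \\
\operatorname{Var}(\ln k_{\mathrm{obs}}) &\approx \frac{\sigma^2}{k^2}
\end{align*}
```

With constant absolute error ($\sigma$ independent of T), the scatter in $\ln k$ grows as $k$ falls, so an unweighted straight-line fit of $\ln k$ against $1/T$ gives the low-temperature points more influence than their precision warrants. With constant relative error ($\sigma/k$ constant), the variance of $\ln k$ is uniform, and the unweighted linear fit is consistent with the error structure.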
The reaction orders obtained from nonlinear analysis are usually nonintegers. It is customary to round the values to nearest integers, half-integers, tenths of integers, etc., as may be appropriate. The regression is then repeated with the order(s) specified to obtain a revised value of the rate constant, or revised values of the Arrhenius parameters.
A number of statistics and spreadsheet software packages are available for linear regression, and also for nonlinear regression of algebraic expressions (e.g., the Arrhenius equation). However, few software packages are designed for parameter estimation involving numerical integration of a differential equation containing the parameters (e.g., equation 3.4-8). The E-Z Solve software is one package that can carry out this more difficult type of nonlinear regression.
EXAMPLE 3-8
Estimate the rate constant for the reaction A → products, given the following data for reaction in a constant-volume BR:

t/arb. units        0     1     2     3     4     6     8
$c_A$/arb. units    1     0.95  0.91  0.87  0.83  0.76  0.72

Assume that the reaction follows either first-order or second-order kinetics.
SOLUTION
This problem may be solved by linear regression using equations 3.4-11 (n = 1) and 3.4-9 (with n = 2), which correspond to the relationships developed for first-order and second-order kinetics, respectively. However, here we illustrate the use of nonlinear regression applied directly to the differential equation 3.4-8 so as to avoid use of particular linearized integrated forms. The method employs user-defined functions within the E-Z Solve software. The rate constants estimated for the first-order and second-order cases are 0.0441 and 0.0504 (in appropriate units), respectively (file ex3-8.msp shows how this is done in E-Z Solve). As indicated in Figure 3.9, there is little difference between the experimental data and the predictions from either the first- or second-order rate expression. This lack of sensitivity to reaction order is common when $f_A < 0.5$ (here, $f_A = 0.28$).
Although we cannot clearly determine the reaction order from Figure 3.9, we can gain some insight from a residual plot, which depicts the difference between the predicted and experimental values of $c_A$ using the rate constants calculated from the regression analysis. Figure 3.10 shows a random distribution of residuals for a second-order reaction, but a nonrandom distribution of residuals for a first-order reaction (consistent overprediction of concentration for the first five data points). Consequently, based upon this analysis, it is apparent that the reaction is second-order rather than first-order, and the reaction rate constant is 0.050. Furthermore, the sum of squared residuals is much smaller for second-order kinetics than for first-order kinetics ($1.28 \times 10^{-4}$ versus $5.39 \times 10^{-4}$).
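The fits and residuals discussed in this example can be approximated without E-Z Solve. The sketch below is not the method in file ex3-8.msp: instead of regressing on the differential equation, it assumes the analytical integrated forms for first- and second-order decay, fixes $c_{A0}$ at the measured value at t = 0, and minimizes the sum of squared residuals in $c_A$ with a hand-written golden-section search. The function names are illustrative.

```python
import math

# Data from the example (arbitrary units).
t_data = [0, 1, 2, 3, 4, 6, 8]
c_data = [1, 0.95, 0.91, 0.87, 0.83, 0.76, 0.72]
C0 = c_data[0]  # assumption: initial concentration fixed at the t = 0 measurement

def first_order(k, t):
    return C0 * math.exp(-k * t)

def second_order(k, t):
    return C0 / (1.0 + k * C0 * t)

def ssr(model, k):
    """Sum of squared residuals (predicted minus experimental c_A)."""
    return sum((model(k, t) - c) ** 2 for t, c in zip(t_data, c_data))

def golden_min(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimizer of a unimodal f on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    x1, x2 = b - g * (b - a), a + g * (b - a)
    f1, f2 = f(x1), f(x2)
    while b - a > tol:
        if f1 < f2:
            b, x2, f2 = x2, x1, f1
            x1 = b - g * (b - a)
            f1 = f(x1)
        else:
            a, x1, f1 = x1, x2, f2
            x2 = a + g * (b - a)
            f2 = f(x2)
    return 0.5 * (a + b)

results = {}
for name, model in [("first", first_order), ("second", second_order)]:
    k = golden_min(lambda kk: ssr(model, kk), 0.0, 0.5)
    residuals = [model(k, t) - c for t, c in zip(t_data, c_data)]
    results[name] = (k, ssr(model, k), residuals)
    print(name, round(k, 4), f"{ssr(model, k):.2e}",
          [f"{r:+.4f}" for r in residuals])
```

Run as written, this should give rate constants and sums of squared residuals close to the values quoted in the example, and the printed first-order residuals should show the run of positive values (overprediction) followed by a negative value at the last point.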
Figure 3.9 Comparison of first- and second-order fits of data in Example 3-8 (time in arbitrary units on the horizontal axis; curves for 1st order and 2nd order)

We summarize some guidelines for choice of regression method in the chart in Figure 3.11. The initial focus is on the type of reactor used to generate the experimental data (for a simple system and rate law considered in this section). Then the choice depends on determining whether the expression being fitted is linear or nonlinear (with respect to the parameters), and, in the case of a BR or integral PFR, on whether an analytical solution to the differential equation involved is available. The equations cited by number are in some cases only representative of the type of equation encountered.
In Figure 3.11, we exclude the use of differential methods with a BR, as described in Section 3.4.1.1.1. This is because such methods require differentiation of experimental $c_i(t)$ data, either graphically or numerically, and differentiation, as opposed to integration, of data can magnify the errors.
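The magnification of errors by differentiation can be seen with the data of Example 3-8. The sketch below forms forward-difference estimates of $(-r_A)$ and, assuming second-order kinetics, an apparent rate constant for each interval. The uncertainty `DELTA` of 0.005 in each concentration (half the last reported digit) is an assumption made here for illustration, not a value given in the text.

```python
# Data from Example 3-8 (arbitrary units).
t_data = [0, 1, 2, 3, 4, 6, 8]
c_data = [1, 0.95, 0.91, 0.87, 0.83, 0.76, 0.72]

DELTA = 0.005  # assumed uncertainty in each c_A value (half the last reported digit)

rates, rel_errs, k_apps = [], [], []
for i in range(len(t_data) - 1):
    dt = t_data[i + 1] - t_data[i]
    rate = -(c_data[i + 1] - c_data[i]) / dt     # forward-difference estimate of (-r_A)
    c_mid = 0.5 * (c_data[i] + c_data[i + 1])    # concentration at the interval midpoint
    rate_err = 2 * DELTA / dt                    # worst-case error of the difference quotient
    rates.append(rate)
    rel_errs.append(rate_err / rate)
    k_apps.append(rate / c_mid ** 2)             # apparent second-order rate constant
    print(f"t = {t_data[i]}-{t_data[i + 1]}: rate = {rate:.4f} "
          f"(+/- {100 * rate_err / rate:.0f}%), k_app = {rate / c_mid ** 2:.4f}")
```

Under the stated assumption, an uncertainty of about 0.5% in each concentration becomes a worst-case uncertainty of well over 10% in each estimated rate, and the apparent rate constants scatter widely from interval to interval, whereas the integral fit uses all of the data to produce a single estimate.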
Figure 3.10 Comparison of residual values, $c_{A,\mathrm{calc}} - c_{A,\mathrm{exp}}$, for first- and second-order fits of data in Example 3-8 (time in arbitrary units on the horizontal axis)