5.9 Use of regression lines for comparing analytical methods
130 5: Calibration methods in instrumental analysis: regression and correlation
Although it is an elegant approach to the common problem of matrix interference effects, the method of standard additions has a number of disadvantages. The principal one is that each test sample requires its own calibration graph, in contrast to conventional calibration experiments, where one graph can provide concentration values for many test samples. As well as requiring more effort for this reason, the standard additions method may also use larger quantities of sample than other methods. Moreover, the interference in the measurements may be of a different type, i.e. when another component of the matrix affects the analyte signal by a fixed amount at all analyte concentrations. These so-called translational or baseline interferences (which may occur alongside other rotational interferences) do not affect the slope of a calibration graph, but simply shift the whole graph in the y-direction. In such cases the standard additions approach described above provides no correction (though more advanced methods do). Baseline interferences must usually be corrected quite separately, for example by comparing the results with those obtained using an entirely different method.
In statistical terms the standard additions approach is an extrapolation method, and although it is difficult to formulate a model that would allow exact comparisons it seems that it should in principle be less precise than interpolation techniques. In practice, the loss of precision is not very serious.
Figure 5.10 Use of a regression line to compare two analytical methods (panels (a)–(f); axes: Method A and Method B): (a) shows perfect agreement between the two methods for all the samples; (b)–(f) illustrate the results of various types of systematic error (see text).
non-zero intercept. That is, one method of analysis may yield a result higher or lower than the other by a fixed amount. Such an error might occur if the background signal for one of the methods was wrongly calculated (Fig. 5.10b). A second possibility is that the slope of the regression line is greater than 1 or less than 1, indicating that a systematic error may be occurring in the slope of one of the individual calibration plots (Fig. 5.10c). These two errors may occur simultaneously (Fig. 5.10d). Further possible types of systematic error are revealed if the plot is curved (Fig. 5.10e). Speciation problems may give surprising results (Fig. 5.10f). This last type of plot might arise if an analyte occurred in two chemically distinct forms, the proportions of which varied from sample to sample. One of the methods under study (here plotted on the y-axis) might detect only one form of the analyte, while the second method detected both forms.
In practice, the analyst most commonly wishes to test for an intercept differing significantly from zero, and a slope differing significantly from 1. Such tests are performed by determining the confidence limits for a and b, generally at the 95% confidence level. The calculation is very similar to that described in Section 5.5, and is most simply performed by using a program such as Excel®. This spreadsheet is applied to the following example.
Example 5.9.1
The level of phytic acid in 20 urine samples was determined by a new catalytic fluorimetric (CF) method, and the results were compared with those obtained using an established extraction photometric (EP) technique. The following data were obtained (all the results, in mg l⁻¹, are means of triplicate measurements).

Sample number   CF result   EP result
1               1.87        1.98
2               2.20        2.31
3               3.15        3.29
4               3.42        3.56
5               1.10        1.23
6               1.41        1.57
7               1.84        2.05
8               0.68        0.66
9               0.27        0.31
10              2.80        2.82
11              0.14        0.13
12              3.20        3.15
13              2.70        2.72
14              2.43        2.31
15              1.78        1.92
16              1.53        1.56
17              0.84        0.94
18              2.21        2.27
19              3.10        3.17
20              2.34        2.36
(March, J.G., Simonet, B.M. and Grases, F., 1999, Analyst, 124: 897)
This set of data shows why it is inappropriate to use the paired t-test, which evaluates the differences between the pairs of results, in such cases (Section 3.3).
The range of phytic acid concentrations (ca. 0.14–3.50 mg l⁻¹) in the urine samples is so large that a fixed discrepancy between the two methods will be of varying significance at different concentrations. Thus a difference between the two techniques of 0.05 mg l⁻¹ would not be of great concern at a level of ca. 3.50 mg l⁻¹, but would be more disturbing at the lower end of the concentration range.
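The point about a fixed discrepancy can be made concrete with a little arithmetic. The sketch below uses only the 0.05 mg l⁻¹ difference and the two ends of the concentration range quoted above; the helper name `relative_discrepancy` is hypothetical and introduced purely for illustration.

```python
def relative_discrepancy(difference, concentration):
    """Return a fixed difference as a percentage of the measured concentration."""
    return 100.0 * difference / concentration


# Fixed discrepancy between the two methods, in mg/l (value taken from the text).
fixed_difference = 0.05

# Concentrations near the two ends of the range quoted in the text, in mg/l.
for concentration in (0.14, 3.50):
    percent = relative_discrepancy(fixed_difference, concentration)
    print(f"{concentration:.2f} mg/l: discrepancy is {percent:.1f}% of the level")
```

The same absolute difference is a little over 1% of the result at the top of the range but more than a third of the result at the bottom, which is why a single test on the paired differences is hard to interpret here.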
Table 5.1 shows the summary output of the Excel® spreadsheet used to calculate the regression line for the above data. The CF data have been plotted on the y-axis, and the EP results on the x-axis (see below). The output shows that the r-value (called ‘Multiple R’ by this program because of its potential application to multiple regression methods) is 0.9966. The intercept is −0.0497, with lower and upper 95% confidence limits of −0.1399 and +0.0404: this range includes the ideal value of zero. The slope of the graph, called ‘X variable 1’ because b is the coefficient of the x-term in Eq. (5.2.1), is 0.9924, with a 95% confidence interval of 0.9521–1.0328: again this range includes the model value, in this case 1.0. (The remaining output data are not needed in this example, and are discussed further in Section 5.11.) Figure 5.11 shows the regression line with the characteristics summarised above.
Figure 5.11 Comparison of two analytical methods: data from Example 5.9.1 (y-axis: CF results; x-axis: EP results; both axes run from 0 to 4.0).
Table 5.1 Excel output for Example 5.9.1

Regression statistics
Multiple R          0.9966
R square            0.9933
Adjusted R square   0.9929
Standard error      0.0829
Observations        20

ANOVA
             df   SS       MS       F          Significance F
Regression   1    18.341   18.341   2670.439   5.02965E-21
Residual     18   0.124    0.007
Total        19   18.465

               Coefficients   Standard error   t stat   P-value    Lower 95%   Upper 95%
Intercept      -0.0497        0.0429           -1.158   0.262      -0.1399     0.0404
X variable 1   0.9924         0.0192           51.676   5.03E-21   0.9521      1.0328
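For readers without a spreadsheet to hand, the slope, intercept and their 95% confidence limits can be recomputed from the data of Example 5.9.1. The following is a minimal sketch using the usual unweighted regression formulae for y on x. The function name `regress_with_limits` is hypothetical, and the critical t-value for 18 degrees of freedom (2.101) is typed in from standard tables rather than calculated.

```python
import math

# EP (x-axis) and CF (y-axis) results from Example 5.9.1, in mg/l.
ep = [1.98, 2.31, 3.29, 3.56, 1.23, 1.57, 2.05, 0.66, 0.31, 2.82,
      0.13, 3.15, 2.72, 2.31, 1.92, 1.56, 0.94, 2.27, 3.17, 2.36]
cf = [1.87, 2.20, 3.15, 3.42, 1.10, 1.41, 1.84, 0.68, 0.27, 2.80,
      0.14, 3.20, 2.70, 2.43, 1.78, 1.53, 0.84, 2.21, 3.10, 2.34]


def regress_with_limits(x, y, t_crit):
    """Unweighted regression of y on x, with confidence limits for a and b."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = y_mean - b * x_mean            # intercept
    # Residual standard deviation, based on (n - 2) degrees of freedom.
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(ss_res / (n - 2))
    # Standard errors of the slope and the intercept.
    s_b = s_yx / math.sqrt(sxx)
    s_a = s_yx * math.sqrt(sum(xi ** 2 for xi in x) / (n * sxx))
    return {
        "intercept": a,
        "slope": b,
        "intercept_limits": (a - t_crit * s_a, a + t_crit * s_a),
        "slope_limits": (b - t_crit * s_b, b + t_crit * s_b),
    }


# Two-tailed 95% critical value of t for n - 2 = 18 degrees of freedom (tables).
T_CRIT_18 = 2.101

fit = regress_with_limits(ep, cf, T_CRIT_18)
lo_a, hi_a = fit["intercept_limits"]
lo_b, hi_b = fit["slope_limits"]
print(f"intercept a = {fit['intercept']:.4f}, 95% limits {lo_a:.4f} to {hi_a:.4f}")
print(f"slope     b = {fit['slope']:.4f}, 95% limits {lo_b:.4f} to {hi_b:.4f}")
print("interval for a includes 0:", lo_a <= 0.0 <= hi_a)
print("interval for b includes 1:", lo_b <= 1.0 <= hi_b)
```

The two final lines express the tests of interest directly: no systematic error is indicated when the interval for a includes 0 and the interval for b includes 1.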
Two further points are important in connection with this example. First, the literature of analytical chemistry shows that authors frequently place great stress on the value of the correlation coefficient in such comparative studies. In the above example, however, this coefficient played no direct role in establishing whether or not systematic errors had occurred. Even if the regression line had been slightly curved, the correlation coefficient might still have been close to 1 (cf. Section 5.3 above).
This means that the calculation of r is less important in the present context than the establishment of confidence limits for the slope and the intercept. In some cases it may be found that the r-value is not very close to 1, even though the slope and the intercept are not significantly different from 1 and 0 respectively. Such a result would suggest very poor precision for either one or both of the methods under study. The precisions of the two methods can be determined and compared using the methods of Chapters 2 and 3. In practice it is desirable that this should be done before the regression line comparing the methods is plotted – the reason for this is explained below. The second point to note is that it is desirable to compare the methods over the full range of concentrations, as in the example given where the urine samples examined contained phytic acid concentrations that covered the range of interest fairly uniformly.
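The remark that a slightly curved plot can still give a correlation coefficient close to 1 is easy to check numerically. The sketch below uses invented, noise-free data with a small quadratic term (the coefficient 0.05 is an arbitrary assumption, not taken from any real method comparison) and computes the product-moment correlation coefficient from its definition.

```python
import math


def correlation(x, y):
    """Product-moment correlation coefficient r, computed from its definition."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical results: nine samples spanning 0 to 4.0 on the x-axis method.
x = [0.5 * k for k in range(9)]

# The y-axis method is assumed to show slight upward curvature relative to x.
y = [xi + 0.05 * xi ** 2 for xi in x]

r = correlation(x, y)
print(f"r = {r:.4f}")
```

Although the relationship is systematically curved, r remains above 0.99 for these values, so a high r on its own gives no assurance that systematic errors are absent.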
Although very widely adopted in comparative studies of instrumental methods, the approach described here is open to some theoretical objections. First, as has been emphasised throughout this chapter, the line of regression of y on x is calculated on the assumption that the errors in the x-values are negligible – all errors are assumed to occur in the y-direction. While generally valid in a calibration plot for a single analyte, this assumption is evidently not justified when the regression line is used for comparison purposes: it is certain that random errors will occur in both analytical methods, i.e. in both the x- and y-directions. Thus the equations used to calculate the regression line are not really valid for this application. However, the regression method is still widely used, as the graphs obtained provide valuable information on the nature of any differences between the methods (Fig. 5.10). Simulations show, moreover, that the approach does give surprisingly acceptable results, provided that the more precise method is plotted on the x-axis (this is why we investigate the precisions of the two methods – see above), and that a reasonable number of points (ca. ten at least) uniformly covering the concentration range of interest is used. Since the confidence limit calculations are based on (n – 2) degrees of freedom, it is particularly important to avoid small values of n.
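A small simulation illustrates why the more precise method belongs on the x-axis. The sketch below is not the simulation work referred to above; it is an illustration under invented assumptions: 14 samples with true concentrations evenly spaced from 0.25 to 3.50, two methods with no systematic error (true slope 1), and standard deviations of 0.05 and 0.5 chosen arbitrarily to exaggerate the effect. The names `ols_slope` and `mean_slope` are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope of the unweighted regression line of y on x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return sxy / sxx


def mean_slope(sd_x, sd_y, runs=2000, seed=1):
    """Average fitted slope over many simulated method comparisons."""
    rng = random.Random(seed)
    # Assumed true concentrations: 0.25, 0.50, ..., 3.50 (14 samples).
    true_values = [0.25 * k for k in range(1, 15)]
    total = 0.0
    for _ in range(runs):
        # Both methods measure the same samples, each with its own random error.
        x = [t + rng.gauss(0.0, sd_x) for t in true_values]
        y = [t + rng.gauss(0.0, sd_y) for t in true_values]
        total += ols_slope(x, y)
    return total / runs


precise_on_x = mean_slope(sd_x=0.05, sd_y=0.5)
imprecise_on_x = mean_slope(sd_x=0.5, sd_y=0.05)
print(f"mean slope, precise method on x-axis:   {precise_on_x:.3f}")
print(f"mean slope, imprecise method on x-axis: {imprecise_on_x:.3f}")
```

With these assumed noise levels the average slope stays close to 1 when the precise method is on the x-axis, but falls well below 1 when the imprecise method is placed there, even though neither method has any systematic error.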
A second objection to using the line of regression of y on x, as calculated in Sections 5.4 and 5.5, in the comparison of two analytical methods is that it also assumes that the error in the y-values is constant. Such data are said to be homoscedastic. As previously noted, this means that all the points have equal weight when the slope and intercept of the line are calculated. This assumption is obviously likely to be invalid in practice. In many analyses, the data are heteroscedastic, i.e. the standard deviation of the y-values increases with the concentration of the analyte, rather than having the same value at all concentrations (see below). This objection to the use of unweighted regression lines also applies to calibration plots for a single analytical procedure. In principle weighted regression lines should be used instead, as shown in the next section.
Both these problems are overcome by the use of a technique known as functional relationship estimation by maximum likelihood (FREML). This calculation can be applied both to the comparison of analytical methods as discussed in this section, and to conventional calibration graphs in instrumental analyses if the assumption that the x-direction errors are negligible compared with the y-direction ones is not justifiable. The latter situation can arise in at least two ways. If solid reference materials are used for the calibration standards, they are quite likely to be heterogeneous, so x-direction errors may be significant. In other cases, the opposite situation occurs, i.e. the y-direction errors are so small that they become comparable with the x-direction ones. Such data may occur in the use of highly automated flow analysis methods, such as flow injection analysis or high-performance liquid chromatography. The FREML method provides for both y- and x-direction errors. It assumes that the errors in the two directions are normally distributed, but that they may have unequal variances, and it minimises the sums of the squares of both the x- and y-residuals, divided by the corresponding variances. This result cannot be obtained by the simple calculation used when only y-direction errors are considered, but requires an iterative method, which can be implemented using, for example, a macro for Minitab® (see Bibliography).
The method is reversible (i.e. in a method comparison it does not matter which method is plotted on the x-axis and which on the y-axis) and can also be used in weighted regression calculations (see Section 5.10).
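The reversibility property can be illustrated with a much simpler errors-in-both-variables fit. The sketch below is not the iterative FREML calculation: it swaps in Deming regression, a closed-form line that assumes the error variances are the same for every point and that their ratio is known. The ratio of 1.0 applied to the data of Example 5.9.1 is an assumption made purely for illustration, and the function name `deming_fit` is hypothetical.

```python
import math

# EP and CF results from Example 5.9.1, in mg/l.
ep = [1.98, 2.31, 3.29, 3.56, 1.23, 1.57, 2.05, 0.66, 0.31, 2.82,
      0.13, 3.15, 2.72, 2.31, 1.92, 1.56, 0.94, 2.27, 3.17, 2.36]
cf = [1.87, 2.20, 3.15, 3.42, 1.10, 1.41, 1.84, 0.68, 0.27, 2.80,
      0.14, 3.20, 2.70, 2.43, 1.78, 1.53, 0.84, 2.21, 3.10, 2.34]


def deming_fit(x, y, variance_ratio=1.0):
    """Deming regression; returns (intercept, slope).

    variance_ratio is (variance of y errors) / (variance of x errors),
    assumed known and the same for every point.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    d = syy - variance_ratio * sxx
    slope = (d + math.sqrt(d * d + 4.0 * variance_ratio * sxy ** 2)) / (2.0 * sxy)
    intercept = y_mean - slope * x_mean
    return intercept, slope


# CF on the y-axis and EP on the x-axis, then the axes exchanged.
# Exchanging the axes inverts the variance ratio (1.0 stays 1.0 here).
a_fwd, b_fwd = deming_fit(ep, cf, variance_ratio=1.0)
a_rev, b_rev = deming_fit(cf, ep, variance_ratio=1.0)

print(f"slope with EP on x-axis:           {b_fwd:.4f}")
print(f"reciprocal of slope with CF on x:  {1.0 / b_rev:.4f}")
```

The two printed slopes describe the same line, so the choice of axis does not matter for a fit of this kind. An unweighted regression of y on x does not share this property, because only the y-residuals are minimised.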