
Method performance studies (collaborative trials)


4.12 Method performance studies (collaborative trials)

94 4: The quality of analytical measurements

When the seven differences for factors A–G have all been calculated in this way, it is easy to identify any factors that have a worryingly large effect on the results. It may be shown that any difference that is more than twice the standard deviation of replicate measurements is significant and should be further studied. This simple set of experiments, technically known as an incomplete factorial design, has the disadvantage that interactions between the factors cannot be detected. This point is further discussed in Chapter 7.
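The screening rule just described (flag any factor whose difference exceeds twice the standard deviation of replicate measurements) can be sketched in a few lines of Python. The factor differences and the replicate standard deviation below are hypothetical values chosen only for illustration, and the function name is likewise an assumption.

```python
def flag_large_effects(differences, s_replicate):
    """Return the factors whose absolute difference exceeds 2 * s_replicate."""
    threshold = 2.0 * s_replicate
    return [name for name, d in differences.items() if abs(d) > threshold]

# Hypothetical differences for factors A-G (not taken from the text).
differences = {"A": 0.12, "B": -0.95, "C": 0.30, "D": 0.05,
               "E": -0.21, "F": 0.72, "G": -0.08}
s_replicate = 0.30  # hypothetical standard deviation of replicate measurements

flagged = flag_large_effects(differences, s_replicate)
print(flagged)
```

With these illustrative numbers the threshold is 0.60, so only factors B and F would be singled out for further study.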

In recent years international bodies have moved towards an agreement on how method performance studies should be conducted. At least eight laboratories (k ≥ 8) should be involved. Since the precision of a method usually depends on the analyte concentration, it should be applied to at least five different levels of analyte in the same sample matrix, with duplicate measurements (n = 2) at each level. A crucial requirement of such a study is that it should distinguish between the repeatability standard deviation, $s_r$, and the reproducibility standard deviation, $s_R$. At each analyte level these are related by the equation:

$$s_R^2 = s_r^2 + s_L^2 \tag{4.12.1}$$

where $s_L^2$ is the variance due to inter-laboratory differences, which reflect different degrees of bias in different laboratories. Note that in this particular context, reproducibility refers to errors arising in different laboratories and equipment, but using the same analytical method: this is a more restricted definition of reproducibility than that used in other instances. As we saw in Section 4.3, one-way ANOVA can be applied (with separate calculations at each concentration level used in the study) to separate the sources of variance in Eq. (4.12.1). However, the proper use of the equation involves two assumptions: (1) that at each concentration level the means obtained in different laboratories are normally distributed; and (2) that at each concentration the repeatability variance among laboratories is equal. Both these assumptions are tested using standard methods before the ANOVA calculations begin. In practice the second assumption, that of homogeneity of variance, is tested first using Cochran’s test. Strictly speaking this test is designed to detect outlying variances rather than testing for homogeneity of variance as a whole, but other more rigorous methods for the latter purpose are also more complex. Cochran’s test calculates C by comparing the square of the largest range, $w_{\max}$ (i.e. the difference between the two results from a single laboratory), with the sum of the squares of such ranges, $w_j$, from all the laboratories (if $n > 2$, variances rather than ranges are compared, but here we assume that each participating laboratory makes just two measurements at each level):

$$C = \frac{w_{\max}^2}{\sum_j w_j^2} \tag{4.12.2}$$

where j takes values from 1 to k, the number of participating laboratories. The value of C obtained is compared with the critical values in Table A.15, and the null hypothesis, i.e. that the largest variance is not an outlier, is rejected if the critical value at the appropriate value of k is exceeded. When the null hypothesis is rejected, the results from the laboratory in question are discarded.
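As an illustration of Eq. (4.12.2), the following Python sketch computes Cochran's C from duplicate results at a single analyte level. The eight pairs of results are hypothetical, the function name is an assumption, and no critical value is hard-coded: it has to be read from Table A.15 for the appropriate k.

```python
def cochran_c_from_duplicates(pairs):
    """Return (C, index of the suspect laboratory) for duplicate results.

    pairs: one (first result, second result) tuple per laboratory.
    C is the largest squared range divided by the sum of all squared ranges.
    """
    squared_ranges = [(a - b) ** 2 for a, b in pairs]
    largest = max(squared_ranges)
    return largest / sum(squared_ranges), squared_ranges.index(largest)

# Hypothetical duplicate results from k = 8 laboratories at one level.
pairs = [(10.1, 10.3), (9.8, 9.9), (10.4, 10.2), (10.0, 10.1),
         (9.7, 10.6), (10.2, 10.2), (10.3, 10.1), (9.9, 10.0)]

c_value, suspect = cochran_c_from_duplicates(pairs)
print(f"C = {c_value:.4f}, largest range from laboratory {suspect + 1}")

# The decision step needs the tabulated critical value for k = 8:
# reject the null hypothesis if c_value exceeds that value.
```

Here the fifth laboratory contributes a range of 0.9 while the others contribute 0.2 or less, so its squared range dominates the sum and C is close to 1.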


The first assumption is then tested using Grubbs’ test (Section 3.7) which is applied first as a test for single outliers, and then (because each laboratory makes duplicate measurements) in a modified form as a test for paired outliers. In both cases all the results from laboratories producing outlying results are again dropped from the trial unless this would result in the loss of too much data. When these outlier tests are complete, the ANOVA calculation can proceed as in Section 4.3.
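Once the outlier tests are complete, the separation of variances in Eq. (4.12.1) by one-way ANOVA can be sketched as follows. This is a minimal illustration and not the book's own procedure: the data are hypothetical, the function name is an assumption, every laboratory is assumed to report the same number of replicates, and a negative estimate of the inter-laboratory variance is simply set to zero.

```python
def variance_components(results):
    """Estimate s_r^2, s_L^2 and s_R^2 at one analyte level.

    results: one list of replicate results per laboratory (equal lengths).
    """
    k = len(results)        # number of laboratories
    n = len(results[0])     # replicates per laboratory
    lab_means = [sum(lab) / n for lab in results]
    grand_mean = sum(lab_means) / k

    # Within-laboratory mean square estimates the repeatability variance.
    ss_within = sum((v - m) ** 2
                    for lab, m in zip(results, lab_means) for v in lab)
    ms_within = ss_within / (k * (n - 1))

    # Between-laboratory mean square estimates s_r^2 + n * s_L^2.
    ms_between = n * sum((m - grand_mean) ** 2 for m in lab_means) / (k - 1)

    s_r2 = ms_within
    s_L2 = max(0.0, (ms_between - ms_within) / n)  # assumption: clip at zero
    return s_r2, s_L2, s_r2 + s_L2

# Hypothetical duplicate results from eight laboratories.
results = [[10.0, 10.2], [10.4, 10.6], [9.8, 10.0], [10.2, 10.4],
           [10.0, 10.2], [10.4, 10.6], [9.8, 10.0], [10.2, 10.4]]

s_r2, s_L2, s_R2 = variance_components(results)
print(f"s_r^2 = {s_r2:.4f}, s_L^2 = {s_L2:.4f}, s_R^2 = {s_R2:.4f}")
```

In a full study the same calculation would be repeated separately at each concentration level.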

In many circumstances it is not possible to carry out a full method performance study as described above, for example when the test materials are not available with a suitable range of analyte concentrations. In such cases a simpler system can be used. This is the Youden matched pairs or two-sample method, in which each participating laboratory is sent two materials of similar composition, X and Y, and asked to make one determination on each. The results are plotted as shown in Fig. 4.12, each point on the plot representing a pair of results from one laboratory. The mean values for the two materials, $\bar{X}$ and $\bar{Y}$, are also determined, and vertical and horizontal lines are drawn through the point ($\bar{X}$, $\bar{Y}$), thus dividing the chart into four quadrants. This plot allows us to assess the occurrence of random errors and bias in the study. If only random errors occur the X and Y determinations may give results which are both too high, both too low, X high and Y low, or X low and Y high. These four outcomes would be equally likely, and the number of points in each of the quadrants would be roughly equal. But if a systematic error occurs in a laboratory, it is likely that its results for both X and Y will be high, or both will be low. So if systematic errors dominate, most of the points will be in the top-right and bottom-left quadrants. This is indeed the result that is obtained in most cases. In the impossible event that random errors were absent, all the results would lie on a line at 45° to the axes of the plot, so when in practice such errors do occur, the perpendicular distance of a point from that line is a measure of the random error of the laboratory represented by that point. Moreover the distance from the intersection of that perpendicular with the 45° line to the point ($\bar{X}$, $\bar{Y}$) measures the systematic error of the laboratory. This fairly simple approach to a method performance study is thus capable of yielding a good deal of information in a simple form. The Youden approach has the further advantages that participating laboratories are not tempted to censor one or more replicate determinations, and that more materials can be studied without large numbers of experiments.

Figure 4.12 A Youden two-sample plot. Axes: Sample X and Sample Y; the reference lines cross at the point ($\bar{X}$, $\bar{Y}$).
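The geometry of the Youden plot can be explored numerically. The sketch below, using the nine pairs of lead results quoted in Example 4.12.1 of this section, counts the points in each quadrant and computes, for each laboratory, the perpendicular distance from the 45° line through ($\bar{X}$, $\bar{Y}$) and the distance along that line to ($\bar{X}$, $\bar{Y}$). The function and variable names are assumptions, and a point lying exactly on a reference line is arbitrarily counted as "low".

```python
import math

# Lead results (ng/g) for samples X and Y from laboratories 1-9 (Example 4.12.1).
x = [35.1, 23.0, 23.8, 25.6, 23.7, 21.0, 23.0, 26.5, 21.4]
y = [33.0, 23.2, 22.3, 24.1, 23.6, 23.1, 21.0, 25.6, 25.0]

def youden_geometry(x, y):
    """Return quadrant counts and (random, systematic) distances per laboratory."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    counts = {"both high": 0, "both low": 0,
              "X high, Y low": 0, "X low, Y high": 0}
    components = []
    for xi, yi in zip(x, y):
        dx, dy = xi - x_bar, yi - y_bar
        if dx > 0 and dy > 0:
            counts["both high"] += 1
        elif dx <= 0 and dy <= 0:
            counts["both low"] += 1
        elif dx > 0:
            counts["X high, Y low"] += 1
        else:
            counts["X low, Y high"] += 1
        # Perpendicular distance from the 45-degree line through the centre.
        random_part = abs(dx - dy) / math.sqrt(2)
        # Distance along that line from the foot of the perpendicular to the centre.
        systematic_part = abs(dx + dy) / math.sqrt(2)
        components.append((random_part, systematic_part))
    return counts, components

counts, components = youden_geometry(x, y)
print(counts)
for lab, (rand, syst) in enumerate(components, start=1):
    print(f"Laboratory {lab}: random {rand:.2f}, systematic {syst:.2f}")
```

For these data seven of the nine points fall in the "both high" or "both low" quadrants, which is the pattern expected when systematic errors dominate.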


Youden plots provide a good deal of information in an immediately accessible form, but we still need methods for calculating the variances $s_R^2$ and $s_r^2$. The following example shows how this can also be done in a simple way.

Example 4.12.1

The lead levels (in ng g⁻¹) in two similar samples (X and Y) of solid milk formulations for infants were determined in nine laboratories (1–9) by graphite-furnace atomic-absorption spectrometry. The results were:

Sample   Laboratory
         1      2      3      4      5      6      7      8      9
X        35.1   23.0   23.8   25.6   23.7   21.0   23.0   26.5   21.4
Y        33.0   23.2   22.3   24.1   23.6   23.1   21.0   25.6   25.0

Evaluate the overall inter-laboratory variation, and its random and systematic components.

In studies of this type there are differences between the samples as well as differences between the laboratories. In the normal way, such a situation would be dealt with by two-way ANOVA (see Section 7.3), and in some cases this is done.

However, in this instance there are only two samples, deliberately chosen to be similar in their analyte content, so there is little interest in evaluating the difference between them. The calculation can therefore be set out in a numerically and conceptually simpler way than a complete two-way ANOVA. We know that the result obtained by each laboratory for sample X may include a systematic error. The same systematic error will presumably be included in that laboratory’s result for the similar sample Y. The difference D (= X − Y) will thus have this error removed, so the spread of the D values will provide an estimate of the random or measurement errors. Similarly, X and Y can be added to give T, the spread of which gives an estimate of the overall variation in the results. The measurement variance is then estimated by:

$$s_r^2 = \frac{\sum_i (D_i - \bar{D})^2}{2(n - 1)} \tag{4.12.3}$$

and the overall variance, $s_R^2$, due to all sources of error, is estimated by:

$$s_R^2 = \frac{\sum_i (T_i - \bar{T})^2}{2(n - 1)} \tag{4.12.4}$$

where the sums run over the n participating laboratories (here n = 9). Notice that each of these equations includes a 2 in the denominator. This is because D and T each give estimates of errors in two sets of results, subtracted and added in D and T respectively. The results of this trial can be expressed in a table as follows:

Laboratory   1      2      3      4      5      6      7      8      9
X            35.1   23.0   23.8   25.6   23.7   21.0   23.0   26.5   21.4
Y            33.0   23.2   22.3   24.1   23.6   23.1   21.0   25.6   25.0
D            2.1    -0.2   1.5    1.5    0.1    -2.1   2.0    0.9    -3.6
T            68.1   46.2   46.1   49.7   47.3   44.1   44.0   52.1   46.4

From the third and fourth rows of the table, $\bar{D} = 0.244$ and $\bar{T} = 49.33$. Equations (4.12.3) and (4.12.4) then give the measurement variance, $s_r^2$, and the overall variance, $s_R^2$, as $(1.383)^2$ and $(5.296)^2$ respectively. These can be compared as usual using the F-test, giving $F = 14.67$. The critical value, $F_{8,8}$, is 3.44 ($P = 0.05$), so the inter-laboratory variation cannot simply be accounted for by random errors.

The component due to bias, $s_L^2$, is given here by

$$s_R^2 = 2s_L^2 + s_r^2 \tag{4.12.5}$$

Note again the appearance of the 2 in Eq. (4.12.5), because two sample materials are studied. Using this equation gives $s_L^2 = (3.615)^2$. The mean of all the measurements is 49.33/2 = 24.665, so the relative standard deviation is (100 × 5.296)/24.665 = 21.47%. This seems to be a high value, but the Horwitz trumpet relationship would predict an even higher value of ca. 28% at this concentration level. It should be noted that possible outliers are not considered in the Youden procedure, so the question of whether we should reject the high results from laboratory 1 does not arise.
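The arithmetic of this example can be checked with a short Python sketch that applies Eqs (4.12.3) to (4.12.5) to the tabulated results. The function names are assumptions. The Horwitz prediction uses the commonly quoted form RSD(%) = 2^(1 − 0.5 log₁₀ c), with c expressed as a mass fraction; that formula is not stated in this section and is included here only as an assumption for comparison with the figure of ca. 28%.

```python
import math

# Lead results (ng/g) for samples X and Y from laboratories 1-9.
x = [35.1, 23.0, 23.8, 25.6, 23.7, 21.0, 23.0, 26.5, 21.4]
y = [33.0, 23.2, 22.3, 24.1, 23.6, 23.1, 21.0, 25.6, 25.0]

def youden_variances(x, y):
    """Return (s_r^2, s_R^2, s_L^2, mean T) from paired results."""
    n = len(x)
    d = [xi - yi for xi, yi in zip(x, y)]   # differences remove the lab bias
    t = [xi + yi for xi, yi in zip(x, y)]   # totals retain all error sources
    d_bar, t_bar = sum(d) / n, sum(t) / n
    s_r2 = sum((di - d_bar) ** 2 for di in d) / (2 * (n - 1))  # Eq. (4.12.3)
    s_R2 = sum((ti - t_bar) ** 2 for ti in t) / (2 * (n - 1))  # Eq. (4.12.4)
    s_L2 = (s_R2 - s_r2) / 2                                   # from Eq. (4.12.5)
    return s_r2, s_R2, s_L2, t_bar

def horwitz_rsd_percent(mass_fraction):
    """Assumed Horwitz form: predicted RSD (%) at a given mass fraction."""
    return 2 ** (1 - 0.5 * math.log10(mass_fraction))

s_r2, s_R2, s_L2, t_bar = youden_variances(x, y)
f_ratio = s_R2 / s_r2
mean_level = t_bar / 2                        # mean of all measurements, ng/g
rsd = 100 * math.sqrt(s_R2) / mean_level
horwitz = horwitz_rsd_percent(mean_level * 1e-9)  # ng/g converted to mass fraction

print(f"s_r = {math.sqrt(s_r2):.3f}, s_R = {math.sqrt(s_R2):.3f}, "
      f"s_L = {math.sqrt(s_L2):.3f}")
print(f"F = {f_ratio:.2f}, RSD = {rsd:.2f}%, Horwitz prediction = {horwitz:.1f}%")
```

Small differences in the last digit relative to the text arise because the text rounds $\bar{T}$ to 49.33 before halving it.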