The Latin Square Design - Cochran’s Theorem

Cochran’s Theorem

4.2 The Latin Square Design

which is also a single-factor analysis. Again, by analogy with Equation 3.5, the reduction in the sum of squares for ﬁtting this model is

withadegrees of freedom. The sum of squares for blocks {j} after ﬁtting and {i} is

withb1 degrees of freedom, which we have given previously as Equation 4.11.

We have developed the sums of squares for treatments, blocks, and error in the randomized complete block design using the general regression signiﬁcance test. Although we would not ordinarily use the general regression signiﬁcance test to actually analyze data in a randomized complete block, the procedure occasionally proves useful in more general randomized block designs, such as those discussed in Section 4.4.

Exact Analysis of the Missing Value Problem. In Section 4.1.3 an approximate procedure for dealing with missing observations in the RCBD was presented. This approximate analysis consists of estimating the missing value so that the error mean square is minimized. It can be shown that the approximate analysis produces a biased mean square for treatments in the sense that E(MS_Treatments) is larger than E(MS_E) if the null hypothesis is true. Consequently, too many significant results are reported.

The missing value problem may be analyzed exactly by using the general regression signiﬁcance test. The missing value causes the design to be unbalanced, and because all the treatments do not occur in all blocks, we say that the treatments and blocks are not orthog- onal. This method of analysis is also used in more general types of randomized block designs; it is discussed further in Section 4.4. Many computer packages will perform this analysis.

We see that both batches of raw material (rows) and operators (columns) are orthogonal to treatments.

The Latin square design is used to eliminate two nuisance sources of variability; that is, it systematically allows blocking in two directions. Thus, the rows and columns actually representtwo restrictions on randomization. In general, a Latin square for pfactors, or a ppLatin square, is a square containing prows and pcolumns. Each of the resulting p²cells contains one of the pletters that corresponds to the treatments, and each letter occurs once and only once in each row and column. Some examples of Latin squares are

44 55 66

A B D C A D B E C A D C E B F B C A D D A C B E B A E C F D C D B A C B E D A C E D F A B D A C B B E A C D D C F B E A E C D A B F B A D C E E F B A D C

Latin squares are closely related to a popular puzzle called a sudoku puzzle that origi- nated in Japan (sudoku means “single number” in Japanese). The puzzle typically consists of a 99 grid, with nine additional 33 blocks contained within. A few of the spaces contain numbers and the others are blank. The goal is to ﬁll the blanks with the integers from 1 to 9 so that each row, each column, and each of the nine 33 blocks making up the grid contains just one of each of the nine integers. The additional constraint that a standard 99 sudoku puzzle have 33 blocks that also contain each of the nine integers reduces the large number of possible 99 Latin squares to a smaller but still quite large number, approximately 610²¹.

Depending on the number of clues and the size of the grid, sudoku puzzles can be extremely difﬁcult to solve. Solving an nnsudoku puzzle belongs to a class of computational problems called NP-complete (the NPrefers to non-polynomial computing time). An NP-complete problem is one for which it’s relatively easy to check whether a particular answer is correct but may require an impossibly long time to solve by any simple algorithm as ngets larger.

Solving a sudoku puzzle is also equivalent to “coloring” a graph—an array of points (vertices) and lines (edges) in a particular way. In this case, the graph has 81 vertices, one for each cell of the grid. Depending on the puzzle, only certain pairs of vertices are joined by an edge. Given that some vertices have already been assigned a “color” (chosen from the nine number possibilities), the problem is to “color” the remaining vertices so that any two vertices joined by an edge don’t have the same “color.”

4.2 The Latin Square Design

159

■ T A B L E 4 . 9

Latin Square Design for the Rocket Propellant Problem Operators Batches of

Raw Material 1 2 3 4 5

1 A24 B20 C19 D24 E24

2 B17 C24 D30 E27 A36

3 C18 D38 E26 A27 B21

4 D26 E31 A26 B23 C22

5 E22 A30 B20 C29 D31

Thestatistical modelfor a Latin square is

(4.27)

whereyijkis the observation in the ith row and kth column for the jth treatment,is the over- all mean, iis the ith row effect,jis the jth treatment effect,kis the kth column effect, and ijkis the random error. Note that this is an effects model. The model is completely additive;

that is, there is no interaction between rows, columns, and treatments. Because there is only one observation in each cell, only two of the three subscripts i,j, and kare needed to denote a particular observation. For example, referring to the rocket propellant problem in Table 4.8, ifi2 and k3, we automatically ﬁnd j4 (formulation D), and if i1 and j3 (for- mulationC), we ﬁnd k3. This is a consequence of each treatment appearing exactly once in each row and column.

The analysis of variance consists of partitioning the total sum of squares of the Np² observations into components for rows, columns, treatments, and error, for example,

(4.28) with respective degrees of freedom

Under the usual assumption that ijkis NID (0,²), each sum of squares on the right-hand side of Equation 4.28 is, upon division by ², an independently distributed chi-square random vari- able. The appropriate statistic for testing for no differences in treatment means is

which is distributed as Fp1,(p2)(p1)under the null hypothesis. We may also test for no row effect and no column effect by forming the ratio of MSRowsorMSColumnstoMSE. However, because the rows and columns represent restrictions on randomization, these tests may not be appropriate.

The computational procedure for the ANOVA in terms of treatment, row, and column totals is shown in Table 4.10. From the computational formulas for the sums of squares, we see that the analysis is a simple extension of the RCBD, with the sum of squares resulting from rows obtained from the row totals.

F₀MSTreatments

MSE

p²1p1p1p1(p2)(p1) SSTSSRowsSSColumnsSSTreatmentsSSE

yijk ijkijk

ⁱ1, 2, . . . ,p j1, 2, . . . ,p k1, 2, . . . , p

■ T A B L E 4 . 1 0

Analysis of Variance for the Latin Square Design

Source of Sum of Degrees of Mean

Variation Squares Freedom Square F₀

Treatments p1

Rows p1

Columns p1

Error SS_E(by subtraction) (p2)(p1) Total SS_T

_k ^y²^ijk^yN²^... p²1

SS_E (p 2) (p1)

SS_Columns p1 SS_Columns1

p_k1

^p ^y²^..k^yN²^...

SS_Rows p1 SS_Rows1

p_i1

^p ^y²^i..^yN²^...

F₀MS_Treatments MS_E SS_Treatments

p1 SS_Treatments1

_j1^p ^y²^.j.^yN²^..

4.2 The Latin Square Design

161 E X A M P L E 4 . 3

Consider the rocket propellant problem previously described, where both batches of raw material and operators represent randomization restrictions. The design for this experiment, shown in Table 4.8, is a 55 Latin square. After coding by subtracting 25 from each observation, we have the data in Table 4.11. The sums of squares for the total, batches (rows), and operators (columns) are computed as follows:

The totals for the treatments (Latin letters) are Latin Letter Treatment Total

A B C D

E y_.5.5

y_.4.24 y_.3. 13 y_.2. 24 y_.1.18 (10)²

25 150.00 1

5 [(18)²18²(4)²5²9²] SS_Operators1

p_k1

^p ^y²^..k^yN²^...

(10)²

25 68.00 1

5 [(14)²9²5²3²7²] SS_Batches1

p_i1

^p ^y²^i..^yN²^...

680(10)²

25 676.00 SS_T

_k ^y²^ijk^yN²^...

The sum of squares resulting from the formulations is computed from these totals as

The error sum of squares is found by subtraction

The analysis of variance is summarized in Table 4.12. We conclude that there is a signiﬁcant difference in the mean burning rate generated by the different rocket propellant formulations. There is also an indication that differences between operators exist, so blocking on this factor was a good precaution. There is no strong evidence of a difference between batches of raw material, so it seems that in this particular experiment we were unnecessarily con- cerned about this source of variability. However, blocking on batches of raw material is usually a good idea.

676.00 68.00150.00330.00128.00 SS_ESS_TSS_BatchesSS_OperatorsSSFormulations

(10)²

25 330.00

18²(24)²(13)²24²5² 5

SSFormulations1

p_j1

^p ^y²^.j.^yN²^...

■ T A B L E 4 . 1 1

Coded Data for the Rocket Propellant Problem

Batches of Operators

Raw Material 1 2 3 4 5 y_i..

1 A 1 B 5 C 6 D 1 E 1 14

2 B 8 C 1 D5 E2 A11 9

3 C 7 D13 E1 A2 B 4 5

4 D1 E6 A1 B 2 C 3 3

5 E 3 A5 B 5 C4 D6 7

y_..k 18 18 4 5 9 10y_...

As in any design problem, the experimenter should investigate the adequacy of the model by inspecting and plotting the residuals. For a Latin square, the residuals are given by

The reader should ﬁnd the residuals for Example 4.3 and construct appropriate plots.

A Latin square in which the ﬁrst row and column consists of the letters written in alphabetical order is called a standard Latin square, which is the design shown in Example 4.4. A standard Latin square can always be obtained by writing the ﬁrst row in alphabetical order and then writing each successive row as the row of letters just above shifted one place to the left.

Table 4.13 summarizes several important facts about Latin squares and standard Latin squares.

As with any experimental design, the observations in the Latin square should be taken in random order. The proper randomization procedure is to select the particular square employed at random. As we see in Table 4.13, there are a large number of Latin squares of a particular size, so it is impossible to enumerate all the squares and select one randomly. The usual procedure is

y_ijky_i..y_.j.y_..k2y_...

e_ijky_ijkyˆ_ijk

■ T A B L E 4 . 1 2

Analysis of Variance for the Rocket Propellant Experiment

Source of Sum of Degrees of Mean

Variation Squares Freedom Square F₀ P-Value

Formulations 330.00 4 82.50 7.73 0.0025

Batches of raw material 68.00 4 17.00

Operators 150.00 4 37.50

Error 128.00 12 10.67

Total 676.00 24

■ T A B L E 4 . 1 3

Standard Latin Squares and Number of Latin Squares of Various Sizes^a

Size 33 44 55 66 77 pp

Examples of A B C A B C D A B C D E A B C D E F A B C D E F G ABC . . . P

standard squares B C A B C D A B A E C D B C F A D E B C D E F G A BCD . . . A C A B C D A B C D A E B C F B E A D C D E F G A B CDE . . . B

D A B C D E B A C D E A B F C D E F G A B C E C D B A E A D F C B E F G A B C D

F D E C B A F G A B C D E PAB. . . (P1) G A B C D E F

Number of 1 4 56 9408 16,942,080 —

standard squares

Total number of 12 576 161,280 818,851,200 61,479,419,904,000 p!(p1)!

Latin squares (number of

standard squares)

aSome of the information in this table is found in Fisher and Yates (1953). Little is known about the properties of Latin squares larger than 77.

to select an arbitrary Latin square from a table of such designs, as in Fisher and Yates (1953), or start with a standard square, and then arrange the order of the rows, columns, and letters at random. This is discussed more completely in Fisher and Yates (1953).

Occasionally, one observation in a Latin square is missing. For a ppLatin square, the missing value may be estimated by

(4.29) where the primes indicate totals for the row, column, and treatment with the missing value, and is the grand total with the missing value.

Latin squares can be useful in situations where the rows and columns represent factors the experimenter actually wishes to study and where there are no randomization restrictions.

Thus, three factors (rows, columns, and letters), each at plevels, can be investigated in only p²runs. This design assumes that there is no interaction between the factors. More will be said later on the subject of interaction.

Replication of Latin Squares. A disadvantage of small Latin squares is that they provide a relatively small number of error degrees of freedom. For example, a 33 Latin square has only two error degrees of freedom, a 44 Latin square has only six error degrees of freedom, and so forth. When small Latin squares are used, it is frequently desirable to replicate them to increase the error degrees of freedom.

A Latin square may be replicated in several ways. To illustrate, suppose that the 55 Latin square used in Example 4.4 is replicated ntimes. This could have been done as follows:

1. Use the same batches and operators in each replicate.

2. Use the same batches but different operators in each replicate (or, equivalently, use the same operators but different batches).

3. Use different batches and different operators.

The analysis of variance depends on the method of replication.

Consider case 1, where the same levels of the row and column blocking factors are used in each replicate. Let yijklbe the observation in row i, treatment j, column k, and replicate l.

There are Nnp²total observations. The ANOVA is summarized in Table 4.14.

y...

yijkp(yi..y.j.y...k)2y...

(p2)(p1)

4.2 The Latin Square Design

163

■ T A B L E 4 . 1 4

Analysis of Variance for a Replicated Latin Square, Case 1

Source of Sum of Degrees of Mean

Variation Squares Freedom Square F₀

Treatments p1

Rows p1

Columns p1

Replicates n1

Error Subtraction (p1)[n(p1)3]

Total

^y²^ijkl^yN²^.... np²1

SS_E

(p1)[n(p1)3]

SS_Replicates n1 1

p²_l1

ⁿ ^y²^...l^yN²^....

SS_Columns p1 1

np_k1

^p ^y²^..k.^yN²^....

SS_Rows p1 1

np_i1

^p ^y²^i...^yN²^....

MS_Treatments MS_E SS_Treatments

p1 1

np_j1

^p ^y²^.j..^yN²^....

Now consider case 2 and assume that new batches of raw material but the same operators are used in each replicate. Thus, there are now ﬁve new rows (in general,pnew rows) within each replicate. The ANOVA is summarized in Table 4.15. Note that the source of variation for the rows really measures the variation between rows within the nreplicates.

Finally, consider case 3, where new batches of raw material and new operators are used in each replicate. Now the variation that results from both the rows and columns measures the variation resulting from these factors within the replicates. The ANOVA is summarized in Table 4.16.

There are other approaches to analyzing replicated Latin squares that allow some inter- actions between treatments and squares (refer to Problem 4.30).

Crossover Designs and Designs Balanced for Residual Effects. Occasionally, one encounters a problem in which time periods are a factor in the experiment. In general, there are ptreatments to be tested in ptime periods using np experimental units. For example, a human performance analyst is studying the effect of two replacement ﬂuids on dehydration

■ T A B L E 4 . 1 5

Analysis of Variance for a Replicated Latin Square, Case 2

Source of Degrees of Mean

Variation Sum of Squares Freedom Square F₀

Treatments p1

Rows n(p1)

Columns p1

Replicates n1

Error Subtraction (p1)(np1)

Total

_l ^y²^ijkl^yN²^.... np²1

SS_E (p1)(np1)

SS_Replicates n1 1

p²

_l1ⁿ ^y²^...l^yN²^....

SS_Columns p1 1

np_k1

^p ^y²^..k.^yN²^....

SS_Rows n(p1) 1

p_l1

ⁿ

_i1^p ^y²^i..l_l1

ⁿ ^y_p²^...l²

MS_Treatments MS_E SS_Treatments

p1 1

np_j1

^p ^y²^.j..^yN²^....

■ T A B L E 4 . 1 6

Analysis of Variance for a Replicated Latin Square, Case 3

Source of Degrees of Mean

Variation Sum of Squares Freedom Square F₀

Treatments p1

Rows n(p1)

Columns n(p1)

Replicates n1

Error Subtraction (p1)[n(p1)1]

Total

_l ^y²^ijkl^yN²^.... np²1

SS_E

(p1)[n(p1)1]

SS_Replicates n1 1

p²

_l1ⁿ ^y²^...l^yN²^....

SS_Columns n(p1) 1

p_l1

ⁿ _k1

^p ^y²^..kl_l1

ⁿ ^y_p²^...l²

SS_Rows n(p1) 1

p_l1

ⁿ _i1

^p ^y²^i..l_l1

ⁿ ^y_p²^...l²

MS_Treatments MS_E SS_Treatments

p1 1

np_j1

^p ^y²^.j..^yN²^....

in 20 subjects. In the first period, half of the subjects (chosen at random) are given fluid Aand the other half fluid B. At the end of the period, the response is measured and a period of time is allowed to pass in which any physiological effect of the fluids is eliminated. Then the experimenter has the subjects who took fluid Atake fluid Band those who took fluid Btake fluidA. This design is called a crossover design. It is analyzed as a set of 10 Latin squares with two rows (time periods) and two treatments (fluid types). The two columns in each of the 10 squares correspond to subjects.

The layout of this design is shown in Figure 4.7. Notice that the rows in the Latin square represent the time periods and the columns represent the subjects. The 10 subjects who received ﬂuid Aﬁrst (1, 4, 6, 7, 9, 12, 13, 15, 17, and 19) are randomly determined.

An abbreviated analysis of variance is summarized in Table 4.17. The subject sum of squares is computed as the corrected sum of squares among the 20 subject totals, the period sum of squares is the corrected sum of squares among the rows, and the ﬂuid sum of squares is computed as the corrected sum of squares among the letter totals. For further details of the statistical analysis of these designs see Cochran and Cox (1957), John (1971), and Anderson and McLean (1974).

It is also possible to employ Latin square type designs for experiments in which the treatments have a residual effect—that is, for example, if the data for fluid Bin period 2 still reflected some effect of fluid Ataken in period 1. Designs balanced for residual effects are discussed in detail by Cochran and Cox (1957) and John (1971).

Dalam dokumen Design and Analysis - of Experiments - Spada UNS (Halaman 174-181)