• Tidak ada hasil yang ditemukan

The Latin Square Design

Cochran’s Theorem

4.2 The Latin Square Design

which is also a single-factor analysis. Again, by analogy with Equation 3.5, the reduction in the sum of squares for fitting this model is

withadegrees of freedom. The sum of squares for blocks {j} after fitting and {i} is

withb1 degrees of freedom, which we have given previously as Equation 4.11.

We have developed the sums of squares for treatments, blocks, and error in the random- ized complete block design using the general regression significance test. Although we would not ordinarily use the general regression significance test to actually analyze data in a ran- domized complete block, the procedure occasionally proves useful in more general random- ized block designs, such as those discussed in Section 4.4.

Exact Analysis of the Missing Value Problem. In Section 4.1.3 an approximate procedure for dealing with missing observations in the RCBD was presented. This approx- imate analysis consists of estimating the missing value so that the error mean square is minimized. It can be shown that the approximate analysis produces a biased mean square for treatments in the sense that E(MSTreatments) is larger than E(MSE) if the null hypothesis is true. Consequently, too many significant results are reported.

The missing value problem may be analyzed exactly by using the general regression significance test. The missing value causes the design to be unbalanced, and because all the treatments do not occur in all blocks, we say that the treatments and blocks are not orthog- onal. This method of analysis is also used in more general types of randomized block designs; it is discussed further in Section 4.4. Many computer packages will perform this analysis.

We see that both batches of raw material (rows) and operators (columns) are orthogonal to treatments.

The Latin square design is used to eliminate two nuisance sources of variability; that is, it systematically allows blocking in two directions. Thus, the rows and columns actually representtwo restrictions on randomization. In general, a Latin square for pfactors, or a ppLatin square, is a square containing prows and pcolumns. Each of the resulting p2cells contains one of the pletters that corresponds to the treatments, and each letter occurs once and only once in each row and column. Some examples of Latin squares are

44 55 66

A B D C A D B E C A D C E B F B C A D D A C B E B A E C F D C D B A C B E D A C E D F A B D A C B B E A C D D C F B E A E C D A B F B A D C E E F B A D C

Latin squares are closely related to a popular puzzle called a sudoku puzzle that origi- nated in Japan (sudoku means “single number” in Japanese). The puzzle typically consists of a 99 grid, with nine additional 33 blocks contained within. A few of the spaces contain numbers and the others are blank. The goal is to fill the blanks with the integers from 1 to 9 so that each row, each column, and each of the nine 33 blocks making up the grid contains just one of each of the nine integers. The additional constraint that a standard 99 sudoku puzzle have 33 blocks that also contain each of the nine integers reduces the large number of pos- sible 99 Latin squares to a smaller but still quite large number, approximately 61021.

Depending on the number of clues and the size of the grid, sudoku puzzles can be extremely difficult to solve. Solving an nnsudoku puzzle belongs to a class of computa- tional problems called NP-complete (the NPrefers to non-polynomial computing time). An NP-complete problem is one for which it’s relatively easy to check whether a particular answer is correct but may require an impossibly long time to solve by any simple algorithm as ngets larger.

Solving a sudoku puzzle is also equivalent to “coloring” a graph—an array of points (vertices) and lines (edges) in a particular way. In this case, the graph has 81 vertices, one for each cell of the grid. Depending on the puzzle, only certain pairs of vertices are joined by an edge. Given that some vertices have already been assigned a “color” (chosen from the nine number possibilities), the problem is to “color” the remaining vertices so that any two ver- tices joined by an edge don’t have the same “color.”

4.2 The Latin Square Design

159

T A B L E 4 . 9

Latin Square Design for the Rocket Propellant Problem Operators Batches of

Raw Material 1 2 3 4 5

1 A24 B20 C19 D24 E24

2 B17 C24 D30 E27 A36

3 C18 D38 E26 A27 B21

4 D26 E31 A26 B23 C22

5 E22 A30 B20 C29 D31

Thestatistical modelfor a Latin square is

(4.27)

whereyijkis the observation in the ith row and kth column for the jth treatment,is the over- all mean, iis the ith row effect,jis the jth treatment effect,kis the kth column effect, and ijkis the random error. Note that this is an effects model. The model is completely additive;

that is, there is no interaction between rows, columns, and treatments. Because there is only one observation in each cell, only two of the three subscripts i,j, and kare needed to denote a particular observation. For example, referring to the rocket propellant problem in Table 4.8, ifi2 and k3, we automatically find j4 (formulation D), and if i1 and j3 (for- mulationC), we find k3. This is a consequence of each treatment appearing exactly once in each row and column.

The analysis of variance consists of partitioning the total sum of squares of the Np2 observations into components for rows, columns, treatments, and error, for example,

(4.28) with respective degrees of freedom

Under the usual assumption that ijkis NID (0,2), each sum of squares on the right-hand side of Equation 4.28 is, upon division by 2, an independently distributed chi-square random vari- able. The appropriate statistic for testing for no differences in treatment means is

which is distributed as Fp1,(p2)(p1)under the null hypothesis. We may also test for no row effect and no column effect by forming the ratio of MSRowsorMSColumnstoMSE. However, because the rows and columns represent restrictions on randomization, these tests may not be appropriate.

The computational procedure for the ANOVA in terms of treatment, row, and column totals is shown in Table 4.10. From the computational formulas for the sums of squares, we see that the analysis is a simple extension of the RCBD, with the sum of squares resulting from rows obtained from the row totals.

F0MSTreatments

MSE

p21p1p1p1(p2)(p1) SSTSSRowsSSColumnsSSTreatmentsSSE

yijk ijkijk

i1, 2, . . . ,p j1, 2, . . . ,p k1, 2, . . . , p

T A B L E 4 . 1 0

Analysis of Variance for the Latin Square Design

Source of Sum of Degrees of Mean

Variation Squares Freedom Square F0

Treatments p1

Rows p1

Columns p1

Error SSE(by subtraction) (p2)(p1) Total SST

i

j

k y2ijkyN2... p21

SSE (p 2) (p1)

SSColumns p1 SSColumns1

pk1

p y2..kyN2...

SSRows p1 SSRows1

pi1

p y2i..yN2...

F0MSTreatments MSE SSTreatments

p1 SSTreatments1

p

j1p y2.j.yN2..

4.2 The Latin Square Design

161

E X A M P L E 4 . 3

Consider the rocket propellant problem previously described, where both batches of raw material and opera- tors represent randomization restrictions. The design for this experiment, shown in Table 4.8, is a 55 Latin square. After coding by subtracting 25 from each observa- tion, we have the data in Table 4.11. The sums of squares for the total, batches (rows), and operators (columns) are computed as follows:

The totals for the treatments (Latin letters) are Latin Letter Treatment Total

A B C D

E y.5. 5

y.4. 24 y.3. 13 y.2. 24 y.1. 18 (10)2

25 150.00 1

5 [(18)2182(4)25292] SSOperators1

pk1

p y2..kyN2...

(10)2

25 68.00 1

5 [(14)292523272] SSBatches1

pi1

p y2i..yN2...

680(10)2

25 676.00 SST

i

j

k y2ijkyN2...

The sum of squares resulting from the formulations is com- puted from these totals as

The error sum of squares is found by subtraction

The analysis of variance is summarized in Table 4.12. We conclude that there is a significant difference in the mean burning rate generated by the different rocket propellant formulations. There is also an indication that differences between operators exist, so blocking on this factor was a good precaution. There is no strong evidence of a differ- ence between batches of raw material, so it seems that in this particular experiment we were unnecessarily con- cerned about this source of variability. However, blocking on batches of raw material is usually a good idea.

676.00 68.00150.00330.00128.00 SSESSTSSBatchesSSOperatorsSSFormulations

(10)2

25 330.00

182(24)2(13)224252 5

SSFormulations1

pj1

p y2.j.yN2...

T A B L E 4 . 1 1

Coded Data for the Rocket Propellant Problem

Batches of Operators

Raw Material 1 2 3 4 5 yi..

1 A 1 B 5 C 6 D 1 E 1 14

2 B 8 C 1 D5 E2 A11 9

3 C 7 D13 E1 A2 B 4 5

4 D1 E6 A1 B 2 C 3 3

5 E 3 A5 B 5 C4 D6 7

y..k 18 18 4 5 9 10y...

As in any design problem, the experimenter should investigate the adequacy of the model by inspecting and plotting the residuals. For a Latin square, the residuals are given by

The reader should find the residuals for Example 4.3 and construct appropriate plots.

A Latin square in which the first row and column consists of the letters written in alpha- betical order is called a standard Latin square, which is the design shown in Example 4.4. A standard Latin square can always be obtained by writing the first row in alphabetical order and then writing each successive row as the row of letters just above shifted one place to the left.

Table 4.13 summarizes several important facts about Latin squares and standard Latin squares.

As with any experimental design, the observations in the Latin square should be taken in random order. The proper randomization procedure is to select the particular square employed at random. As we see in Table 4.13, there are a large number of Latin squares of a particular size, so it is impossible to enumerate all the squares and select one randomly. The usual procedure is

yijkyi..y.j.y..k2y...

eijkyijkyˆijk

T A B L E 4 . 1 2

Analysis of Variance for the Rocket Propellant Experiment

Source of Sum of Degrees of Mean

Variation Squares Freedom Square F0 P-Value

Formulations 330.00 4 82.50 7.73 0.0025

Batches of raw material 68.00 4 17.00

Operators 150.00 4 37.50

Error 128.00 12 10.67

Total 676.00 24

T A B L E 4 . 1 3

Standard Latin Squares and Number of Latin Squares of Various Sizesa

Size 33 44 55 66 77 pp

Examples of A B C A B C D A B C D E A B C D E F A B C D E F G ABC . . . P

standard squares B C A B C D A B A E C D B C F A D E B C D E F G A BCD . . . A C A B C D A B C D A E B C F B E A D C D E F G A B CDE . . . B

D A B C D E B A C D E A B F C D E F G A B C E C D B A E A D F C B E F G A B C D

F D E C B A F G A B C D E PAB. . . (P1) G A B C D E F

Number of 1 4 56 9408 16,942,080

standard squares

Total number of 12 576 161,280 818,851,200 61,479,419,904,000 p!(p1)!

Latin squares (number of

standard squares)

aSome of the information in this table is found in Fisher and Yates (1953). Little is known about the properties of Latin squares larger than 77.

o

to select an arbitrary Latin square from a table of such designs, as in Fisher and Yates (1953), or start with a standard square, and then arrange the order of the rows, columns, and letters at random. This is discussed more completely in Fisher and Yates (1953).

Occasionally, one observation in a Latin square is missing. For a ppLatin square, the missing value may be estimated by

(4.29) where the primes indicate totals for the row, column, and treatment with the missing value, and is the grand total with the missing value.

Latin squares can be useful in situations where the rows and columns represent factors the experimenter actually wishes to study and where there are no randomization restrictions.

Thus, three factors (rows, columns, and letters), each at plevels, can be investigated in only p2runs. This design assumes that there is no interaction between the factors. More will be said later on the subject of interaction.

Replication of Latin Squares. A disadvantage of small Latin squares is that they provide a relatively small number of error degrees of freedom. For example, a 33 Latin square has only two error degrees of freedom, a 44 Latin square has only six error degrees of freedom, and so forth. When small Latin squares are used, it is frequently desirable to repli- cate them to increase the error degrees of freedom.

A Latin square may be replicated in several ways. To illustrate, suppose that the 55 Latin square used in Example 4.4 is replicated ntimes. This could have been done as follows:

1. Use the same batches and operators in each replicate.

2. Use the same batches but different operators in each replicate (or, equivalently, use the same operators but different batches).

3. Use different batches and different operators.

The analysis of variance depends on the method of replication.

Consider case 1, where the same levels of the row and column blocking factors are used in each replicate. Let yijklbe the observation in row i, treatment j, column k, and replicate l.

There are Nnp2total observations. The ANOVA is summarized in Table 4.14.

y...

yijkp(yi..y.j.y...k)2y...

(p2)(p1)

4.2 The Latin Square Design

163

T A B L E 4 . 1 4

Analysis of Variance for a Replicated Latin Square, Case 1

Source of Sum of Degrees of Mean

Variation Squares Freedom Square F0

Treatments p1

Rows p1

Columns p1

Replicates n1

Error Subtraction (p1)[n(p1)3]

Total

y2ijklyN2.... np21

SSE

(p1)[n(p1)3]

SSReplicates n1 1

p2l1

n y2...lyN2....

SSColumns p1 1

npk1

p y2..k.yN2....

SSRows p1 1

npi1

p y2i...yN2....

MSTreatments MSE SSTreatments

p1 1

npj1

p y2.j..yN2....

Now consider case 2 and assume that new batches of raw material but the same opera- tors are used in each replicate. Thus, there are now five new rows (in general,pnew rows) within each replicate. The ANOVA is summarized in Table 4.15. Note that the source of vari- ation for the rows really measures the variation between rows within the nreplicates.

Finally, consider case 3, where new batches of raw material and new operators are used in each replicate. Now the variation that results from both the rows and columns measures the vari- ation resulting from these factors within the replicates. The ANOVA is summarized in Table 4.16.

There are other approaches to analyzing replicated Latin squares that allow some inter- actions between treatments and squares (refer to Problem 4.30).

Crossover Designs and Designs Balanced for Residual Effects. Occasionally, one encounters a problem in which time periods are a factor in the experiment. In general, there are ptreatments to be tested in ptime periods using np experimental units. For example, a human performance analyst is studying the effect of two replacement fluids on dehydration

T A B L E 4 . 1 5

Analysis of Variance for a Replicated Latin Square, Case 2

Source of Degrees of Mean

Variation Sum of Squares Freedom Square F0

Treatments p1

Rows n(p1)

Columns p1

Replicates n1

Error Subtraction (p1)(np1)

Total

i

j

k

l y2ijklyN2.... np21

SSE (p1)(np1)

SSReplicates n1 1

p2

l1n y2...lyN2....

SSColumns p1 1

npk1

p y2..k.yN2....

SSRows n(p1) 1

pl1

n

i1p y2i..ll1

n yp2...l2

MSTreatments MSE SSTreatments

p1 1

npj1

p y2.j..yN2....

T A B L E 4 . 1 6

Analysis of Variance for a Replicated Latin Square, Case 3

Source of Degrees of Mean

Variation Sum of Squares Freedom Square F0

Treatments p1

Rows n(p1)

Columns n(p1)

Replicates n1

Error Subtraction (p1)[n(p1)1]

Total

i

j

k

l y2ijklyN2.... np21

SSE

(p1)[n(p1)1]

SSReplicates n1 1

p2

l1n y2...lyN2....

SSColumns n(p1) 1

pl1

n k1

p y2..kll1

n yp2...l2

SSRows n(p1) 1

pl1

n i1

p y2i..ll1

n yp2...l2

MSTreatments MSE SSTreatments

p1 1

npj1

p y2.j..yN2....

in 20 subjects. In the first period, half of the subjects (chosen at random) are given fluid Aand the other half fluid B. At the end of the period, the response is measured and a period of time is allowed to pass in which any physiological effect of the fluids is eliminated. Then the experimenter has the subjects who took fluid Atake fluid Band those who took fluid Btake fluidA. This design is called a crossover design. It is analyzed as a set of 10 Latin squares with two rows (time periods) and two treatments (fluid types). The two columns in each of the 10 squares correspond to subjects.

The layout of this design is shown in Figure 4.7. Notice that the rows in the Latin square represent the time periods and the columns represent the subjects. The 10 subjects who received fluid Afirst (1, 4, 6, 7, 9, 12, 13, 15, 17, and 19) are randomly determined.

An abbreviated analysis of variance is summarized in Table 4.17. The subject sum of squares is computed as the corrected sum of squares among the 20 subject totals, the period sum of squares is the corrected sum of squares among the rows, and the fluid sum of squares is computed as the corrected sum of squares among the letter totals. For further details of the statistical analysis of these designs see Cochran and Cox (1957), John (1971), and Anderson and McLean (1974).

It is also possible to employ Latin square type designs for experiments in which the treatments have a residual effect—that is, for example, if the data for fluid Bin period 2 still reflected some effect of fluid Ataken in period 1. Designs balanced for residual effects are discussed in detail by Cochran and Cox (1957) and John (1971).