The Central Limit Theorem
2.7 Problems
The 100(1 ) confidence interval for the ratio of the population variances is (2.50) To illustrate the use of Equation 2.50, the 95 percent confidence interval for the ratio of vari- ances in Example 2.2 is, using F0.025,9,113.59 and F0.975,9,111/F0.025,11,91/3.92 0.255,
2.7 Problems
0.34 21
22
4.82 14.5
10.8 (0.255) 21
22
14.5 10.8 (3.59) 21/22
S21
S22F1 /2,n21,n11 21
22
S21
S22F /2,n21,n11
21/22
2.1. Computer output for a random sample of data is shown below. Some of the quantities are missing. Compute the values of the missing quantities.
Variable N Mean SE Mean Std. Dev. Variance Minimum Maximum
Y 9 19.96 ? 3.12 ? 15.94 27.16
2.2. Computer output for a random sample of data is shown below. Some of the quantities are missing. Compute the values of the missing quantities.
Variable N Mean SE Mean Std. Dev. Sum
Y 16 ? 0.159 ? 399.851
2.3. Suppose that we are testing H0: 0 versus H1: 0. Calculate the P-value for the following observed values of the test statistic:
(a) Z02.25 (b)Z01.55 (c)Z02.10 (d) Z01.95 (e)Z0 0.10
2.4. Suppose that we are testing H0: 0 versus H1: 0. Calculate the P-value for the following observed values of the test statistic:
(a) Z02.45 (b)Z0 1.53 (c)Z02.15 (d) Z01.95 (e)Z0 0.25
2.5. Consider the computer output shown below.
One-Sample Z
Test of mu30 vs not30
The assumed standard deviation1.2
N Mean SE Mean 95% CI Z P
16 31.2000 0.3000 (30.6120, 31.7880) ? ?
(a) Fill in the missing values in the output. What conclu- sion would you draw?
(b) Is this a one-sided or two-sided test?
(c) Use the output and the normal table to find a 99 percent CI on the mean.
(d) What is the P-value if the alternative hypothesis is H1:30?
2.6. Suppose that we are testing H0:1 2versus H0: 1 2where the two sample sizes are n1n212. Both sample variances are unknown but assumed equal. Find bounds on the P-value for the following observed values of the test statistic.
(a)t02.30 (b)t03.41 (c)t01.95 (d)t0 2.45 2.7. Suppose that we are testing H0:1 2versus H0: 1 2where the two sample sizes are n1n210. Both sample variances are unknown but assumed equal. Find bounds on the P-value for the following observed values of the test statistic.
(a)t02.31 (b)t03.60 (c)t01.95 (d)t02.19 2.8. Consider the following sample data: 9.37, 13.04, 11.69, 8.21, 11.18, 10.41, 13.15, 11.51, 13.21, and 7.75. Is it reasonable to assume that this data is a sample from a normal distribution? Is there evidence to support a claim that the mean of the population is 10?
2.9. A computer program has produced the following out- put for a hypothesis-testing problem:
Difference in sample means: 2.35 Degrees of freedom: 18
Standard error of the difference in sample means: ? Test statistic: t0 = 2.01
P-value: 0.0298
(a) What is the missing value for the standard error?
(b) Is this a two-sided or a one-sided test?
(c) If 0.05, what are your conclusions?
(d) Find a 90% two-sided CI on the difference in means.
2.10. A computer program has produced the following out- put for a hypothesis-testing problem:
Difference in sample means: 11.5 Degrees of freedom: 24
Standard error of the difference in sample means: ? Test statistic: t0 = -1.88
P-value: 0.0723
(a) What is the missing value for the standard error?
(b) Is this a two-sided or a one-sided test?
(c) If 0.05, what are your conclusions?
(d) Find a 95% two-sided CI on the difference in means.
2.11. Suppose that we are testing H0:0 versus H1:0with a sample size of n15. Calculate bounds on the P-value for the following observed values of the test statistic:
(a)t02.35 (b)t03.55 (c) t02.00 (d)t01.55 2.12. Suppose that we are testing H0:0 versus H1: 0with a sample size of n10. Calculate bounds on the P-value for the following observed values of the test statistic:
(a) t02.48 (b)t0 3.95 (c)t02.69 (d) t01.88 (e)t0 1.25
2.13. Consider the computer output shown below.
One-Sample T: Y
Test of mu91 vs. not91
Variable N Mean Std. Dev. SE Mean 95% CI T P Y 25 92.5805 ? 0.4673 (91.6160, ?) 3.38 0.002
(a) Fill in the missing values in the output. Can the null hypothesis be rejected at the 0.05 level? Why?
(b) Is this a one-sided or a two-sided test?
(c) If the hypotheses had been H0: 90 versus H1: 90 would you reject the null hypothesis at the 0.05 level?
(d) Use the output and the ttable to find a 99 percent two- sided CI on the mean.
(e) What is the P-value if the alternative hypothesis is H1: 91?
2.14. Consider the computer output shown below.
One-Sample T: Y
Test of mu25 vs 25
95% Lower Variable N Mean Std. Dev. SE Mean Bound T P
Y 12 25.6818 ? 0.3360 ? ? 0.034
(a) How many degrees of freedom are there on the t-test statistic?
(b) Fill in the missing information.
2.15. Consider the computer output shown below.
Two-Sample T-Test and Cl: Y1, Y2
Two-sample T for Y1 vs Y2
N Mean Std. Dev. SE Mean
Y1 20 50.19 1.71 0.38
Y2 20 52.52 2.48 0.55
Differencemu (X1)mu (X2) Estimate for difference: 2.33341
95% CI for difference: (3.69547, 0.97135) T-Test of difference0 (vs not ) : T-Value 3.47 P-Value0.001 DF38
Both use Pooled Std. Dev.2.1277
(a) Can the null hypothesis be rejected at the 0.05 level?
Why?
(b) Is this a one-sided or a two-sided test?
(c) If the hypotheses had been H0:122 versus H1:12 2 would you reject the null hypothesis at the 0.05 level?
(d) If the hypotheses had been H0:122 versus H1:122 would you reject the null hypothesis at the 0.05 level? Can you answer this question with- out doing any additional calculations? Why?
(e) Use the output and the ttable to find a 95 percent upper confidence bound on the difference in means.
(f) What is the P-value if the hypotheses are H0:1 22 versus H1:12 2?
2.16. The breaking strength of a fiber is required to be at least 150 psi. Past experience has indicated that the standard deviation of breaking strength is 3 psi. A random sample of four specimens is tested, and the results are y1145,y2 153,y3150, and y4147.
(a) State the hypotheses that you think should be tested in this experiment.
(b) Test these hypotheses using 0.05. What are your conclusions?
(c) Find the P-value for the test in part (b).
(d) Construct a 95 percent confidence interval on the mean breaking strength.
2.17. The viscosity of a liquid detergent is supposed to average 800 centistokes at 25°C. A random sample of 16 batches of detergent is collected, and the average viscosity is 812. Suppose we know that the standard deviation of viscosity is25 centistokes.
(a) State the hypotheses that should be tested.
(b) Test these hypotheses using 0.05. What are your conclusions?
(c) What is the P-value for the test?
(d) Find a 95 percent confidence interval on the mean.
2.18. The diameters of steel shafts produced by a certain manufacturing process should have a mean diameter of 0.255 inches. The diameter is known to have a standard deviation of = 0.0001 inch. A random sample of 10 shafts has an aver- age diameter of 0.2545 inch.
2.7 Problems
61
(a) Set up appropriate hypotheses on the mean . (b) Test these hypotheses using 0.05. What are your
conclusions?
(c) Find the P-value for this test.
(d) Construct a 95 percent confidence interval on the mean shaft diameter.
2.19. A normally distributed random variable has an unknown mean and a known variance 29. Find the sam- ple size required to construct a 95 percent confidence interval on the mean that has total length of 1.0.
2.20. The shelf life of a carbonated beverage is of interest.
Ten bottles are randomly selected and tested, and the follow- ing results are obtained:
Days
108 138
124 163
124 159
106 134
115 139
(a) We would like to demonstrate that the mean shelf life exceeds 120 days. Set up appropriate hypotheses for investigating this claim.
(b) Test these hypotheses using 0.01. What are your conclusions?
(c) Find the P-value for the test in part (b).
(d) Construct a 99 percent confidence interval on the mean shelf life.
2.21. Consider the shelf life data in Problem 2.20. Can shelf life be described or modeled adequately by a normal distribu- tion? What effect would the violation of this assumption have on the test procedure you used in solving Problem 2.15?
2.22. The time to repair an electronic instrument is a normal- ly distributed random variable measured in hours. The repair times for 16 such instruments chosen at random are as follows:
Hours
159 280 101 212
224 379 179 264
222 362 168 250
149 260 485 170
(a) You wish to know if the mean repair time exceeds 225 hours. Set up appropriate hypotheses for investigating this issue.
(b) Test the hypotheses you formulated in part (a). What are your conclusions? Use 0.05.
(c) Find the P-value for the test.
(d) Construct a 95 percent confidence interval on mean repair time.
2.23. Reconsider the repair time data in Problem 2.22. Can repair time, in your opinion, be adequately modeled by a nor- mal distribution?
2.24. Two machines are used for filling plastic bottles with a net volume of 16.0 ounces. The filling processes can be assumed to be normal, with standard deviations of 10.015 and20.018. The quality engineering department suspects that both machines fill to the same net volume, whether or not this volume is 16.0 ounces. An experiment is performed by taking a random sample from the output of each machine.
Machine 1 Machine 2
16.03 16.01 16.02 16.03
16.04 15.96 15.97 16.04
16.05 15.98 15.96 16.02
16.05 16.02 16.01 16.01
16.02 15.99 15.99 16.00
(a) State the hypotheses that should be tested in this experiment.
(b) Test these hypotheses using 0.05. What are your conclusions?
(c) Find the P-value for this test.
(d) Find a 95 percent confidence interval on the difference in mean fill volume for the two machines.
2.25. Two types of plastic are suitable for use by an elec- tronic calculator manufacturer. The breaking strength of this plastic is important. It is known that 121.0 psi. From random samples of n110 and n212 we obtain 162.5 and 155.0. The company will not adopt plastic 1 unless its breaking strength exceeds that of plastic 2 by at least 10 psi. Based on the sample information, should they use plastic 1? In answering this question, set up and test appropriate hypotheses using 0.01. Construct a 99 percent confidence interval on the true mean difference in breaking strength.
2.26. The following are the burning times (in minutes) of chemical flares of two different formulations. The design engineers are interested in both the mean and variance of the burning times.
Type 1 Type 2
65 82 64 56
81 67 71 69
57 59 83 74
66 75 59 82
82 70 65 79
(a) Test the hypothesis that the two variances are equal.
Use 0.05.
(b) Using the results of (a), test the hypothesis that the mean burning times are equal. Use 0.05. What is theP-value for this test?
y2
y1
(c) Discuss the role of the normality assumption in this problem. Check the assumption of normality for both types of flares.
2.27. An article in Solid State Technology, “Orthogonal Design for Process Optimization and Its Application to Plasma Etching” by G. Z. Yin and D. W. Jillie (May 1987) describes an experiment to determine the effect of the C2F6 flow rate on the uniformity of the etch on a silicon wafer used in integrated circuit manufacturing. All of the runs were made in random order. Data for two flow rates are as follows:
C2F6Flow Uniformity Observation
(SCCM) 1 2 3 4 5 6
125 2.7 4.6 2.6 3.0 3.2 3.8
200 4.6 3.4 2.9 3.5 4.1 5.1
(a) Does the C2F6flow rate affect average etch uniformi- ty? Use 0.05.
(b) What is the P-value for the test in part (a)?
(c) Does the C2F6flow rate affect the wafer-to-wafer vari- ability in etch uniformity? Use 0.05.
(d) Draw box plots to assist in the interpretation of the data from this experiment.
2.28. A new filtering device is installed in a chemical unit.
Before its installation, a random sample yielded the follow- ing information about the percentage of impurity: 12.5, 101.17, and n18. After installation, a random sample yielded 10.2, 94.73,n29.
(a) Can you conclude that the two variances are equal?
Use 0.05.
(b) Has the filtering device reduced the percentage of impurity significantly? Use 0.05.
2.29. Photoresist is a light-sensitive material applied to semiconductor wafers so that the circuit pattern can be imaged on to the wafer. After application, the coated wafers are baked to remove the solvent in the photoresist mixture and to harden the resist. Here are measurements of photore- sist thickness (in kA) for eight wafers baked at two differ- ent temperatures. Assume that all of the runs were made in random order.
95C 100 C
11.176 5.263
7.089 6.748
8.097 7.461
11.739 7.015
11.291 8.133
10.759 7.418
6.467 3.772
8.315 8.963
S22 y2
S21
y1
(a) Is there evidence to support the claim that the high- er baking temperature results in wafers with a lower mean photoresist thickness? Use 0.05.
(b) What is the P-value for the test conducted in part (a)?
(c) Find a 95 percent confidence interval on the difference in means. Provide a practical interpretation of this interval.
(d) Draw dot diagrams to assist in interpreting the results from this experiment.
(e) Check the assumption of normality of the photoresist thickness.
(f) Find the power of this test for detecting an actual dif- ference in means of 2.5 kA.
(g) What sample size would be necessary to detect an actual difference in means of 1.5 kA with a power of at least 0.9?
2.30. Front housings for cell phones are manufactured in an injection molding process. The time the part is allowed to cool in the mold before removal is thought to influence the occurrence of a particularly troublesome cosmetic defect, flow lines, in the finished housing. After manufac- turing, the housings are inspected visually and assigned a score between 1 and 10 based on their appearance, with 10 corresponding to a perfect part and 1 corresponding to a completely defective part. An experiment was conducted using two cool-down times, 10 and 20 seconds, and 20 housings were evaluated at each level of cool-down time.
All 40 observations in this experiment were run in random order. The data are as follows.
10 seconds 20 seconds
1 3 7 6
2 6 8 9
1 5 5 5
3 3 9 7
5 2 5 4
1 1 8 6
5 6 6 8
2 8 4 5
3 2 6 8
5 3 7 7
(a) Is there evidence to support the claim that the longer cool-down time results in fewer appearance defects?
Use 0.05.
(b) What is the P-value for the test conducted in part (a)?
(c) Find a 95 percent confidence interval on the difference in means. Provide a practical interpretation of this interval.
(d) Draw dot diagrams to assist in interpreting the results from this experiment.
(e) Check the assumption of normality for the data from this experiment.
2.31. Twenty observations on etch uniformity on silicon wafers are taken during a qualification experiment for a plas- ma etcher. The data are as follows:
5.34 6.65 4.76 5.98 7.25
6.00 7.55 5.54 5.62 6.21
5.97 7.35 5.44 4.39 4.98
5.25 6.35 4.61 6.00 5.32
(a) Construct a 95 percent confidence interval estimate of2.
(b) Test the hypothesis that 21.0. Use 0.05. What are your conclusions?
(c) Discuss the normality assumption and its role in this problem.
(d) Check normality by constructing a normal probability plot. What are your conclusions?
2.32. The diameter of a ball bearing was measured by 12 inspectors, each using two different kinds of calipers. The results were
Inspector Caliper 1 Caliper 2
1 0.265 0.264
2 0.265 0.265
3 0.266 0.264
4 0.267 0.266
5 0.267 0.267
6 0.265 0.268
7 0.267 0.264
8 0.267 0.265
9 0.265 0.265
10 0.268 0.267
11 0.268 0.268
12 0.265 0.269
(a) Is there a significant difference between the means of the population of measurements from which the two samples were selected? Use 0.05.
(b) Find the P-value for the test in part (a).
(c) Construct a 95 percent confidence interval on the dif- ference in mean diameter measurements for the two types of calipers.
2.33. An article in the journal Neurology(1998, Vol. 50, pp.
1246–1252) observed that monozygotic twins share numerous physical, psychological, and pathological traits. The investi- gators measured an intelligence score of 10 pairs of twins.
The data obtained are as follows:
Pair Birth Order: 1 Birth Order: 2
1 6.08 5.73
2 6.22 5.80
2.7 Problems
63
3 7.99 8.42
4 7.44 6.84
5 6.48 6.43
6 7.99 8.76
7 6.32 6.32
8 7.60 7.62
9 6.03 6.59
10 7.52 7.67
(a) Is the assumption that the difference in score is nor- mally distributed reasonable?
(b) Find a 95% confidence interval on the difference in mean score. Is there any evidence that mean score depends on birth order?
(c) Test an appropriate set of hypotheses indicating that the mean score does not depend on birth order.
2.34. An article in the Journal of Strain Analysis(vol. 18, no. 2, 1983) compares several procedures for predicting the shear strength for steel plate girders. Data for nine girders in the form of the ratio of predicted to observed load for two of these procedures, the Karlsruhe and Lehigh methods, are as follows:
Girder Karlsruhe Method Lehigh Method
S1/1 1.186 1.061
S2/1 1.151 0.992
S3/1 1.322 1.063
S4/1 1.339 1.062
S5/1 1.200 1.065
S2/1 1.402 1.178
S2/2 1.365 1.037
S2/3 1.537 1.086
S2/4 1.559 1.052
(a) Is there any evidence to support a claim that there is a difference in mean performance between the two methods? Use 0.05.
(b) What is the P-value for the test in part (a)?
(c) Construct a 95 percent confidence interval for the dif- ference in mean predicted to observed load.
(d) Investigate the normality assumption for both samples.
(e) Investigate the normality assumption for the difference in ratios for the two methods.
(f) Discuss the role of the normality assumption in the pairedt-test.
2.35. The deflection temperature under load for two differ- ent formulations of ABS plastic pipe is being studied. Two samples of 12 observations each are prepared using each for- mulation and the deflection temperatures (in °F) are reported below:
Formulation 1 Formulation 2
206 193 192 177 176 198
188 207 210 197 185 188
205 185 194 206 200 189
187 189 178 201 197 203
(a) Construct normal probability plots for both samples.
Do these plots support assumptions of normality and equal variance for both samples?
(b) Do the data support the claim that the mean deflection temperature under load for formulation 1 exceeds that of formulation 2? Use 0.05.
(c) What is the P-value for the test in part (a)?
2.36. Refer to the data in Problem 2.35. Do the data support a claim that the mean deflection temperature under load for formulation 1 exceeds that of formulation 2 by at least 3°F?
2.37. In semiconductor manufacturing wet chemical etching is often used to remove silicon from the backs of wafers prior to metalization. The etch rate is an important characteristic of this process. Two different etching solutions are being evaluated. Eight randomly selected wafers have been etched in each solution, and the observed etch rates (in mils/min) are as follows.
Solution 1 Solution 2
9.9 10.6 10.2 10.6
9.4 10.3 10.0 10.2
10.0 9.3 10.7 10.4
10.3 9.8 10.5 10.3
(a) Do the data indicate that the claim that both solutions have the same mean etch rate is valid? Use 0.05 and assume equal variances.
(b) Find a 95 percent confidence interval on the difference in mean etch rates.
(c) Use normal probability plots to investigate the adequa- cy of the assumptions of normality and equal variances.
2.38. Two popular pain medications are being compared on the basis of the speed of absorption by the body.
Specifically, tablet 1 is claimed to be absorbed twice as fast as tablet 2. Assume that and are known. Develop a test statistic for
2.39. Continuation of Problem 2.38. An article in Nature (1972, pp. 225–226) reported on the levels of monoamine oxi- dase in blood platelets for a sample of 43 schizophrenic
H1⬊21 Z 2
H0⬊212
22
21
patients resulting in 2.69 and s12.30 while for a sam- ple of 45 normal patients the results were 6.35 and s2 4.03. The units are nm/mg protein/h. Use the results of the previous problem to test the claim that the mean monoamine oxidase level for normal patients is at last twice the mean level for schizophrenic patients. Assume that the sample sizes are large enough to use the sample standard deviations as the true parameter values.
2.40. Suppose we are testing
where > are known. Our sampling resources are con- strained such that n1n2N. Show that an allocation of the observation n1n2to the two samp that lead the most powerful test is in the ratio n1/n2 1/2.
2.41. Continuation of Problem 2.40. Suppose that we want to construct a 95% two-sided confidence interval on the difference in two means where the two sample standard devi- ations are known to be 14 and 28. The total sample size is restricted to N30. What is the length of the 95% CI if the sample sizes used by the experimenter are n1n215?
How much shorter would the 95% CI have been if the exper- imenter had used an optimal sample size allocation?
2.42. Develop Equation 2.46 for a 100(1 ) percent con- fidence interval for the variance of a normal distribution.
2.43. Develop Equation 2.50 for a 100(1 ) percent con- fidence interval for the ratio , where and are the variances of two normal distributions.
2.44. Develop an equation for finding a 100 (1 ) percent confidence interval on the difference in the means of two nor- mal distributions where . Apply your equation to the Portland cement experiment data, and find a 95 percent confi- dence interval.
2.45. Construct a data set for which the paired t-test statis- tic is very large, but for which the usual two-sample or pooled t-test statistic is small. In general, describe how you created the data. Does this give you any insight regarding how the pairedt-test works?
2.46. Consider the experiment described in Problem 2.26.
If the mean burning times of the two flares differ by as much as 2 minutes, find the power of the test. What sample size would be required to detect an actual difference in mean burn- ing time of 1 minute with a power of at least 0.90?
2.47. Reconsider the bottle filling experiment described in Problem 2.24. Rework this problem assuming that the two population variances are unknown but equal.
2.48. Consider the data from Problem 2.24. If the mean fill volume of the two machines differ by as much as 0.25 ounces, what is the power of the test used in Problem 2.19? What sam- ple size would result in a power of at least 0.9 if the actual dif- ference in mean fill volume is 0.25 ounces?
21 Z22
22
21
21/22
22
21
H1⬊1 Z 2
H0⬊12
y2
y1
65
C H A P T E R 3
E x p e r i m e n t s w i t h a S i n g l e F a c t o r : T h e A n a l y s i s
o f V a r i a n c e
CHAPTER OUTLINE
3.1 AN EXAMPLE
3.2 THE ANALYSIS OF VARIANCE
3.3 ANALYSIS OF THE FIXED EFFECTS MODEL 3.3.1 Decomposition of the Total Sum of
Squares
3.3.2 Statistical Analysis
3.3.3 Estimation of the Model Parameters 3.3.4 Unbalanced Data
3.4 MODEL ADEQUACY CHECKING 3.4.1 The Normality Assumption
3.4.2 Plot of Residuals in Time Sequence 3.4.3 Plot of Residuals Versus Fitted Values 3.4.4 Plots of Residuals Versus Other Variables 3.5 PRACTICAL INTERPRETATION OF RESULTS
3.5.1 A Regression Model
3.5.2 Comparisons Among Treatment Means 3.5.3 Graphical Comparisons of Means 3.5.4 Contrasts
3.5.5 Orthogonal Contrasts
3.5.6 Scheffé’s Method for Comparing All Contrasts
3.5.7 Comparing Pairs of Treatment Means 3.5.8 Comparing Treatment Means with a
Control
3.6 SAMPLE COMPUTER OUTPUT 3.7 DETERMINING SAMPLE SIZE
3.7.1 Operating Characteristic Curves
3.7.2 Specifying a Standard Deviation Increase 3.7.3 Confidence Interval Estimation Method
3.8 OTHER EXAMPLES OF SINGLE-FACTOR EXPERIMENTS
3.8.1 Chocolate and Cardiovascular Health 3.8.2 A Real Economy Application of a
Designed Experiment 3.8.3 Analyzing Dispersion Effects 3.9 THE RANDOM EFFECTS MODEL
3.9.1 A Single Random Factor
3.9.2 Analysis of Variance for the Random Model 3.9.3 Estimating the Model Parameters
3.10 THE REGRESSION APPROACH TO THE ANALYSIS OF VARIANCE
3.10.1 Least Squares Estimation of the Model Parameters
3.10.2 The General Regression Significance Test 3.11 NONPARAMETRIC METHODS IN THE
ANALYSIS OF VARIANCE 3.11.1 The Kruskal–Wallis Test 3.11.2 General Comments on the Rank
Transformation
SUPPLEMENTAL MATERIAL FOR CHAPTER 3 S3.1 The Definition of Factor Effects
S3.2 Expected Mean Squares S3.3 Confidence Interval for 2
S3.4 Simultaneous Confidence Intervals on Treatment Means
S3.5 Regression Models for a Quantitative Factor S3.6 More about Estimable Functions
S3.7 Relationship Between Regression and Analysis of Variance
The supplemental material is on the textbook website www.wiley.com/college/montgomery.