Chapter 7
HYPOTHESIS TESTING
© FAROUQ MOHAMMAD A. ALAM 1
Learning Outcomes
› After studying this chapter, the student will:
1. understand how to correctly state a null and alternative hypothesis and carry out a structured hypothesis test.
2. understand the concepts of type I error, and type II error of a test.
3. understand how to calculate and interpret p-values.
7.1. INTRODUCTION
Basic Concepts
› A hypothesis may be defined simply as a statement about one or more populations.
› Types of hypotheses:
– The research hypothesis is the conjecture or supposition that motivates the research.
– Statistical hypotheses are hypotheses that are stated in such a way that they may be evaluated by appropriate statistical techniques.
Basic Concepts (cont.)
› Types of statistical hypotheses:
– The null hypothesis 𝐻0 is a statement of agreement with conditions presumed to be true in the population of interest. The null hypothesis is the hypothesis that is being tested.
– The alternative hypothesis 𝐻1 (also written 𝐻𝐴) is a statement of what we will believe is true if our sample data cause us to reject the null hypothesis.
– The null and alternative hypotheses are complementary. That is, the two together exhaust all possibilities regarding the value that the hypothesized parameter can assume.
Statistical Hypotheses for the Population Mean
                         Two-tailed test        Right-tailed test      Left-tailed test
Null hypothesis          $H_0: \mu = \mu_0$     $H_0: \mu \le \mu_0$   $H_0: \mu \ge \mu_0$
Alternative hypothesis   $H_1: \mu \ne \mu_0$   $H_1: \mu > \mu_0$     $H_1: \mu < \mu_0$
Statistical Hypotheses for the Population Proportion
                         Two-tailed test    Right-tailed test    Left-tailed test
Null hypothesis          $H_0: p = p_0$     $H_0: p \le p_0$     $H_0: p \ge p_0$
Alternative hypothesis   $H_1: p \ne p_0$   $H_1: p > p_0$       $H_1: p < p_0$
Test Statistics
› In the previous tables, 𝝁𝟎 and 𝒑𝟎 are called the hypothesized population mean and proportion, respectively.
› The test statistic is a statistic computed from the sample data that serves as the basis for deciding whether to reject the null hypothesis.
› The following is a general formula for a test statistic that will be applicable in many of the hypothesis tests:
$$\text{test statistic} = \frac{\text{relevant statistic} - \text{hypothesized parameter}}{\text{standard error of the relevant statistic}}$$
Significance Level, Types of Errors, and the p-value
› The level of significance 𝜶 is the probability of rejecting a true null hypothesis.
› Type I error is the error committed when a true null hypothesis is rejected.
› Type II error is the error committed when a false null hypothesis is not rejected.
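As a rough illustration of what 𝜶 means, the sketch below simulates many samples from a normal population in which the null hypothesis is true and counts how often a two-tailed Z-test at 𝜶 = 0.05 rejects it. The function name type_i_error_rate and every numeric setting (a hypothesized mean of 30, variance 20 and sample size 10, borrowed from the setting of Example 7.2.1, plus the number of trials and the seed) are assumptions chosen for illustration. The Z statistic is the one defined in Section 7.2.

```python
import math
import random
from statistics import NormalDist


def type_i_error_rate(mu0=30.0, sigma=math.sqrt(20), n=10,
                      alpha=0.05, trials=20000, seed=0):
    """Fraction of simulated samples for which a TRUE null hypothesis is rejected."""
    rng = random.Random(seed)                      # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value (about 1.96)
    se = sigma / math.sqrt(n)                      # standard error of the sample mean
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population whose mean really is mu0 (H0 is true).
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / se
        if abs(z) > z_crit:                        # rejecting here is a Type I error
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

The observed fraction should land close to 0.05, the Type I error probability that the level of significance fixes in advance.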
Significance Level, Types of Errors, and the p-value (cont.)
› The p-value is the smallest level of significance at which the null hypothesis can be rejected, given the observed sample.
› If the p-value is greater than 𝜶 (i.e., p-value > 𝜶), the decision is to not reject the null hypothesis.
› If the p-value is less than or equal to 𝜶 (i.e., p-value ≤ 𝜶), the decision is to reject the null hypothesis.
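The decision rule above can be sketched in a few lines, assuming the test statistic is a Z value that follows the standard normal distribution under the null hypothesis. The helper names p_value_from_z and decide, and the tail labels "two", "right" and "left", are hypothetical choices for this illustration and are not part of MegaStat.

```python
from statistics import NormalDist


def p_value_from_z(z, tail):
    """p-value of a Z statistic; tail is 'two', 'right' or 'left' (the form of H1)."""
    phi = NormalDist().cdf(z)           # P(Z <= z) under the standard normal
    if tail == "left":                  # H1: parameter < hypothesized value
        return phi
    if tail == "right":                 # H1: parameter > hypothesized value
        return 1 - phi
    if tail == "two":                   # H1: parameter != hypothesized value
        return 2 * min(phi, 1 - phi)
    raise ValueError("tail must be 'two', 'right' or 'left'")


def decide(p_value, alpha=0.05):
    """Reject H0 when p-value <= alpha, otherwise do not reject."""
    return "reject H0" if p_value <= alpha else "do not reject H0"


print(decide(p_value_from_z(-2.5, "two")))
print(decide(p_value_from_z(1.0, "right")))
```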
7.2 HYPOTHESIS TESTING: A SINGLE POPULATION MEAN
Z–Test for the Population Mean
› The Z-test is a statistical test for the mean of a population. It can be used when the population is normally distributed and the population standard deviation (i.e., 𝜎) is known. The formula for the Z-test is
$$Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
where $\bar{x}$ is the sample mean, $n$ is the sample size, and $\mu_0$ is the hypothesized population mean.
Z–Test for the Population Mean (cont.)
› If the sample size 𝒏 is large (i.e., 𝒏 ≥ 𝟑𝟎), then according to the central limit theorem, the Z-test can be used even when the population is not normally distributed and the population standard deviation is not known. In that case the sample standard deviation replaces 𝜎, and the formula for the Z-test is
$$Z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
where $\bar{x}$ is the sample mean, $s$ is the sample standard deviation, $n$ is the sample size, and $\mu_0$ is the hypothesized population mean. See Example 7.2.4.
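When only raw observations are available, the large-sample statistic can be computed directly from them. The following is a minimal sketch under stated assumptions: the function name large_sample_z is hypothetical, the data are made-up integers used purely to exercise the arithmetic (they are not the data of Example 7.2.4), and the check for 𝒏 ≥ 𝟑𝟎 simply mirrors the rule of thumb above.

```python
import math
from statistics import mean, stdev


def large_sample_z(data, mu0):
    """Z statistic for H0: mu = mu0 using the sample standard deviation s."""
    n = len(data)
    if n < 30:
        raise ValueError("the large-sample Z-test assumes n >= 30")
    s = stdev(data)                  # sample standard deviation (divisor n - 1)
    se = s / math.sqrt(n)            # estimated standard error of the sample mean
    return (mean(data) - mu0) / se


# Made-up data: the integers 1..35 (illustration only).
data = list(range(1, 36))
z = large_sample_z(data, mu0=20)
print(f"n = {len(data)}, sample mean = {mean(data)}, z = {z:.3f}")
```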
MegaStat Application
Example 7.2.1.
› Researchers are interested in the mean age of a certain population. Let us say that they are asking the following question: Can we conclude that the mean age of this population is different from 30 years? Suppose that the sample size is 10, the sample mean is $\bar{x} = 27$, and the population variance is $\sigma^2 = 20$.
Z–Test for the Population Mean (MegaStat Application)
1. In the Data ribbon, click on the MegaStat icon, then select Hypothesis Tests.
2. Select Mean vs. Hypothesized Value.
Write the following in Excel:
• (A1) =Age (variable label OR you can write anything here!)
• (A2) =27 (sample mean)
• (A3) =SQRT(20) (population standard deviation or sample standard deviation)
• (A4) =10 (sample size)
DO NOT CHANGE THIS ORDER!
3. Choose summary input.
4. Input the range of information.
5. Insert the value of the hypothesized mean.
6. Choose “z-test”.
7. Press OK.
[MegaStat output: the value of the test statistic and the p-value, which is less than 𝜶 = 0.05.]
Example 7.2.1. (cont.)
› Since the p-value is less than 𝜶 = 0.05, the null hypothesis is rejected at the 0.05 level of significance. In other words, the results are significant at the 0.05 level.
› Hence, one can conclude that the population mean is not equal to 30.
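The MegaStat result for this example can be cross-checked by hand. The sketch below assumes the two-tailed hypotheses H0: μ = 30 vs. H1: μ ≠ 30 implied by the question, uses the Z-test formula for a known population variance, and takes 𝜶 = 0.05. The variable names are illustrative.

```python
import math
from statistics import NormalDist

# Summary information from Example 7.2.1.
x_bar, mu0, sigma2, n, alpha = 27, 30, 20, 10, 0.05

se = math.sqrt(sigma2 / n)                 # sigma / sqrt(n)
z = (x_bar - mu0) / se                     # test statistic
p_value = 2 * NormalDist().cdf(-abs(z))    # two-tailed p-value
reject = p_value <= alpha

print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

The statistic comes out near -2.12 with a p-value of about 0.034, in line with the decision to reject the null hypothesis.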
Example 7.2.2.
› Researchers are interested in the mean age of a certain population. Let us say that they are asking the following question: Can we conclude that the mean age of this population is less than 30 years? Suppose that the sample size is 10, the sample mean is $\bar{x} = 27$, and the population variance is $\sigma^2 = 20$.
3. Choose summary input.
4. Input the range of information.
5. Insert the value of the hypothesized mean.
6. Choose “z-test”.
7. Choose “less than”, then press OK.
[MegaStat output: the value of the test statistic and the p-value, which is less than 𝜶 = 0.05.]
Example 7.2.2. (cont.)
› Since the p-value is less than 𝜶 = 0.05, the null hypothesis is rejected at the 0.05 level of significance.
› Hence, one can conclude that the population mean is less than 30.
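The arithmetic behind this left-tailed decision can be written out by hand, assuming the hypotheses $H_0: \mu \ge 30$ vs. $H_1: \mu < 30$ and the Z-test formula for a known population variance. The probability is read from a standard normal table, so its last digit is approximate.

```latex
\[
Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
  = \frac{27 - 30}{\sqrt{20} / \sqrt{10}}
  = \frac{-3}{\sqrt{2}} \approx -2.12,
\qquad
p\text{-value} = P(Z \le -2.12) \approx 0.017 < 0.05 .
\]
```

Because only the lower tail counts here, the p-value is half of the two-tailed p-value obtained from the same data.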
7.5 HYPOTHESIS TESTING: A SINGLE POPULATION PROPORTION
Z–Test for the Population Proportion
› If the sample size 𝒏 is large (i.e., 𝒏 ≥ 𝟑𝟎), then according to the central limit theorem, the Z-test can be used to test statistical hypotheses about the population proportion. The formula for the Z-test is
$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$$
where $\hat{p}$ is the sample proportion, $n$ is the sample size, and $p_0$ is the hypothesized population proportion.
MegaStat Application
Example 7.5.1
› Data on 301 Hispanic women were collected. One variable of interest was the percentage of subjects with impaired fasting glucose (IFG). In the study, 24 women were classified in the IFG stage. Is there sufficient evidence to indicate that the population of Hispanic women has a prevalence of IFG higher than 6.3%?
Example 7.5.1 (cont.)
The statistical hypotheses are:
$H_0: p \le 0.063$ vs. $H_1: p > 0.063$
The information from the sample is as follows:
$n = 301$, $\hat{p} = 24/301 \approx 0.080$, and $p_0 = 0.063$.
Z–Test for the Population Proportion (MegaStat Application)
1. In the Data ribbon, click on the MegaStat icon, then select Hypothesis Tests.
2. Select Proportion vs. Hypothesized Value.
3. Insert the value of the sample proportion.
4. Insert the value of the hypothesized population proportion.
5. Insert the sample size.
6. Choose “greater than”, then press OK.
[MegaStat output: the value of the test statistic and the p-value, which is greater than 𝜶 = 0.05.]
Example 7.5.1 (cont.)
› Since the p-value > 𝜶, there is not enough evidence to reject 𝑯𝟎. Thus, we cannot conclude that the proportion of the population with IFG is higher than 6.3 percent.
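This conclusion can be checked with a short calculation. The sketch below assumes a right-tailed test, since the question asks whether the prevalence is higher than 6.3% (the “greater than” option in the MegaStat steps), and takes 𝜶 = 0.05. It uses the unrounded sample proportion 24/301; rounding it to 0.080 first gives a slightly larger statistic, about 1.21. The variable names are illustrative.

```python
import math
from statistics import NormalDist

# Summary information from Example 7.5.1.
n, successes, p0, alpha = 301, 24, 0.063, 0.05

p_hat = successes / n                      # sample proportion
se = math.sqrt(p0 * (1 - p0) / n)          # standard error under H0
z = (p_hat - p0) / se                      # test statistic
p_value = 1 - NormalDist().cdf(z)          # right-tailed p-value
reject = p_value <= alpha

print(f"p_hat = {p_hat:.4f}, z = {z:.2f}, p-value = {p_value:.3f}, reject H0: {reject}")
```

The statistic is roughly 1.19 and the p-value roughly 0.12, which is greater than 0.05, so the null hypothesis is not rejected.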