Chapter 7

(1)

HYPOTHESIS TESTING

(2)

Learning Outcomes

› After studying this chapter, the student will:

1. understand how to correctly state a null and alternative hypothesis and carry out a structured hypothesis test.

2. understand the concepts of type I error and type II error of a test.

3. understand how to calculate and interpret p-values.

(3)

7.1. INTRODUCTION

(4)

Basic Concepts

› A hypothesis may be defined simply as a statement about one or more populations.

› Types of hypotheses:

– The research hypothesis is the conjecture or supposition that motivates the research.

– Statistical hypotheses are hypotheses that are stated in such a way that they may be evaluated by appropriate statistical techniques.

(5)

Basic Concepts (cont.)

› Types of statistical hypotheses:

– The null hypothesis 𝐻₀ is a statement of agreement with conditions presumed to be true in the population of interest. The null hypothesis is the hypothesis that is being tested.

– The alternative hypothesis 𝐻₁ is a statement of what we will believe is true if our sample data cause us to reject the null hypothesis.

– The null and alternative hypotheses are complementary. That is, the two together exhaust all possibilities regarding the value that the hypothesized parameter can assume.

(6)

Statistical Hypotheses for the Population Mean

Two-tailed test: $H_0: \mu = \mu_0$ vs. $H_1: \mu \neq \mu_0$

Right-tailed test: $H_0: \mu \leq \mu_0$ vs. $H_1: \mu > \mu_0$

Left-tailed test: $H_0: \mu \geq \mu_0$ vs. $H_1: \mu < \mu_0$

(7)

Statistical Hypotheses for the Population Proportion

Two-tailed test: $H_0: p = p_0$ vs. $H_1: p \neq p_0$

Right-tailed test: $H_0: p \leq p_0$ vs. $H_1: p > p_0$

Left-tailed test: $H_0: p \geq p_0$ vs. $H_1: p < p_0$

(8)

Test Statistics

› In the previous tables, 𝝁_𝟎 and 𝒑_𝟎 are called the hypothesized population mean and proportion, respectively.

› The test statistic is a statistic computed from the sample data that serves as the basis for the decision to reject or not reject the null hypothesis.

› The following is a general formula for a test statistic that will be applicable in many of the hypothesis tests:

$$\text{test statistic} = \frac{\text{relevant statistic} - \text{hypothesized parameter}}{\text{standard error of the relevant statistic}}$$
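The general formula can be sketched in a few lines of Python. This is only an illustration: the function name `general_test_statistic` and the input values are hypothetical and do not come from the examples in this chapter.

```python
def general_test_statistic(statistic, hypothesized, standard_error):
    """Return (relevant statistic - hypothesized parameter) / standard error."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return (statistic - hypothesized) / standard_error


# Hypothetical inputs, chosen only to show the arithmetic.
value = general_test_statistic(statistic=52.0, hypothesized=50.0, standard_error=0.8)
print(round(value, 2))  # (52 - 50) / 0.8 = 2.5
```

The Z-tests in the following sections are special cases in which the relevant statistic is a sample mean or a sample proportion.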

(9)

Significance Level, Types of Errors, and the p-value

› The level of significance 𝜶 is the probability of rejecting a true null hypothesis.

› Type I error is the error committed when a true null hypothesis is rejected.

› Type II error is the error committed when a false null hypothesis is not rejected.
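One way to see what the significance level means is a small simulation, sketched below under stated assumptions. Samples are drawn from a normal population whose mean really equals the hypothesized value, so every rejection is a type I error. The population values (mean 30, variance 20), the sample size of 25, the seed, and the function name are illustrative assumptions; the two-tailed Z-test used here is the one described in Section 7.2.

```python
import random
from math import sqrt
from statistics import NormalDist


def type_i_error_rate(mu0=30.0, sigma=sqrt(20), n=25, alpha=0.05,
                      trials=2000, seed=1):
    """Estimate how often a true null hypothesis is rejected."""
    rng = random.Random(seed)
    std_normal = NormalDist()  # standard normal distribution
    rejections = 0
    for _ in range(trials):
        # The null hypothesis is true: the population mean equals mu0.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu0) / (sigma / sqrt(n))
        p_value = 2 * (1 - std_normal.cdf(abs(z)))  # two-tailed p-value
        if p_value <= alpha:
            rejections += 1  # a type I error
    return rejections / trials


rate = type_i_error_rate()
print(rate)  # should be close to alpha = 0.05
```

With a few thousand trials, the estimated rate should settle near the chosen level of significance.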


(11)

Significance Level, Types of Errors, and the p-value (cont.)

› The p-value is the smallest value of $\alpha$ for which we can reject the null hypothesis.

› If the p-value is more than 𝜶 (i.e., p-value > 𝜶), the decision is to not reject the null hypothesis.

› If the p-value is less than or equal to $\alpha$ (i.e., p-value $\leq \alpha$), the decision is to reject the null hypothesis.
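The decision rule above can be written as a small helper. The function name `decide` and the p-values passed to it are hypothetical; only the rule itself comes from the text.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 when the p-value is at most alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    return "reject H0" if p_value <= alpha else "do not reject H0"


# Hypothetical p-values, for illustration only.
print(decide(0.03))  # reject H0
print(decide(0.05))  # reject H0 (the boundary case counts as rejection)
print(decide(0.12))  # do not reject H0
```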

(12)

7.2 HYPOTHESIS TESTING: A SINGLE POPULATION MEAN

(13)

Z–Test for the Population Mean

› The Z-test is a statistical test for the mean of a population. It can be used when the population is normally distributed and the population standard deviation (i.e., $\sigma$) is known. The formula for the Z-test is

$$Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

where $\bar{x}$ is the sample mean, $n$ is the sample size, and $\mu_0$ is the hypothesized population mean.

(14)

Z–Test for the Population Mean (cont.)

› If the sample size $n$ is large (i.e., $n \geq 30$), then according to the central limit theorem, the Z-test can be used when the population is normally distributed and the population standard deviation is not known. The formula for the Z-test is

$$Z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

where $\bar{x}$ is the sample mean, $s$ is the sample standard deviation, $n$ is the sample size, and $\mu_0$ is the hypothesized population mean. See Example 7.2.4.

(15)

MegaStat Application

(16)

Example 7.2.1.

› Researchers are interested in the mean age of a certain population. Let us say that they are asking the following question: Can we conclude that the mean age of this population is different from 30 years? Suppose that the sample size is 10, the sample mean is $\bar{x} = 27$, and the population variance is $\sigma^2 = 20$.


(18)

Z–Test for the Population Mean (MegaStat Application)

1. In Data Ribbon, click on MegaStat icon, then select Hypothesis Tests.

2. Select Mean vs. Hypothesized Value.

(19)

Write the following in Excel:

• (A1) =Age (variable label OR you can write anything here!)

• (A2) =27 (sample mean)

• (A3) =SQRT(20) (population standard deviation or sample standard deviation)

• (A4) =10 (sample size)

DO NOT CHANGE THIS ORDER!

(20)

3. Choose summary input.

4. Input range of information.

5. Insert the value of the hypothesized mean.

6. Choose “z-test”.

7. Press OK.

(21)

Output: the value of the test statistic and the p-value, which is less than $\alpha = 0.05$.

(22)

Example 7.2.1. (cont.)

› Since the p-value is less than $\alpha = 0.05$, the null hypothesis is rejected at the 0.05 level of significance. In other words, the results were significant at the 0.05 level.

› Hence, one can conclude that the population mean is not equal to 30.
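As a cross-check on the MegaStat output, the same two-tailed Z-test can be reproduced with the Python standard library. This is a minimal sketch, not the MegaStat procedure: the function name `z_test_mean` and its `tail` argument are assumptions made for this illustration, while the numbers are those of Example 7.2.1.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(sample_mean, mu0, sigma, n, tail="two-sided"):
    """Z-test for a population mean with known standard deviation."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf(z)  # P(Z <= z) under the standard normal
    if tail == "two-sided":
        p_value = 2 * min(cdf, 1 - cdf)
    elif tail == "less":
        p_value = cdf
    elif tail == "greater":
        p_value = 1 - cdf
    else:
        raise ValueError("tail must be 'two-sided', 'less', or 'greater'")
    return z, p_value


# Example 7.2.1: n = 10, sample mean 27, population variance 20, H0: mu = 30.
z, p_value = z_test_mean(sample_mean=27, mu0=30, sigma=sqrt(20), n=10)
print(f"Z = {z:.2f}, p-value = {p_value:.4f}")  # roughly Z = -2.12, p = 0.034
```

The p-value of about 0.034 is below 0.05, which agrees with the decision to reject the null hypothesis.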

(23)

Example 7.2.2.

Can we conclude that the mean age of this population is less than 30 years? Suppose that the sample size is 10, the sample mean is $\bar{x} = 27$, and the population variance is $\sigma^2 = 20$.


(25)

3. Choose summary input.

4. Input range of information.

5. Insert the value of the hypothesized mean.

6. Choose “z-test”.

7. Choose “less than”, then press OK.

(26)

Output: the value of the test statistic and the p-value, which is less than $\alpha = 0.05$.

(27)

Example 7.2.2. (cont.)

› Since the p-value is less than $\alpha = 0.05$, the null hypothesis is rejected at the 0.05 level of significance.

› Hence, one can conclude that the population mean is less than 30.
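For the left-tailed version, the test statistic is unchanged and only the p-value differs: it is the area under the standard normal curve to the left of Z. The short sketch below uses the numbers of Example 7.2.2; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Example 7.2.2: H0: mu >= 30 vs. H1: mu < 30.
n = 10
sample_mean = 27
mu0 = 30
sigma = sqrt(20)  # population standard deviation (variance 20)

z_left = (sample_mean - mu0) / (sigma / sqrt(n))
p_left = NormalDist().cdf(z_left)  # left-tail area P(Z <= z)

print(f"Z = {z_left:.2f}, p-value = {p_left:.4f}")  # roughly Z = -2.12, p = 0.017
```

The one-tailed p-value is half of the two-tailed p-value from Example 7.2.1, so the null hypothesis is again rejected at the 0.05 level.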

(28)

7.5 HYPOTHESIS TESTING: A SINGLE POPULATION PROPORTION

(29)

Z–Test for the Population Proportion

› If the sample size $n$ is large (i.e., $n \geq 30$), then according to the central limit theorem, the Z-test can be used to test statistical hypotheses about the population proportion. The formula for the Z-test is

$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$$

where $\hat{p}$ is the sample proportion, and $p_0$ is the hypothesized population proportion.

(30)

MegaStat Application

(31)

Example 7.5.1

› Data on 301 Hispanic women were collected. One variable of interest was the percentage of subjects with impaired fasting glucose (IFG). In the study, 24 women were classified in the IFG stage. Is there sufficient evidence to indicate that the population of Hispanic women has a prevalence of IFG higher than 6.3%?


(33)

Example 7.5.1

The statistical hypotheses are:

$H_0: p \leq 0.063$ vs. $H_1: p > 0.063$

The information from the sample is as follows:

$n = 301$, $\hat{p} = 24/301 \approx 0.080$ and $p_0 = 0.063$.

(34)

Z–Test for the Population Proportion (MegaStat Application)

1. In Data Ribbon, click on MegaStat icon, then select Hypothesis Tests.

2. Select Proportion vs.

Hypothesized Value.

(35)

3. Insert the value of the sample proportion.

4. Insert the value of the hypothesized population proportion.

5. Insert the sample size.

6. Choose “greater than”, then press OK.

(36)

Output: the value of the test statistic and the p-value, which is greater than $\alpha = 0.05$.

(37)

Example 7.5.1


› Since the p-value > $\alpha$, there is not enough evidence to reject $H_0$. Thus, we cannot conclude that the proportion of the population with IFG is higher than 6.3 percent.
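The same conclusion can be checked outside MegaStat with a short Python sketch of the right-tailed Z-test for a proportion. The function name `z_test_proportion` is an assumption made for this illustration, and the sketch uses the unrounded sample proportion 24/301, so its output may differ slightly from software that is given the rounded value 0.080.

```python
from math import sqrt
from statistics import NormalDist


def z_test_proportion(successes, n, p0):
    """Right-tailed Z-test for a proportion (H1: p > p0)."""
    p_hat = successes / n                  # sample proportion
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 1 - NormalDist().cdf(z)      # right-tail area P(Z >= z)
    return p_hat, z, p_value


# Example 7.5.1: 24 of 301 women with IFG, hypothesized prevalence 6.3%.
p_hat, z_prop, p_prop = z_test_proportion(successes=24, n=301, p0=0.063)
print(f"p_hat = {p_hat:.3f}, Z = {z_prop:.2f}, p-value = {p_prop:.3f}")
# roughly p_hat = 0.080, Z = 1.19, p-value = 0.116
```

A p-value of about 0.12 exceeds 0.05, which matches the decision not to reject the null hypothesis.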