Materi Kuliah rancangan Penelaitian S2 Teknik Lingkungan ITS

(1)

Basic Statistical

Concepts

Sutikno

Departemen Statistika

Fakultas Matematika dan Ilmu Pengetahuan Alam Institut Teknologi Sepuluh Nopember Surabaya

tikno@yahoo.com,

sutikno@statistika.its.ac.id

(2)

2

AGENDA

• _Simple

_comparative

_experiments

– _{The hypothesis testing framework} – _{The two-sample}_t_-test

– _{Checking assumptions, validity}

• _{Comparing more that two factor levels…}

the

analysis of variance

– _{ANOVA decomposition of total variability} – _{Statistical testing & analysis}

– _{Checking assumptions, model validity} – _{Post-ANOVA testing of means}

(3)

(4)

4

Graphical View of the Data

(5)

5

(6)

6

The Hypothesis Testing Framework

• _{Statistical hypothesis testing}

_{is a}

useful framework for many

experimental situations

• _{Origins of the methodology date from}

the early 1900s

• _{We will use a procedure known as the}

(7)

7

The Hypothesis Testing

Framework

• _{Sampling from a}

_normal

_distribution

• _{Statistical hypotheses:}

(8)

8

Estimation of Parameters

1

2 2 2

1

1 estimates the population mean

(9)

(10)

10

Use the sample means to draw inferences about the population means 16.76 17.04 0.28

Difference in sample means

Standard deviation of the difference in sample means

This suggests a statistic:

(11)

11

The previous ratio becomes

However, we have the case where

Pool the individual sample variances:

(12)

12

How the Two-Sample

t

-Test

Works:

• _{Values of}_t₀_{that are near zero are consistent}

with the null hypothesis

• _{Values of}_t₀_{that are very diferent from zero}

are consistent with the alternative hypothesis

• _t₀_{is a “distance” measure-how far apart the}

averages are expressed in standard deviation units

The test statistic is

(13)

The Two-Sample (Pooled)

t

-The two sample means are a little over two standard deviations apart Is t

(14)

14

The Two-Sample (Pooled)

t

-Test

• So far, we haven’t really done any “statistics”

• We need an objective

basis for deciding how large the test statistic

t0 really is

• In 1908, W. S. Gosset derived the reference distribution for t0 … called the t distribution

• Tables of the t

distribution - text, page 606

(15)

15

The Two-Sample (Pooled)

t

-Test

• _{A value of}_t₀

between –2.101 and 2.101 is

consistent with equality of means

• It is possible for the means to be equal and t0 to exceed either 2.101 or – 2.101, but it would be a “rare event” … leads to the

conclusion that the means are diferent

• Could also use the

P-value approach

(16)

16

The Two-Sample (Pooled)

t

-Test

• The P-value is the risk of wrongly rejecting the null hypothesis of equal means (it measures rareness of the event)

• The P-value in our problem is P = 0.042

(17)

(18)

18

Checking Assumptions –

(19)

19

Importance of the

t

-Test

• _{Provides an}

_objective

_{framework for}

simple comparative experiments

• _{Could be used to test all relevant}

hypotheses in a two-level factorial

design, because all of these

hypotheses involve the mean

(20)

20

Confdence Intervals

• _{Hypothesis testing gives an objective}

statement concerning the diference in

means, but it doesn’t specify “how

diferent” they are

• _General

_form

_{of a confdence interval}

• _{The 100(1- α)%}

_confdence

_interval

on the diference in two means:

(21)

21

What If There Are More Than Two Factor Levels?

• _The_t_{-test does not directly apply}

• _{There are lots of practical situations where there}

are either more than two levels of interest, or

there are several factors of simultaneous interest

• _The_{analysis of variance}_{(ANOVA) is the}

appropriate analysis “engine” for these types of experiments – Chapter 3, textbook

• _{The ANOVA was developed by Fisher in the early}

1920s, and initially applied to agricultural experiments

(22)

DOX 6E Montgomery 22

An Example

• An engineer is interested in investigating the

relationship between the RF power setting and the etch rate for this tool. The objective of an experiment like this is to model the relationship between etch rate and RF power, and to specify the power setting that will give a desired target etch rate.

• The response variable is etch rate.

• She is interested in a particular gas (C2F6) and gap (0.80 cm), and wants to test four levels of RF power:

160W, 180W, 200W, and 220W. She decided to test fve wafers at each level of RF power.

• The experimenter chooses 4 levels of RF power 160W, 180W, 200W, and 220W

(23)

23

An Example

• _Does_changing

the power

change the mean etch rate?

• _{Is there an}

(24)

24

The Analysis of Variance

• In general, there will be a levels of the factor, or a

treatments, and n replicates of the experiment, run in

random order…a completely randomized design (CRD)

• N = an total runs

• We consider the fxed efects case…the random efects

case will be discussed later

• Objective is to test hypotheses about the equality of the a

(25)

25

The Analysis of Variance

• _{The name “analysis of variance” stems}

from a

partitioning

of the total

variability in the response variable into

components that are consistent with a

model

for the experiment

• _{The basic single-factor ANOVA model is}

2

1, 2,...,

,

1, 2,...,

an overall mean,

treatment effect,

experimental error,

(0,

)

(26)

26

Models for the Data

There are several ways to write a

model for the data:

_{is called the effects model}

Let

, then

is called the means model

Regression models can also be employed

(27)

27

The Analysis of Variance

• _{Total variability}

_{is measured by the}

total sum of squares:

• _{The basic ANOVA partitioning is:}

2

T Treatments E

(28)

28

The Analysis of Variance

• _{A large value of}_SS_Treatments_{refects large}

diferences in treatment means

• _{A small value of}_SS_Treatments_{likely indicates no}

diferences in treatment means

• _{Formal statistical hypotheses are:} T Treatments E

(29)

29

The Analysis of Variance

• _{While sums of squares cannot be directly}

compared to test the hypothesis of equal

means, mean squares can be compared.

• _{A mean square is a sum of squares divided by}

its degrees of freedom:

• _{If the treatment means are equal, the}

treatment and error mean squares will be (theoretically) equal.

• _{If treatment means difer, the treatment mean}

square will be larger than the error mean square.

1

1 (

1)

,

1 (

1)

Total Treatments Error

(30)

30

The Analysis of Variance is

Summarized in a Table

• Computing…see text, pp 66-70

• The reference distribution for F0 is the Fa-1, a(n-1) distribution • Reject the null hypothesis (equal treatment means) if

0 ,a 1, (a n 1)

(31)

31

(32)

32

(33)

33

ANOVA calculations are usually done via computer

• _{Text exhibits sample calculations}

from two very popular software

packages, Design-Expert and Minitab

• _{See page 99 for Design-Expert, page}

100 for Minitab

• _{Text discusses some of the summary}

(34)

34

Model Adequacy Checking in the ANOVA

• _{Checking assumptions}_{is important}

• _Normality

• _{Constant variance}

• _Independence

• _{Have we ft the right model?}

• _{Later we will talk about what to do if some of}

(35)

35

Model Adequacy Checking in the

ANOVA

• _{Examination of}

residuals (see text, Sec. 3-4, pg. 75)

• _{Design-Expert}

generates the residuals

• _{Residual plots}_are

very useful

• _{Normal probability}

plot of residuals

(36)

36

(37)

37

Post-ANOVA Comparison of Means

• The analysis of variance tests the hypothesis of equal treatment means

• Assume that residual analysis is satisfactory

• If that hypothesis is rejected, we don’t know

which specifc means are diferent

• _{Determining which specifc means difer following}

an ANOVA is called the multiple comparisons problem

• _{There are}_lots_{of ways to do this…see text,}

Section 3-5, pg. 87

• _{We will use pairwise}_t_{-tests on means…}

(38)

38

(39)

39

(40)

40

(41)

Why Does the ANOVA Work?

2 2

1 0 ( 1)

2 2

0

We are sampling from normal populations, so

if

is true, and

Cochran's theorem gives the independence of

these two chi-square random variables

/(

Therefore an upper-tail test is appropriate.

(42)

42

Sample Size Determination

• FAQ in designed experiments

• Answer depends on lots of things; including what type of experiment is being

contemplated, how it will be conducted, resources, and desired sensitivity

• _{Sensitivity refers to the}_{diference in means}

that the experimenter wishes to detect

• _Generally,_increasing_{the number of}

(43)

Sample Size Determination

Fixed Efects Case

• _{Can choose the sample size to detect a}

specifc diference in means and achieve

desired values of

type I and type II

• _{Operating characteristic curves}

_plot

(44)

Sample Size Determination

Fixed Efects Case---use of OC

Curves

• _The_OC _curves_{for the fxed efects model are}

in the A very common way to use these charts is to defne a diference in two means D of

interest, then the minimum value of is

• _{Typically work in term of the ratio of and}

try values of n until the desired power is achieved

• _{Minitab will perform power and sample size}

calculations

• _{There are some other methods discussed in the}

(45)

45