THE MULTINOMIAL DISTRIBUTION AND ELEMENTARY TESTS FOR CATEGORICAL DATA

(1)


It is useful to have a probability model for the number of observations falling into each of k mutually exclusive classes. Such a model is given by the multinomial random variable, for which it is assumed that:

1. A total of n independent trials are made.

2. At each trial an observation will fall into exactly one of k mutually exclusive classes.

3. The probabilities of falling into the k classes are p1, p2, …, pk, where pi is the probability of falling into class i, i = 1, 2, …, k.

(2)

If k =2 , we have the Binomial distribution.

Let us define:

X1 to be the number of type 1 outcomes in the n trials,

X2 to be the number of type 2 outcomes,

⋮

Xk to be the number of type k outcomes.

(3)

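The joint probability function for these random variables can be shown to be:

$$P(X_1 = x_1, X_2 = x_2, \ldots, X_k = x_k) = \frac{n!}{x_1!\, x_2! \cdots x_k!}\; p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$$

where $\sum_{i=1}^{k} x_i = n$ and $\sum_{i=1}^{k} p_i = 1$.

For k = 2, the probability function reduces to

$$P(X_1 = x_1) = \frac{n!}{x_1!\,(n - x_1)!}\; p_1^{x_1} (1 - p_1)^{n - x_1}$$

which is the Binomial probability of $x_1$ successes in n trials, each with probability of success $p_1$.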

(4)

EXAMPLE

A simple example of multinomial trials is the tossing of a die n times. At each trial the outcome is one of the values 1, 2, 3, 4, 5 or 6. Here k = 6. If n = 10 and the die is fair, the probability of 2 ones, 2 twos, 2 threes, no fours, 2 fives and 2 sixes is:

$$\frac{10!}{2!\,2!\,2!\,0!\,2!\,2!} \left(\frac{1}{6}\right)^{10} \approx 0.0019$$
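The die calculation can be checked numerically. The sketch below is only an illustration: the function name `multinomial_pmf` is a hypothetical helper, and a fair die (each face with probability 1/6) is assumed.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """P(X1 = x1, ..., Xk = xk) for n = sum(counts) independent trials."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)
    # Multinomial coefficient n! / (x1! x2! ... xk!), kept as an exact integer.
    coef = factorial(n)
    for x in counts:
        coef //= factorial(x)
    return coef * prod(p ** x for p, x in zip(probs, counts))


# Fair die assumed: 2 ones, 2 twos, 2 threes, no fours, 2 fives, 2 sixes.
die_prob = multinomial_pmf([2, 2, 2, 0, 2, 2], [1 / 6] * 6)
print(round(die_prob, 4))
```

With k = 2 the same function returns the Binomial probability, for example `multinomial_pmf([3, 2], [0.5, 0.5])`.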

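To test hypotheses concerning the $p_i$, the null hypothesis for this example, that the die is fair, is $H_0: p_1 = p_2 = \cdots = p_6 = 1/6$. For large n, the test rests on the approximation

$$\sum_{i=1}^{k} \frac{(x_i - n p_i)^2}{n p_i} \approx \chi^2(k-1)$$

(5)

The left-hand side can be thought of as the sum of the terms:

$$\frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

which will be used in testing

$$H_0: p_i = p_{i0}, \quad i = 1, 2, \ldots, k$$

versus

$$H_1: p_i \neq p_{i0} \text{ for at least one } i$$

where the $p_{i0}$ are the hypothesized values of the $p_i$.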

(6)

In the special case of k = 2, there are two possible outcomes at each trial, which can be called success and failure.

A test of $H_0: p = p_0$ is a test of the same null hypothesis ($H_0: p_1 = p_0,\ p_2 = 1 - p_0$). The following are the observed and expected values for this situation:

            Success     Failure        Total
Observed    x           n - x          n
Expected    n p0        n(1 - p0)      n

(7)

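For an α-level test, a rejection region for testing $H_0: p = p_0$ versus $H_1: p \neq p_0$ is given by $\chi^2 \geq \chi^2_{\alpha}(1)$. We know that

$$\chi^2 = \frac{(x - n p_0)^2}{n p_0} + \frac{[(n - x) - n(1 - p_0)]^2}{n(1 - p_0)} = \frac{(x - n p_0)^2}{n p_0 (1 - p_0)}$$

Hence,

$$\chi^2 = z^2, \quad \text{where } z = \frac{x - n p_0}{\sqrt{n p_0 (1 - p_0)}}$$

By definition, $P(\chi^2(1) \geq \chi^2_{\alpha}(1)) = \alpha$ and $P(|Z| \geq z_{\alpha/2}) = \alpha$.

We have $\chi^2 = z^2$, and using $\chi^2_{\alpha}(1) = z_{\alpha/2}^2$, $\chi^2 \geq \chi^2_{\alpha}(1)$ if and only if $|z| \geq z_{\alpha/2}$.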

(8)

GOODNESS-OF-FIT TESTS

Thus far all our statistical inferences have involved population parameters such as means, variances and proportions. Now we make inferences about the entire population distribution. A sample is taken, and we want to test a null hypothesis of the general form:

H0 : sample is from a specified distribution

The alternative hypothesis is always of the form

H1 : sample is not from the specified distribution

A test of H0 versus H1 is called a goodness-of-fit test.

Two tests are used to evaluate goodness of fit:

1. The χ² test, which is based on an approximate χ² statistic.

2. The Kolmogorov-Smirnov (K-S) test. This is called a nonparametric test, because it uses a test statistic that makes no assumptions about the distribution.

(9)

Goodness of Fit

A goodness-of-fit test attempts to determine if a conspicuous discrepancy exists between the observed cell frequencies and those expected under H₀.

A useful measure for the overall discrepancy is given by:

$$\chi^2 = \sum_{\text{cells}} \frac{(O - E)^2}{E}$$

where O and E symbolize an observed frequency and the corresponding expected frequency.

The discrepancy in each cell is measured by the squared difference between the observed and the expected frequencies divided by the expected frequency.

(10)

The statistic was originally proposed by Karl Pearson (1857-1936), who found the distribution for large n to be approximately a χ² distribution with degrees of freedom = k - 1.

Due to this distribution, the statistic is denoted by χ² and is called Pearson's χ² statistic for goodness of fit.

Null hypothesis : H0 : pi = pi0 ; i = 1, 2, …, k

H1 : at least one pi is not equal to its specified value.

Test statistic :

$$\chi^2 = \sum_{i=1}^{k} \frac{(n_i - n p_{i0})^2}{n p_{i0}}$$

where $n_i$ is the observed frequency of cell i. Reject H0 at level α if $\chi^2 \geq \chi^2_{\alpha}(k-1)$.
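As a rough illustration of the goodness-of-fit statistic, the sum of (observed - expected)²/expected over the k cells, the sketch below uses invented counts for 60 tosses of a die. The data, the function name `pearson_chi_square` and the constant name are assumptions made for the example; the critical value 11.070 is the tabulated upper 5% point of χ² with 5 degrees of freedom.

```python
def pearson_chi_square(observed, probs0):
    """Return Pearson's statistic and the expected counts n * p_i0."""
    n = sum(observed)
    expected = [n * p for p in probs0]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected


# Hypothetical counts of faces 1..6 in 60 tosses (not data from the text).
observed = [8, 12, 9, 11, 10, 10]
stat, expected = pearson_chi_square(observed, [1 / 6] * 6)

# Tabulated upper 5% point of chi-square with k - 1 = 5 degrees of freedom.
CRITICAL_CHI2_5DF_05 = 11.070
reject_h0 = stat > CRITICAL_CHI2_5DF_05
print(stat, reject_h0)
```

Here every expected count is 10, so the statistic is small and the fair-die hypothesis would not be rejected for these made-up counts.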

(11)

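The chi-square statistic, first proposed by Karl Pearson in 1900, begins with the Binomial case.

Let X1 ~ BIN (n, p1) where 0 < p1 < 1.

According to the CLT :

$$Z = \frac{X_1 - n p_1}{\sqrt{n p_1 (1 - p_1)}} \approx N(0, 1)$$

for large n, particularly when np1 ≥ 5 and n(1 - p1) ≥ 5.

As you know, $Q_1 = Z^2 \approx \chi^2(1)$.

If we let X2 = n - X1 and p2 = 1 - p1, then

$$Q_1 = \frac{(X_1 - n p_1)^2}{n p_1 (1 - p_1)} = \frac{(X_1 - n p_1)^2}{n p_1} + \frac{(X_1 - n p_1)^2}{n (1 - p_1)}$$

Because $(X_1 - n p_1)^2 = (n - X_2 - n + n p_2)^2 = (X_2 - n p_2)^2$. Hence,

$$Q_1 = \frac{(X_1 - n p_1)^2}{n p_1} + \frac{(X_2 - n p_2)^2}{n p_2}$$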

(12)

Pearson then constructed an expression similar to Q1, which involves X1 and X2 = n - X1, that we denote by Q_{k-1}, involving X1, X2, …, X_{k-1} and Xk = n - X1 - X2 - … - X_{k-1}.

Hence,

$$Q_{k-1} = \sum_{i=1}^{k} \frac{(X_i - n p_i)^2}{n p_i} \approx \chi^2(k-1)$$

(13)

EXAMPLE

We observe n = 85 values of a random variable X that is thought to have a Poisson distribution, obtaining :

(14)

The expected frequency for the cell { 3, 4, 5 } is :

85 (0.047) = 4.0. Why?

The computed Q3, with k = 4 after combination, gives no reason to reject H0, where

H0 : sample is from a Poisson distribution

(15)

EXERCISE

The number X of telephone calls received each minute at a certain switchboard in the middle of a working day is thought to have a Poisson distribution.

Data were collected, and the results were as follows :

x            0    1    2    3    4    5    6
frequency    40   66   41   28   9    3    1

Fit a Poisson distribution. Then find the estimated expected value of each cell after combining {4, 5, 6} to make one cell.

Compute Q4, since k = 5, and compare it to the critical value $\chi^2_{\alpha}(3)$.

Why do we use three degrees of freedom?

Do we accept or reject the Poisson distribution?
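One possible way to organize the computation for this exercise is sketched below. It assumes that the Poisson mean is estimated by the sample mean and that the combined last cell is treated as "4 or more", so that the cell probabilities sum to 1; both choices, and all variable names, are assumptions of the sketch rather than part of the exercise statement.

```python
from math import exp, factorial

values = [0, 1, 2, 3, 4, 5, 6]
freqs = [40, 66, 41, 28, 9, 3, 1]

n = sum(freqs)
# Estimate the Poisson mean by the sample mean.
lam = sum(x * f for x, f in zip(values, freqs)) / n


def poisson_pmf(x, mean):
    return exp(-mean) * mean ** x / factorial(x)


# Cells 0, 1, 2, 3 and a last cell for "4 or more" (assumed convention).
cell_probs = [poisson_pmf(x, lam) for x in range(4)]
cell_probs.append(1 - sum(cell_probs))

observed = freqs[:4] + [sum(freqs[4:])]
expected = [n * p for p in cell_probs]

q4 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
# k - 1 degrees of freedom, minus 1 for the estimated mean.
df = len(observed) - 1 - 1
print(round(lam, 3), [round(e, 1) for e in expected], round(q4, 3), df)
```

The printed value of `q4` would then be compared with the tabulated critical value for `df` degrees of freedom.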

(16)

CONTINGENCY TABLES

In many cases, data can be classified into categories on the basis of two criteria.

For example, a radio receiver may be classified as having low, average, or high fidelity and as having low, average, or high selectivity; or graduating engineering students may be classified according to their starting salary and their grade-point average.

In a contingency table, the statistical question is whether the row criteria and column criteria are independent.

The null and alternative hypotheses are

H0 : The row and column criteria are independent

H1 : The row and column criteria are associated

(17)

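Let $n_{ij}$ denote the observed count in row i and column j of a table with r rows and c columns.

The row sum for the ith row is

$$n_{i\bullet} = \sum_{j=1}^{c} n_{ij}$$

And the column sum for the jth column is

$$n_{\bullet j} = \sum_{i=1}^{r} n_{ij}$$

The total number of observations in the entire table is

$$n = \sum_{i=1}^{r} \sum_{j=1}^{c} n_{ij}$$

The contingency table for the general case is:

           Column 1   Column 2   ...   Column c   Total
Row 1      n11        n12        ...   n1c        n1•
Row 2      n21        n22        ...   n2c        n2•
...        ...        ...        ...   ...        ...
Row r      nr1        nr2        ...   nrc        nr•
Total      n•1        n•2        ...   n•c        n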

(18)

(19)

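There are several probabilities of importance associated with the table.

The probability of an element's being in row class i and column class j in the population is denoted by $p_{ij}$.

The probability of being in row class i is denoted by $p_{i\bullet}$, and the probability of being in column class j is denoted by $p_{\bullet j}$.

Null and alternative hypotheses regarding the $p_{ij}$:

$$H_0: p_{ij} = p_{i\bullet}\, p_{\bullet j} \text{ for all } i, j \quad \text{versus} \quad H_1: p_{ij} \neq p_{i\bullet}\, p_{\bullet j} \text{ for some } i, j$$

(20)

The marginal probabilities are estimated by

$$\hat{p}_{i\bullet} = \frac{n_{i\bullet}}{n} \quad \text{and} \quad \hat{p}_{\bullet j} = \frac{n_{\bullet j}}{n}$$

where $n_{i\bullet}$ and $n_{\bullet j}$ are the row and column sums and n is the total number of observations. Under the hypothesis of independence, $p_{ij} = p_{i\bullet}\, p_{\bullet j}$, so $p_{ij}$ would be estimated by

$$\hat{p}_{ij} = \hat{p}_{i\bullet}\, \hat{p}_{\bullet j} = \frac{n_{i\bullet}\, n_{\bullet j}}{n^2}$$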

(21)

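The actual critical region is given by

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(n_{ij} - e_{ij})^2}{e_{ij}} \geq \chi^2_{\alpha}\big((r-1)(c-1)\big), \qquad e_{ij} = \frac{n_{i\bullet}\, n_{\bullet j}}{n}$$

where r and c are the numbers of rows and columns, $n_{ij}$ is the observed count in cell (i, j), and $e_{ij}$ is its estimated expected count under independence.

If the computed $\chi^2$ gets too large, namely, exceeds $\chi^2_{\alpha}((r-1)(c-1))$, we reject the hypothesis that the two attributes are independent.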
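The test of independence compares each observed count with the expected count (row total × column total) / n and uses (rows - 1)(columns - 1) degrees of freedom. A minimal sketch follows; the 2 × 3 table is invented for illustration and the function name `chi_square_independence` is hypothetical.

```python
def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a table of counts."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            # Expected count under independence of rows and columns.
            e_ij = row_sums[i] * col_sums[j] / n
            stat += (n_ij - e_ij) ** 2 / e_ij
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical 2 x 3 table of counts (not data from the text).
table = [[10, 20, 30],
         [20, 20, 20]]
stat, df = chi_square_independence(table)
print(round(stat, 3), df)
```

For these made-up counts the statistic is below 5.991, the tabulated upper 5% point of χ² with 2 degrees of freedom, so independence would not be rejected at the 0.05 level.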

(22)

EXAMPLE

(23)

SOLUTION

What does this mean?

(24)

EXERCISES

1. Tests of the fidelity and the selectivity of 190 radios produced the results shown in the following table :

7 12 31

35 59 18

Use the 0.01 level of significance to test the null hypothesis that fidelity is independent of selectivity.

(25)

2. A test of the equality of two or more multinomial distributions can be made by using calculations that are associated with a contingency table. For

(26)

Clearly, we want to test the equality of three multinomial distributions, each with k = 4 cells. Since under $H_0$ the probability of falling into a particular grade category is independent of brand, we can test this hypothesis by computing $\chi^2$ and comparing it with $\chi^2_{\alpha}(6)$, where 6 = (3 - 1)(4 - 1).

(27)

ANALYSIS OF VARIANCE

The Analysis of Variance, ANOVA (AOV), is a generalization of the two-sample t-test, so that the means of k > 2 populations may be compared.

ANalysis Of VAriance was first suggested by Sir Ronald Fisher, pioneer of the theory of design of experiments.

(28)

The name Analysis of Variance stems from the somewhat surprising fact that a set of computations on several variances is used to test the equality of several means

(29)

The term ANOVA appears to be a misnomer, since the objective is to analyze differences among the group means.

The ANOVA belies its name in that it is not concerned with analyzing variances but rather with testing

(30)

the hypothesis that several populations have the same means.

DEFINITION:

ANOVA, or one-factor analysis of variance, is a procedure to test the hypothesis that several populations have the same means.

FUNCTION:

Using analysis of variance, we will be able to make inferences about whether our samples are drawn from populations having the same means.

INTRODUCTION

The Analysis of Variance (ANOVA) is a statistical technique used to compare the locations (specifically, the expectations) of k > 2 populations. The study of ANOVA involves the investigation of very complex statistical models, which are interesting both statistically and mathematically.

Two designs are considered here. The first is referred to as a one-way classification or a completely randomized design.

The second is called a two-way classification or a randomized block design.

The basic idea behind the term “ANOVA” is that the total variability of all the observations can be separated into distinct portions, each of which can be assigned a particular source or cause.

(32)

Suppose that we are interested in k populations, from each of which we sample n observations. The observations are denoted by:

Yij , i = 1, 2, …, k ; j = 1, 2, …, n

where Yij represents the jth observation from population i.

A basic null hypothesis to test is :

H0 : µ1 = µ2 = … = µk

that is, all the populations have the same expectation.

(33)

THE COMPLETELY RANDOMIZED DESIGN WITH EQUAL SAMPLE SIZES

First we will consider comparison of the true expectations of k > 2 populations, sometimes referred to as the k-sample problem.

For simplicity of presentation, we will assume initially that an equal number of observations are randomly sampled from each population. These observations are denoted by:

Y11 , Y12 , …… , Y1n

Y21 , Y22 , …… , Y2n

⋮

Yk1 , Yk2 , …… , Ykn

(34)

where Yij represents the jth observation out of the n randomly sampled observations from the ith population.

Hence, Y12 would be the second observation from the first population.

In the completely randomized design, the observations are assumed to :

1. Come from normal populations

2. Come from populations with the same variance

3. Have possibly different expectations, µ1 , µ2 , … , µk

These assumptions are expressed mathematically as follows :

Yij ~ NOR (µi , σ²) ; i = 1, 2, …, k ; j = 1, 2, …, n (*)

(35)

Equivalently, Yij = µi + εij , with εij ~ NID (0, σ²)

where N represents “normally”, I represents “independently” and D represents “distributed”.

The 0 means that E(εij) = 0 for all pairs of indices i and j, and σ² means that Var(εij) = σ² for all such pairs.

The parameters µ1, µ2, …, µk are the expectations of the k populations, about which inference is to be made.

The initial hypotheses to be tested in the completely randomized design are :

H₀ : µ1 = µ2 = … = µk

versus

H₁ : µi ≠ µl for some pair of indices i ≠ l (**)

(36)

The null hypothesis states that all of the k populations have the same expectation. If this is true, then we know from equation (*) that all of the Yij observations have the same normal distribution and we are observing not n observations from each of k populations, but nk observations, all from the same population.

The random variable Yij may be written as :

$$Y_{ij} = \mu + (\mu_i - \mu) + \varepsilon_{ij}$$

where,

$$\mu = \frac{1}{k} \sum_{i=1}^{k} \mu_i$$

defining,

$$\alpha_i = \mu_i - \mu$$

(37)

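Hence, $Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$ (with $\alpha_i = \mu_i - \mu$), with

$$\varepsilon_{ij} \sim NID(0, \sigma^2)$$

and

$$\sum_{i=1}^{k} \alpha_i = 0$$

The hypotheses in equation (**) may be restated as :

$$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_k = 0$$

VS

$$H_1: \alpha_i \neq 0 \text{ for at least one } i$$

(***)

The observation $Y_{ij}$ has expectation,

$$E(Y_{ij}) = \mu_i = \mu + \alpha_i$$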

(38)

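The parameters $\alpha_1, \alpha_2, \ldots, \alpha_k$ are differences or deviations from this common part $\mu$ of the individual population expectations $\mu_i$. If all of the $\mu_i$ are equal (say to $\mu_0$), then $\mu = \mu_0$. In this case all of the deviations $\alpha_i$ are zero, because :

$$\alpha_i = \mu_i - \mu = \mu_0 - \mu_0 = 0$$

Hence, the null hypothesis in equation (***) means that $E(Y_{ij}) = \mu$; these expectations consist only of the common part $\mu$.

The total variability of the observations :

$$\sum_{i=1}^{k}\sum_{j=1}^{n} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = n \sum_{i=1}^{k} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 + \sum_{i=1}^{k}\sum_{j=1}^{n} (Y_{ij} - \bar{Y}_{i\cdot})^2$$

where $\bar{Y}_{\cdot\cdot} = \frac{1}{kn}\sum_{i=1}^{k}\sum_{j=1}^{n} Y_{ij}$ is the mean of all of the observations.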

(39)

The notation $\bar{Y}_{i\cdot}$ represents the average of the observations from the ith population ; that is

$$\bar{Y}_{i\cdot} = \frac{1}{n} \sum_{j=1}^{n} Y_{ij}$$

The last equation is represented by : SST = SSA + SSE

where SST represents the total sum of squares, SSA represents the sum of squares due to differences among populations or treatments, and SSE represents the sum of squares that is unexplained or said to be “due to error”. The result of ANOVA is usually reported in an analysis of variance table.

(40)

ANOVA Table for the Completely Randomized Design with Equal Sample Sizes :

Source of Variation                Degrees of Freedom   Sum of Squares   Mean Square           F
Among populations or treatments    k-1                  SSA              MSA = SSA/(k-1)       MSA/MSE
Error                              k(n-1)               SSE              MSE = SSE/(k(n-1))
Total                              kn-1                 SST

For an α-level test, a reasonable critical region for the alternative hypothesis is $F = MSA/MSE \geq F_{\alpha}(k-1,\ k(n-1))$.

(41)

THE COMPLETELY RANDOMIZED DESIGN WITH UNEQUAL SAMPLE SIZES

In many studies in which expectations of k > 2 populations are compared, the samples from each population are not ultimately of equal size, even in cases where we attempt to maintain equal sample sizes. For example, suppose we decide to compare three teaching methods using three classes of students.

The teachers of the classes agree to use one of the three teaching methods.

The plan for the comparison is to give a common examination to all of the students in each class after two months of instruction.

(42)

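In the case of UNEQUAL SAMPLE SIZES, the observations are denoted by :

$$Y_{11}, Y_{12}, \ldots, Y_{1 n_1}$$

$$Y_{21}, Y_{22}, \ldots, Y_{2 n_2}$$

⋮

$$Y_{k1}, Y_{k2}, \ldots, Y_{k n_k}$$

where $Y_{ij}$ represents the jth observation from the ith population.

For the ith population there are $n_i$ observations.

In the case of equal sample sizes, ni = n for i = 1, 2, …, k.

The model assumptions are the same for the unequal sample size case as for the equal sample size case. The $Y_{ij}$ are assumed to :

1. Come from normal populations

2. Come from populations with the same variance

3. Have possibly different expectations, µ1 , µ2 , … , µk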

(43)

These assumptions are expressed formally as $Y_{ij} \sim NOR(\mu_i, \sigma^2)$ ; i = 1, 2, …, k ; j = 1, 2, …, ni

or as Yij = µi + εij , with εij ~ NID (0, σ²)

The first null and alternative hypotheses to test are exactly the same as those in the previous section, namely :

H0 : µ1 = µ2 = … = µk

versus

H1 : µi ≠ µl for some pair of indices i ≠ l

The model for the completely randomized design may be presented as :

$$Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$$

with $\alpha_i = \mu_i - \mu$, $\sum_{i=1}^{k} n_i \alpha_i = 0$ and εij ~ NID (0, σ²)

In this case the overall mean, $\mu$, is given by

$$\mu = \frac{1}{N} \sum_{i=1}^{k} n_i \mu_i$$

where $N = \sum_{i=1}^{k} n_i$ is the total number of observations.

(44)

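Here $\mu$ is a weighted average of the population expectations $\mu_i$, where the weights are $n_i / N$, the proportion of observations coming from the ith population.

The hypotheses can also be restated as

$$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_k = 0$$

versus

$$H_1: \alpha_i \neq 0$$ for at least one i.

The observation Yij has expectation $E(Y_{ij}) = \mu_i = \mu + \alpha_i$.

If H0 is true, then $\alpha_i = 0$ for all i, hence all of the $Y_{ij}$ have a common distribution.

Thus, $Y_{ij} \sim NOR(\mu, \sigma^2)$ under H0. The total variability of the observations is again partitioned into two portions by

$$\sum_{i=1}^{k}\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = \sum_{i=1}^{k} n_i (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 + \sum_{i=1}^{k}\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2$$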

(45)

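where

$$\bar{Y}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} Y_{ij}, \qquad N = \sum_{i=1}^{k} n_i, \qquad \bar{Y}_{\cdot\cdot} = \frac{1}{N} \sum_{i=1}^{k}\sum_{j=1}^{n_i} Y_{ij}$$

As before : $\bar{Y}_{i\cdot}$ represents the average of the observations from the ith population.

N is the total number of observations.

$\bar{Y}_{\cdot\cdot}$ is the average of all the observations.

Again, SST represents the total sum of squares.

SSA represents the sum of squares due to differences among populations or treatments.

SSE represents the sum of squares due to error.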

(46)

The number of Degrees of Freedom for :

TOTAL = TREATMENTS + ERROR

(N-1) = (k-1) + (N-k)

(47)

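The mean square among treatments and the mean square for error are equal to the appropriate sum of squares divided by the corresponding degrees of freedom (dof).

That is,

$$MSA = \frac{SSA}{k-1}, \qquad MSE = \frac{SSE}{N-k}$$

It can be shown that MSE is an unbiased estimate of σ² , that is :

$$E(MSE) = \sigma^2$$

similarly ;

$$E(MSA) = \sigma^2 + \frac{1}{k-1} \sum_{i=1}^{k} n_i \alpha_i^2$$

which equals σ² when the null hypothesis is true.

Under the null hypothesis, $F = MSA/MSE$ has an F-distribution with (k-1) and (N-k) dof.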

(48)

ANOVA TABLE

for the Completely Randomized Design with unequal sample sizes

SOURCE                             dof    SS     MS                   F
Among Populations or Treatments    k-1    SSA    MSA = SSA/(k-1)      MSA/MSE
ERROR                              N-k    SSE    MSE = SSE/(N-k)
TOTAL                              N-1    SST

Sometimes SSA is denoted SSTR and SSE is denoted SSER.
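The entries of the one-way ANOVA table (sums of squares among treatments and for error, their mean squares with k - 1 and N - k degrees of freedom, and the F ratio) can be computed directly from the group data. The sketch below is illustrative only: the function name `one_way_anova` and the three groups of scores are invented, and unequal group sizes are allowed.

```python
def one_way_anova(groups):
    """Sums of squares, mean squares and F for a completely randomized design."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Among-treatments and error sums of squares.
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    sst = sum((y - grand_mean) ** 2 for g in groups for y in g)

    msa = ssa / (k - 1)
    mse = sse / (n_total - k)
    return {"SSA": ssa, "SSE": sse, "SST": sst,
            "MSA": msa, "MSE": mse, "F": msa / mse,
            "dof": (k - 1, n_total - k)}


# Hypothetical scores for three groups of unequal size (not data from the text).
groups = [[65, 70, 72, 68], [75, 80, 78], [60, 62, 66, 64, 63]]
result = one_way_anova(groups)
print(result["dof"], round(result["F"], 2))
```

The computed F would be compared with the tabulated upper α point of the F distribution with the reported degrees of freedom.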

(49)

(50)

ANOVA F-TEST FOR A CRD

with k treatments

H₀: µ₁= µ₂= … =µ_k

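(i.e., there is no difference in the treatment means) versus

H_a : At least two of the treatment means differ.

Test Statistic : F = MSTR / MSER, where MSTR = SSTR/(k - 1) and MSER = SSER/(N - k)

Rejection region : F > F_α with (k - 1) and (N - k) degrees of freedom.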

(51)

PARTITIONING OF THE TOTAL SUM OF SQUARES FOR THE COMPLETELY RANDOMIZED DESIGN

TOTAL SUM OF SQUARES (SSTO) = SUM OF SQUARES FOR TREATMENTS (SSTR) + SUM OF SQUARES FOR ERROR (SSER)

(52)

FORMULAS FOR THE CALCULATIONS IN THE CRD

SSTR = sum of squares for treatments

= (sum of squares of treatment totals with each square divided by number of observations for

(53)


(54)

EXAMPLE

Four groups of students were subjected to different teaching techniques and tested at the end of a specified period of time. As a result of dropouts from the experimental groups (due to sickness, transfer, and so on) the number of students varied from group to group. Do the data shown in the table (below) present sufficient evidence to indicate a difference in the mean achievement for the four teaching techniques?

(55)

SOLUTION

(56)

The test statistic for testing H0 : µ1 = µ2 = µ3= µ4 is

The critical value of F for α = 0.05 is

reject H0

(57)

THE RANDOMIZED BLOCK DESIGN

The randomized block design implies the presence of two qualitative independent variables, “blocks” and “treatments”.

(58)

Partition of the total sum of squares:

CRD : SSTO = SSTR + SSER

RBD : SSTO = SSBL + SSTR + SSER

(59)

Definition :

A randomized block design is a design devised to compare the means for k treatments utilizing b matched blocks of k experimental units each. Each treatment appears once in every block.

The observations in a RBD can be represented by an array of the following type :

(60)

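As before, the expectation of Yij, the jth observation from the ith treatment (population), was given by :

$$E(Y_{ij}) = \mu + \alpha_i$$

In this section (RBD), writing $\alpha_i$ for the treatment effect and $\beta_j$ for the block effect, the assumption about Yij is that :

(i) $Y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$ ; i = 1, 2, … , t ; j = 1, 2, … , b

with $\sum_{i=1}^{t} \alpha_i = 0$, $\sum_{j=1}^{b} \beta_j = 0$ and $\varepsilon_{ij} \sim NID(0, \sigma^2)$

The observation Yij is said to be the observation from block j on treatment i.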

(61)

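Hence,

$$E(Y_{ij}) = \mu + \alpha_i + \beta_j$$

where $\mu$ is the overall effect, $\alpha_i$ is the treatment effect and $\beta_j$ is the block effect.

One task is to test the null hypothesis of no treatment effect, $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_t = 0$.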

(62)

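Here, the ith treatment mean is :

$$\bar{Y}_{i\cdot} = \frac{1}{b} \sum_{j=1}^{b} Y_{ij}$$

The jth block mean is :

$$\bar{Y}_{\cdot j} = \frac{1}{t} \sum_{i=1}^{t} Y_{ij}$$

And the overall mean is :

$$\bar{Y}_{\cdot\cdot} = \frac{1}{bt} \sum_{i=1}^{t} \sum_{j=1}^{b} Y_{ij}$$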

(63)

The degrees of freedom (dof) are partitioned as follows :

dof TO = dof TR + dof BL + dof ER

bt - 1 = (t-1) + (b-1) + (b-1)(t-1)

If the null hypothesis of no treatment

(64)

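It can be further shown, under $H_0$ (no treatment effect) :

$$F = \frac{MSTR}{MSER} = \frac{SSTR/(t-1)}{SSER/[(b-1)(t-1)]} \sim F\big(t-1,\ (b-1)(t-1)\big)$$

Hence, using an α-level test, we reject $H_0$ in favor of $H_1$ if :

$$F \geq F_{\alpha}\big(t-1,\ (b-1)(t-1)\big)$$

For analogous reasons, a test of the hypothesis of no block effect :

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_b = 0$$

versus

$H_1: \beta_j \neq 0$ for at least one j, can be carried out using the critical region :

$$\frac{MSBL}{MSER} = \frac{SSBL/(b-1)}{SSER/[(b-1)(t-1)]} \geq F_{\alpha}\big(b-1,\ (b-1)(t-1)\big)$$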
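The randomized block computations can be sketched as follows: the total sum of squares is split into treatment, block and error parts, and the two F ratios use (t - 1), (b - 1) and (b - 1)(t - 1) degrees of freedom. The layout `y[i][j]` (treatment i, block j), the function name `rbd_sums_of_squares` and the numbers are assumptions made for this illustration.

```python
def rbd_sums_of_squares(y):
    """y[i][j] is the observation on treatment i in block j.

    Returns (SSTR, SSBL, SSER, SSTO)."""
    t = len(y)          # number of treatments
    b = len(y[0])       # number of blocks
    grand = sum(sum(row) for row in y) / (b * t)
    trt_means = [sum(row) / b for row in y]
    blk_means = [sum(col) / t for col in zip(*y)]

    sstr = b * sum((m - grand) ** 2 for m in trt_means)
    ssbl = t * sum((m - grand) ** 2 for m in blk_means)
    ssto = sum((v - grand) ** 2 for row in y for v in row)
    sser = ssto - sstr - ssbl   # error sum of squares by subtraction
    return sstr, ssbl, sser, ssto


# Hypothetical data: t = 3 treatments (rows) in b = 4 blocks (columns).
y = [[10, 12, 11, 13],
     [14, 15, 13, 17],
     [9, 11, 10, 12]]
sstr, ssbl, sser, ssto = rbd_sums_of_squares(y)
t, b = len(y), len(y[0])
mser = sser / ((b - 1) * (t - 1))
f_treatments = (sstr / (t - 1)) / mser
f_blocks = (ssbl / (b - 1)) / mser
print(round(f_treatments, 2), round(f_blocks, 2))
```

Each F ratio would be compared with the tabulated F value for its own numerator degrees of freedom and (b - 1)(t - 1) denominator degrees of freedom.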

(65)


(66)

GENERAL FORM OF THE RANDOMIZED BLOCK DESIGN

Although we show the treatments in order within the blocks, in practice they would be assigned to the experimental units in a

(67)

FORMULAS FOR CALCULATIONS IN RBD

where,

N = total number of observations

b = number of blocks

(68)

(69)

EXAMPLE

A study was conducted in a large city to compare the supermarket prices of the four leading brands of coffee at the end of the year. Ten supermarkets in the city were selected, and the price per pound was recorded for each brand.

1. Set up the test of the null hypothesis that the mean prices of the four brands sold in the city were the same at the end of the year. Use α = 0.05.

2. Calculate the F statistic.

3. Do the data provide sufficient evidence to indicate a

(70)

(71)


(72)

(73)

Since the calculated F > F_{0.05} , there is very

(74)


dof for the test statistic are

(75)

NON PARAMETRIC TEST

The majority of hypothesis tests discussed so far have made inferences about population parameters, such as the mean and the proportion. These parametric tests have used the parametric statistics of samples that came from the population being tested. To formulate these tests, we made restrictive assumptions about the populations from which we drew our samples. For example, we assumed that our samples either were large or came from normally distributed populations. But populations are not always normal.

(76)

And even if a goodness-of-fit test indicates that a population is approximately normal, we cannot always be sure we’re right, because the test is not 100 percent reliable.

Fortunately, in recent times statisticians have developed useful techniques that do not make restrictive assumptions about the shape of the population distribution.

These are known as distribution-free or, more commonly, nonparametric tests.

Nonparametric statistical procedures may be used in preference to their parametric counterparts.

(77)


NON PARAMETRIC TESTS

SIGN TEST

WILCOXON SIGNED RANK TEST

MANN – WHITNEY TEST (WILCOXON RANK SUM TEST)

RUN TEST

KRUSKAL – WALLIS TEST

KOLMOGOROV – SMIRNOV TEST

(78)

THE SIGN TEST

The sign test is used to test hypotheses about the median $\tilde{\mu}$ of a continuous distribution. The median of a distribution is a value of the random variable X such that the probability is 0.5 that an observed value of X is less than or equal to the median, and the probability is 0.5 that an observed value of X is greater than or equal to the median. That is,

$$P(X \leq \tilde{\mu}) = P(X \geq \tilde{\mu}) = 0.5$$

(79)

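Let X denote a continuous random variable with median $\tilde{\mu}$ and let $X_1, X_2, \ldots, X_n$ denote a random sample of size n from the population of interest.

If $\tilde{\mu}_0$ denotes the hypothesized value of the median, the hypotheses to be tested are $H_0: \tilde{\mu} = \tilde{\mu}_0$ versus $H_1: \tilde{\mu} < \tilde{\mu}_0$.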

(80)

Form the differences :

$$X_i - \tilde{\mu}_0, \quad i = 1, 2, \ldots, n$$

Now if the null hypothesis $H_0: \tilde{\mu} = \tilde{\mu}_0$ is true, any difference $X_i - \tilde{\mu}_0$ is equally likely to be positive or negative. An appropriate test statistic is the number of these differences that are positive, say $R^+$. Therefore, to test the null hypothesis we are really testing that the number of plus signs is a value of a Binomial random variable that has the parameter p = 0.5.

A p-value for the observed number of plus signs $r^+$ can be calculated directly from the Binomial distribution. Thus, if the computed p-value $P = P(R^+ \leq r^+ \text{ when } p = 0.5)$ is less than or equal to α, we reject $H_0$.

(81)

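To test the other one-sided hypothesis,

$$H_0: \tilde{\mu} = \tilde{\mu}_0 \quad \text{vs} \quad H_1: \tilde{\mu} > \tilde{\mu}_0$$

if the p-value $P = P(R^+ \geq r^+ \text{ when } p = 0.5)$, where $r^+$ is the observed number of plus signs $R^+$, is less than or equal to α, we will reject $H_0$.

The two-sided alternative may also be tested. If the hypotheses are:

$$H_0: \tilde{\mu} = \tilde{\mu}_0 \quad \text{vs} \quad H_1: \tilde{\mu} \neq \tilde{\mu}_0$$

the p-value is twice the smaller of the two one-sided p-values.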

(82)

It is also possible to construct a table of critical values for the sign test.

As before, let $R^+$ denote the number of the differences that are positive and let $R^-$ denote the number of the differences that are negative.

Let $R = \min(R^+, R^-)$. A table of critical values $r^*_{\alpha}$ for the sign test ensures that P(type I error) = P(reject $H_0$ when $H_0$ is true) = α.

If the observed value of the test statistic $r \leq r^*_{\alpha}$, then the null hypothesis should be rejected.

(83)

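If the alternative is $H_1: \tilde{\mu} > \tilde{\mu}_0$, then reject $H_0$ if $r^- \leq r^*_{\alpha}$. If the alternative is $H_1: \tilde{\mu} < \tilde{\mu}_0$, then reject $H_0$ if $r^+ \leq r^*_{\alpha}$.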

(84)

Since the underlying population is assumed to be continuous, there is a zero probability that we will find a “tie”, that is, a value of $X_i$ exactly equal to $\tilde{\mu}_0$.

In practice, when ties occur, they should be set aside and the sign test applied to the remaining data.

(85)

THE NORMAL APPROXIMATION

When p = 0.5, the Binomial distribution is well approximated by a normal distribution when n is at least 10. Thus, since the mean of the Binomial is np and the variance is np(1 - p), the distribution of $R^+$ is approximately normal with mean 0.5n and variance 0.25n whenever n is moderately large.

Therefore, in these cases the null hypothesis can be tested using the statistic :

$$Z_0 = \frac{R^+ - 0.5n}{0.5\sqrt{n}}$$

(86)

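Critical Regions/Rejection Regions for α-level tests of :

$$H_0: \tilde{\mu} = \tilde{\mu}_0$$

versus each alternative are given in this table :

Alternative                          Critical/Rejection Region
H1 : median ≠ hypothesized value     |z0| > z_{α/2}
H1 : median > hypothesized value     z0 > z_α
H1 : median < hypothesized value     z0 < -z_α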

(87)

THE WILCOXON SIGNED-RANK TEST

The sign test makes use only of the plus and minus signs of the differences between the observations and the median (the plus and minus signs of the differences between the observations in the paired case).

Frank Wilcoxon devised a test procedure that uses both direction (sign) and magnitude. This procedure is now called the Wilcoxon signed-rank test.

The Wilcoxon signed-rank test applies to the case of symmetric continuous distributions.

(88)

Description of the test :

We are interested in testing $H_0: \mu = \mu_0$ versus $H_1: \mu \neq \mu_0$.

(89)

Assume that $X_1, X_2, \ldots, X_n$ is a random sample from a continuous and symmetric distribution with mean/median : $\mu$.

Compute the differences $X_i - \mu_0$ , i = 1, 2, …, n

Rank the absolute differences $|X_i - \mu_0|$ in ascending order, and then give the ranks the signs of their corresponding differences.

(90)

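If the sample size is moderately large (n > 20), then it can be shown that $W^+$ (the sum of the positive ranks) or $W^-$ (the absolute value of the sum of the negative ranks) has approximately a normal distribution with mean

$$\mu_{W^+} = \frac{n(n+1)}{4}$$

and variance

$$\sigma^2_{W^+} = \frac{n(n+1)(2n+1)}{24}$$

Therefore, a test of $H_0: \mu = \mu_0$ can be based on the statistic

$$Z_0 = \frac{W^+ - n(n+1)/4}{\sqrt{n(n+1)(2n+1)/24}}$$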

(91)

Wilcoxon Signed-Rank Test

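Test statistic : $W^+$, the sum of the ranks of the positive differences.

Theorem : The probability distribution of $W^+$ when $H_0$ is true, which is based on a random sample of size n, satisfies :

$$E(W^+) = \frac{n(n+1)}{4}, \qquad Var(W^+) = \frac{n(n+1)(2n+1)}{24}$$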

(92)

Proof :

Let if , then

where

For a given , the discrepancy has a

(93)


(94)

(95)

The Wilcoxon signed-rank test can be applied to paired data.

Let $(X_{1j}, X_{2j})$ , j = 1, 2, …, n be a collection of paired observations from two continuous distributions that differ only with respect to their means. The distribution of the differences $D_j = X_{1j} - X_{2j}$ is continuous and symmetric.

The null hypothesis is : $H_0: \mu_1 = \mu_2$, which is equivalent to $H_0: \mu_D = 0$.

To use the Wilcoxon signed-rank test, the differences are first ranked in ascending order of their absolute values, and then the ranks are given the signs of the differences.

(96)

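Let $W^+$ be the sum of the positive ranks and $W^-$ be the absolute value of the sum of the negative ranks, and $W = \min(W^+, W^-)$.

If the observed value $w \leq w^*_{\alpha}$, where $w^*_{\alpha}$ is the tabulated critical value, then $H_0$ is rejected and $H_1$ accepted.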
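The ranking step of the signed-rank test (rank the absolute differences, attach the signs, then sum the positive and the negative ranks) can be sketched as below. The differences are invented, the function name `signed_rank_sums` is hypothetical, and two conventions are assumed: zero differences are dropped and tied absolute values receive the average of the ranks they occupy.

```python
def signed_rank_sums(diffs):
    """Return (W_plus, W_minus) for a list of paired differences."""
    nonzero = [d for d in diffs if d != 0]   # assumed: zeros are set aside
    ordered = sorted(nonzero, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over a run of tied absolute values.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + j) / 2 + 1           # average of ranks i+1 .. j+1
        for m in range(i, j + 1):
            ranks[m] = avg_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical paired differences (not data from the text).
w_plus, w_minus = signed_rank_sums([1.5, -0.5, 2.0, 3.0, -1.0])
w = min(w_plus, w_minus)
print(w_plus, w_minus, w)
```

The smaller sum `w` is the value that would be compared with the tabulated critical value for the chosen α and n.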

(97)

EXAMPLE

Eleven students were randomly selected from a large statistics class, and their numerical grades on two successive examinations were recorded.

Use the Wilcoxon signed rank test to determine

(98)

Solution :

Sum of the positive ranks :

Reject H₀

(99)

EXAMPLE

Ten newly married couples were randomly selected, and each husband and wife were independently asked the question of how many children they would like to have. The following information was obtained.

Using the sign test, is there reason to believe that wives want fewer children than husbands?

Assume a maximum size of type I error of 0.05.

(100)

SOLUTION

First state H₀ and H₁ : H₀ : p = 0.5

vs H₁ : p < 0.5

There are three + signs.

Under H₀ , S ~ BIN (9 , 1/2)

P(S ≤ 3) = 0.2539

At level α = 0.05 , since 0.2539 > 0.05 , H₀ is not rejected.

Couple : 1 2 3 4 6 7 8 9 10
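The Binomial tail probability used in this solution can be reproduced with a few lines. The helper name `binomial_cdf` is an assumption of the sketch; the inputs (three plus signs among nine nonzero differences, p = 0.5) are the ones stated in the solution.

```python
from math import comb


def binomial_cdf(s, n, p=0.5):
    """P(S <= s) for S ~ BIN(n, p)."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(s + 1))


# Three plus signs among nine nonzero differences, as in the solution.
p_value = binomial_cdf(3, 9)
reject_h0 = p_value <= 0.05
print(round(p_value, 4), reject_h0)
```

The result agrees with the 0.2539 quoted in the solution, so H₀ is not rejected at α = 0.05.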

(101)

THE WILCOXON RANK-SUM TEST

Suppose that we have two independent continuous populations X₁ and X₂ with means µ₁ and µ₂. Assume that the distributions of X₁ and X₂ have the same shape and spread, and differ only (possibly) in their means.

The Wilcoxon rank-sum test can be used to test the hypothesis H₀ : µ₁ = µ₂. This procedure is sometimes called the Mann-Whitney test.

(102)

Description of the Test

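Let $X_{11}, X_{12}, \ldots, X_{1 n_1}$ and $X_{21}, X_{22}, \ldots, X_{2 n_2}$ be two independent random samples of sizes $n_1 \leq n_2$ from the continuous populations X₁ and X₂.

We wish to test the hypotheses : H₀ : µ₁ = µ₂

versus H₁ : µ₁ ≠ µ₂

The test procedure is as follows. Arrange all n₁ + n₂ observations in ascending order of magnitude and assign ranks to them.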

(103)

Let W₁ be the sum of the ranks in the smaller sample (1), and define W₂ to be the sum of the ranks in the other sample. Then,

$$W_2 = \frac{(n_1 + n_2)(n_1 + n_2 + 1)}{2} - W_1$$

Now if the sample means do not differ, we will expect the sum of the ranks to be nearly equal for both samples after adjusting for the difference in sample size. Consequently, if the sums of the ranks differ greatly, we will conclude that the means are not equal.

(104)

H₀ : µ₁ = µ₂ is rejected if either of the observed values w₁ or w₂ is less than or equal to the tabulated critical value w_α.

If H₁ : µ₁ < µ₂, then reject H₀ if w₁ ≤ w_α. If H₁ : µ₁ > µ₂, then reject H₀ if w₂ ≤ w_α.

(105)

LARGE-SAMPLE APPROXIMATION

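When both n₁ and n₂ are moderately large, say, greater than 8, the distribution of W₁ can be well approximated by the normal distribution with mean :

$$\mu_{W_1} = \frac{n_1 (n_1 + n_2 + 1)}{2}$$

and variance :

$$\sigma^2_{W_1} = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$$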

(106)

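Therefore, for n₁ and n₂ > 8, we could use :

$$Z_0 = \frac{W_1 - n_1 (n_1 + n_2 + 1)/2}{\sqrt{n_1 n_2 (n_1 + n_2 + 1)/12}}$$

as a statistic, and the critical region is :

|z0| > z_{α/2} : two-tailed test

z0 > z_α : upper-tail test

z0 < -z_α : lower-tail test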

(107)

EXAMPLE

A large corporation is suspected of sex discrimination in the salaries of its employees. From employees with similar responsibilities and work experience, 12 male and 12 female employees were randomly selected ; their annual salaries in thousands of dollars are as follows :

(108)

SOLUTION

H₀ : f₁(x) = f₂(x). What does this mean? The random samples come from populations with the same distribution.

H₁ : f₁(x) ≠ f₂(x)

Combine the salaries and rank them :

(109)

CONT'D...

Sex    Salary    Rank
M      21.9      12
M      22.3      13
M      22.4      14
F      22.5      15
F      23.2      16
M      23.4      17
F      23.5      18
M      23.6      19
M      23.9      20
M      24.0      21
M      24.1      22
M      24.5      23

(110)

Suppose we choose the female sample; then its rank sum is

R₁ = R_F = 117

Statistic

(111)

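[Figure: standard normal curve with two-tailed rejection regions beyond -1.96 and 1.96 for α = 0.05]

α = 0.05

Z_calc = 1.91, which lies between -1.96 and 1.96,

so accept H₀.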
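The large-sample statistic for this example can be recomputed from the quantities given in the solution (female rank sum 117, n₁ = n₂ = 12), using the normal approximation with mean n₁(n₁ + n₂ + 1)/2 and variance n₁n₂(n₁ + n₂ + 1)/12. The function name `rank_sum_z` is a hypothetical helper. Note that with the female rank sum the statistic comes out negative; its magnitude matches the 1.91 shown on the slide.

```python
from math import sqrt


def rank_sum_z(w1, n1, n2):
    """Normal-approximation z statistic for the rank sum w1 of sample 1."""
    mean = n1 * (n1 + n2 + 1) / 2
    variance = n1 * n2 * (n1 + n2 + 1) / 12
    return (w1 - mean) / sqrt(variance)


n1 = n2 = 12
w_female = 117
# The two rank sums add up to (n1 + n2)(n1 + n2 + 1) / 2.
w_male = (n1 + n2) * (n1 + n2 + 1) // 2 - w_female

z_female = rank_sum_z(w_female, n1, n2)
z_male = rank_sum_z(w_male, n2, n1)
print(w_male, round(z_female, 2), round(z_male, 2))
```

Either way |z| is below 1.96, which agrees with the decision to accept H₀ at α = 0.05.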

(112)

KOLMOGOROV – SMIRNOV TEST

The Kolmogorov-Smirnov (K-S) test is conducted by comparing the hypothesized and sample cumulative distribution functions.

A cumulative distribution function is defined as : $F(x) = P(X \leq x)$, and the sample cumulative distribution function, S(x), is defined as the proportion of sample values that are less than or equal to x.

The K-S test should be used instead of the χ² test to determine if a sample is from a specified continuous distribution.

(113)


We begin by placing the values of x in ascending order, as follows :

80, 89, 93, 97, 102, 103, 105, 108, 110, 121.

(114)

The test statistic D is the maximum absolute difference between the two cdf's over all observed values.

The range of D is 0 ≤ D ≤ 1, and the formula is

$$D = \max_{x} |F(x) - S(x)|$$

where x = each observed value

S(x) = observed cdf at x

F(x) = hypothesized cdf at x

(115)

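Let X₍₁₎ , X₍₂₎ , … , X₍ₙ₎ denote the ordered observations of a random sample of size n, and define the sample cdf as :

$$S(x) = \begin{cases} 0, & x < X_{(1)} \\ i/n, & X_{(i)} \leq x < X_{(i+1)} \\ 1, & x \geq X_{(n)} \end{cases}$$

S(x) is the proportion of the number of sample values less than or equal to x.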

(116)

The Kolmogorov-Smirnov statistic, D, is defined to be :

$$D = \max_{x} |F(x) - S(x)|$$

(117)

EXAMPLE 1

A state vehicle inspection station has been designed so that inspection time follows a uniform distribution with limits of 10 and 15 minutes.

A sample of 10 duration times during low and peak traffic conditions was taken. Use the K-S test with α = 0.05 to determine if the sample is from this uniform distribution. The times are : 11.3 10.4 9.8 12.6 14.8

(118)

SOLUTION

1. H₀ : the sample comes from the Uniform (10, 15) distribution

versus

H₁ : the sample does not come from the Uniform (10, 15) distribution

2. Cumulative distribution function of the sample : S (x)

(119)

Results of the K-S Calculation

Observed time x    S(x)    F(x)    |F(x) - S(x)|
9.8                0.10    0.00    0.10
10.4               0.20    0.08    0.12
11.3               0.30    0.26    0.04
11.5               0.40    0.30    0.10
12.6               0.50    0.52    0.02
13.0               0.60    0.60    0.00
13.3               0.70    0.66    0.04
13.6               0.80    0.72    0.08
14.3               0.90    0.86    0.04

(120)

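D = max |F(x) - S(x)| = 0.12 , for x = 10.4

From the table, with n = 10 and α = 0.05 : D_{10, 0.05} = 0.41. Since 0.12 < 0.41, H₀ is not rejected.

[Figure: density f(D) of the test statistic, with upper-tail area α = P(D ≥ D₀) to the right of the critical value D₀]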

(121)

EXAMPLE 2

Suppose we want to test whether the following ten observations

110, 89, 102, 80, 93, 121, 108, 97, 105, 103

were drawn from a normal distribution, with mean µ = 100 and standard deviation σ = 10. Our hypotheses for this test are

H₀: Data were drawn from a normal distribution, with µ = 100 and σ = 10.

versus

H₁: Data were not drawn from a normal distribution with µ = 100 and σ = 10.

(122)

(123)

x      F(x)      S(x)    |F(x) - S(x)|
80     0.0228    0.1     0.0772
89     0.1357    0.2     0.0643
93     0.2420    0.3     0.0580
97     0.3821    0.4     0.0179
102    0.5793    0.5     0.0793 = D
103    0.6179    0.6     0.0179
105    0.6915    0.7     0.0085
108    0.7881    0.8     0.0119
110    0.8413    0.9     0.0587

(124)

If α = 0.05 , the critical value with n = 10, obtained from the table, is 0.409.

The decision rule is : reject H₀ if D > 0.409

Since D = 0.0793 < 0.409,

H₀ is not rejected (H₀ is accepted).

That is, the data come from a normal distribution with µ = 100 and σ = 10.
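The table for this example can be reproduced with the sketch below, which evaluates the N(100, 10) cdf through the error function. It computes D as on the slides (comparing F(x) with S(x) at each observed value) and, for comparison, the more common textbook form of the statistic, which also compares F(x) with the value of S just before each observation. Variable and function names are assumptions of the sketch.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2)."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


data = [110, 89, 102, 80, 93, 121, 108, 97, 105, 103]
xs = sorted(data)
n = len(xs)
F = [normal_cdf(x, 100, 10) for x in xs]

# D as computed on the slides: |F(x) - S(x)| at each observed value.
d_slides = max(abs(F[i] - (i + 1) / n) for i in range(n))

# Common textbook form: also check the step of S just below each observation.
d_two_sided = max(max(abs(F[i] - (i + 1) / n), abs(F[i] - i / n))
                  for i in range(n))

CRITICAL_D_N10_05 = 0.409   # tabulated value quoted in the text
print(round(d_slides, 4), round(d_two_sided, 4))
```

Both values are below 0.409, so the conclusion of the example (do not reject H₀) is the same under either convention.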

(125)

LILLIEFORS TEST

In most applications where we want to test for normality, the population mean and the population variance are unknown.

In order to perform the K-S test, however, we must assume that those parameters are known. The Lilliefors test, which is quite similar to the K-S test, uses the sample mean and the sample standard deviation in place of the unknown parameters.

(126)

EXAMPLE

A manufacturer of automobile seats has a production line that produces an average of 100 seats per day. Because of new government regulations, a new safety device has been installed, which the manufacturer believes will reduce average daily output.

A random sample of 15 days’ output after the installation of the safety device is shown:

93, 103, 95, 101, 91, 105, 96, 94, 101, 88, 98, 94, 101, 92, 95

(127)

SOLUTION

As in the K-S test, to compute S (x), sort the data as follows :

x      S(x)
88     1/15 = 0.067
91     2/15 = 0.133
92     3/15 = 0.200
93     4/15 = 0.267
94     6/15 = 0.400
95     8/15 = 0.533
96     9/15 = 0.600
98     10/15 = 0.667
101    13/15 = 0.867
103    14/15 = 0.933

(128)

From the data above, we obtain $\bar{x} = 96.47$ and s = 4.85 .

Next, F(x) is computed as follows :

x F(x)

88

(129)

(130)

TEST BASED ON RUNS

Usually a sample that is taken from a population should be random.

The runs test evaluates the null hypothesis

H₀ : the order of the sample data is random

The alternative hypothesis is simply the negation of H₀. There is no comparable parametric test to evaluate this null hypothesis.

(131)

DEFINITIONS :

1. A run is defined as a sequence of the same symbols. Two symbols are defined, and each sequence must contain a symbol at least once.

2. A run of length j is defined as a sequence of j observations, all belonging to the same group, that is preceded or followed by observations belonging to a different group.

For illustration, the ordered sequence by the sex of the employee is as follows :

F F F M F F F M M F F M M M F F M F M M M M M F

(132)


The sequence begins with a run of length three, followed by a run of length one, followed by another run of length three, and so on.

The total number of runs in this sequence is 11.
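Counting the runs by hand is error-prone for longer sequences. The sketch below does it with `itertools.groupby`; the helper name `runs` is an assumption made for this illustration.

```python
from itertools import groupby

# The ordered sequence by sex of the employee, as given in the text.
sequence = "F F F M F F F M M F F M M M F F M F M M M M M F".split()

def runs(symbols):
    """Return the runs of a sequence as (symbol, length) pairs, in order."""
    return [(sym, len(list(group))) for sym, group in groupby(symbols)]

run_list = runs(sequence)
r = len(run_list)          # total number of runs R
n1 = sequence.count("F")   # sample size of the first group
n2 = sequence.count("M")   # sample size of the second group

print(run_list[:3])
print("R =", r, " n1 =", n1, " n2 =", n2)
```

Both group sizes come out at 12, so this illustration falls under the small-sample guideline.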

Let R be the total number of runs observed in an ordered sequence of n₁ + n₂ observations, where n₁ and n₂ are the respective sample sizes. The possible values of R are 2, 3, 4, …, (n₁ + n₂).

The only question to ask prior to performing the test is whether the sample size is small or large.

We will use the guideline that a small sample has n₁ and n₂ less than or equal to 15.

(133)


If n₁ or n₂ exceeds 15, the sample is considered large, in which case a normal approximation to f(r) is used to test H₀ versus H₁.

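[Figure: distribution f(r) of the number of runs, plotted against r, with the region AR and the critical value r_L marked.]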

(134)

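The mean and variance of R are determined to be

$$\mu_R = \frac{2 n_1 n_2}{n_1 + n_2} + 1, \qquad \sigma_R^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}$$

For large samples, $Z = (R - \mu_R)/\sigma_R$ is approximately standard normal.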
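A minimal Python sketch of the normal approximation follows, using the standard expressions for the mean and variance of R. The function name `runs_z` is hypothetical, no continuity correction is applied, and the inputs R = 11, n₁ = n₂ = 12 are taken from the employee sequence purely to show the arithmetic; by the guideline above that sequence is a small sample, for which tabulated critical values would normally be used instead.

```python
from math import sqrt
from statistics import NormalDist

def runs_z(r, n1, n2):
    """Mean and variance of R under H0, and the standardized statistic z."""
    n = n1 + n2
    mean_r = 2 * n1 * n2 / n + 1
    var_r = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (r - mean_r) / sqrt(var_r)
    return mean_r, var_r, z

mean_r, var_r, z = runs_z(r=11, n1=12, n2=12)
# Two-sided p-value from the standard normal distribution.
p_two_sided = 2 * NormalDist().cdf(-abs(z))
print(f"mean = {mean_r:.2f}, variance = {var_r:.3f}, z = {z:.3f}, p = {p_two_sided:.3f}")
```

A two-sided test rejects H₀ when |z| exceeds the standard normal critical value for the chosen α.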

(135)

THE KRUSKAL - WALLIS H TEST


The Kruskal – Wallis H test is the nonparametric equivalent of the Analysis of Variance F test.

It tests the null hypothesis that all k populations possess the same probability distribution against the alternative hypothesis that the distributions differ in location, that is, one or more of the distributions are shifted to the right or left of each other.

The advantage of the Kruskal – Wallis H test over the F test is that we need make no assumptions about the nature of the sampled populations.

(136)

To conduct the test, we first rank all n = n₁ + n₂ + n₃ + … + n_k observations and compute the rank sums R₁, R₂, …, R_k for the k samples.

The ranks of tied observations are averaged in the same manner as for the Wilcoxon rank sum test. Then, if H₀ is true and the sample sizes n₁, n₂, …, n_k are each 5 or more, the test statistic

$$H = \frac{12}{n(n+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(n+1)$$

will have a sampling distribution that can be approximated by a chi-square distribution with (k − 1) degrees of freedom.

(137)

Therefore, the rejection region for the test is $H > \chi^2_\alpha$, where $\chi^2_\alpha$ is the value that locates α in the upper tail of the chi-square distribution with (k − 1) degrees of freedom.
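The ranking and rank-sum steps can be sketched in Python as below. This is an illustration under assumptions: the function names `average_ranks` and `kruskal_wallis_h` are introduced here, the statistic is the usual form H = 12/(n(n+1)) · Σ R_j²/n_j − 3(n+1), no correction for ties is applied beyond averaging the ranks, and the three small samples in the usage lines are made-up toy values chosen so the result is easy to check by hand.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # average of the 1-based positions pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = avg
        pos = end + 1
    return ranks

def kruskal_wallis_h(samples):
    """Kruskal-Wallis H computed from the rank sums of the pooled data."""
    pooled = [x for sample in samples for x in sample]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total = 0.0
    start = 0
    for sample in samples:
        r_j = sum(ranks[start:start + len(sample)])  # rank sum R_j of this sample
        total += r_j ** 2 / len(sample)
        start += len(sample)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

# Toy data (hypothetical): rank sums are 6, 15 and 24.
h_toy = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"H = {h_toy:.2f}")
```

The computed H is compared with the upper-tail chi-square value for (k − 1) degrees of freedom.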

(138)

KRUSKAL – WALLIS H TEST

FOR COMPARING k POPULATION PROBABILITY DISTRIBUTIONS

H₀ : The k population probability distributions are identical

H₁ : At least two of the k population probability distributions differ in location

Test statistic : $H = \frac{12}{n(n+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(n+1)$

where n_j is the number of measurements in sample j and R_j is the rank sum for sample j; the rank of each measurement is computed according to its relative magnitude in the totality of data for the k samples.

(139)

n = Total sample size = n₁ + n₂ + … + n_k

Rejection Region : $H > \chi^2_\alpha$ with (k − 1) dof

Assumptions :

1. The k samples are random and independent

2. There are 5 or more measurements in each sample

3. The observations can be ranked

No assumptions have to be made about the shape of the population probability distributions.

(140)

Example


Independent random samples of three different brands of magnetron tubes (the key components in microwave ovens) were subjected to stress testing, and the number of hours each operated without repair was recorded. Although these times do not represent typical life lengths, they do indicate how well the tubes can withstand extreme stress. The data are shown in the table below. Experience has shown that the distributions of life lengths for manufactured products are often nonnormal, thus violating the assumptions required for the proper use of an ANOVA F test.

(141)

BRAND

  A     B     C
 36    49    71
 48    33    31
  5    60   140
 67     2    59

(142)

Solution

Rank all the observations together and sum the ranks for each of the three samples.

H₀ : the population probability distributions of length of life under stress are identical for the three brands of magnetron tubes

versus

H₁ : at least two of the population probability distributions differ in location
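The ranking step can be sketched in Python as follows. The sketch uses only the four observations per brand that appear in the table above, so the resulting H need not match the value on the original slide, and with fewer than 5 measurements per sample the chi-square approximation is rough. The level α = 0.05 is an assumption made here. Because k − 1 = 2, the chi-square upper-tail probability is exp(−x/2), which gives the critical value −2 ln α in closed form.

```python
from math import log

# Hours of operation under stress, as shown in the table (four rows per brand).
brands = {
    "A": [36, 48, 5, 67],
    "B": [49, 33, 60, 2],
    "C": [71, 31, 140, 59],
}

pooled = sorted(x for xs in brands.values() for x in xs)
# There are no ties among these values, so each rank is the 1-based sorted position.
rank_of = {x: i for i, x in enumerate(pooled, start=1)}

rank_sums = {b: sum(rank_of[x] for x in xs) for b, xs in brands.items()}
n = len(pooled)

# H = 12 / (n (n + 1)) * sum(R_j^2 / n_j) - 3 (n + 1)
h = 12 / (n * (n + 1)) * sum(rank_sums[b] ** 2 / len(xs) for b, xs in brands.items()) - 3 * (n + 1)

alpha = 0.05                 # assumed significance level
critical = -2 * log(alpha)   # chi-square critical value, valid for 2 dof only
reject = h > critical

print(rank_sums)
print(f"H = {h:.3f}, critical value = {critical:.3f}, reject H0: {reject}")
```

With these rows H falls below the critical value, so H₀ would not be rejected at the assumed level.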

(143)

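Test statistic :

[Figure: sampling distribution f(H) with the observed value of H marked (about 1.2); the slide asks whether H₀ should be rejected.]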