Categorical Data Analysis Course Material


Academic year: 2017


(1)

Contingency Tables

1. Explain the χ² Test of Independence

(2)

Contingency Tables

• Tables representing all combinations of levels of explanatory and response variables
• Numbers in table represent counts of the number of cases in each cell
• Row and column totals are called marginal counts

(3)

2x2 Tables

• Each variable has 2 levels
• Explanatory Variable – Groups (typically based on demographics, exposure)

(4)

2x2 Tables - Notation

         | Outcome Present | Outcome Absent | Group Total
---------+-----------------+----------------+------------
Group 1  |       n11       |      n12       |     n1.
Group 2  |       n21       |      n22       |     n2.

Outcome

(5)

(6)

χ² Test of Independence

1. Shows if a relationship exists between 2 qualitative variables
   • One sample is drawn
   • Does not show causality
2. Assumptions
   • Multinomial experiment
   • All expected counts ≥ 5

(7)


(8)

χ² Test of Independence: Contingency Table

1. Shows the number of observations from 1 sample jointly in 2 qualitative variables

[Figure: contingency table layout; label: Levels of variable 2]
(9)


(10)


(11)

χ² Test of Independence: Hypotheses & Statistic

1. Hypotheses
   • H0: Variables are independent
   • Ha: Variables are related (dependent)
2. Test statistic

$$\chi^2 = \sum_{\text{all cells}} \frac{[n_{ij} - E(n_{ij})]^2}{E(n_{ij})}$$

   where $n_{ij}$ is the observed count and $E(n_{ij})$ is the expected count.

   Degrees of freedom: $(r - 1)(c - 1)$, where $r$ is the number of rows and $c$ is the number of columns.

(12)

χ² Test of Independence: Expected Counts

1. Statistical independence means joint probability equals product of marginal probabilities
2. Compute marginal probabilities & multiply for joint probability
3. Expected count is sample size times joint probability

(13)
(14)

              |      Location       |
House Style   | Urban     Rural     | Total
              | (Obs.)    (Obs.)    |
--------------+---------------------+------
Split-Level   |   63        49      |  112
Ranch         |   15        33      |   48
--------------+---------------------+------
Total         |   78        82      |  160

(15)


(16)


(17)


(18)

Expected Count Example

For the Split-Level, Urban cell of the House Style by Location table:

• Marginal probability (Split-Level) = 112/160
• Marginal probability (Urban) = 78/160
• Joint probability = (112/160)(78/160)
• Expected count = 160 · (112/160)(78/160) = 54.6

(19)
(20)


(21)

Expected Count Calculation

Expected count = (Row total)(Column total) / Sample size

              | Urban (Exp.)          Rural (Exp.)
--------------+------------------------------------------
Split-Level   | 112·78/160 = 54.6     112·82/160 = 57.4
Ranch         | 48·78/160 = 23.4      48·82/160 = 24.6
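The expected count rule above can be sketched in a few lines of Python. This is a minimal illustration only; the function name `expected_counts` and the nested-list table layout are assumptions made for this sketch, not part of the slides.

```python
def expected_counts(observed):
    """Expected cell counts under independence.

    Each expected count is (row total * column total) / sample size.
    `observed` is a list of rows, each row a list of cell counts.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# House Style (rows) by Location (columns: Urban, Rural), from the slides.
house_style = [[63, 49],   # Split-Level
               [15, 33]]   # Ranch

expected = expected_counts(house_style)
for label, row in zip(["Split-Level", "Ranch"], expected):
    print(label, [round(e, 1) for e in row])
```

Because the expected counts are built from the marginal totals, each row and column of the expected table sums to the same total as the observed table.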

(22)

χ² Test of Independence

You’re a marketing research analyst. You ask a random sample of 286 consumers if they purchase Diet Pepsi or Diet Coke. At the .05 level, is there evidence of a relationship?

            |   Diet Pepsi    |
Diet Coke   |   No      Yes   | Total
------------+-----------------+------
No          |   84       32   |  116
Yes         |   48      122   |  170
------------+-----------------+------
Total       |  132      154   |  286

(23)


(24)


(25)


(26)


(27)

χ² Test of Independence: Solution

• H0: No relationship
• Ha: Relationship
• α = .05
• df = (2 - 1)(2 - 1) = 1
• Critical value(s): χ² = 3.841; reject H0 if the test statistic exceeds 3.841

(28)

χ² Test of Independence: Solution

E(n_ij) ≥ 5 in all cells:

            |              Diet Pepsi
Diet Coke   |   No                     Yes
------------+----------------------------------------------
No          | 116·132/286 ≈ 53.5      154·116/286 ≈ 62.5
Yes         | 170·132/286 ≈ 78.5      170·154/286 ≈ 91.5

(29)

χ² Test of Independence: Solution

$$\chi^2 = \sum_{\text{all cells}} \frac{[n_{ij} - E(n_{ij})]^2}{E(n_{ij})} = \frac{[n_{11} - E(n_{11})]^2}{E(n_{11})} + \frac{[n_{12} - E(n_{12})]^2}{E(n_{12})} + \cdots + \frac{[n_{22} - E(n_{22})]^2}{E(n_{22})}$$

$$= \frac{(84 - 53.5)^2}{53.5} + \frac{(32 - 62.5)^2}{62.5} + \cdots + \frac{(122 - 91.5)^2}{91.5} = 54.29$$

(30)


(31)


(32)

χ² Test of Independence: Solution

• H0: No relationship
• Ha: Relationship
• α = .05
• df = (2 - 1)(2 - 1) = 1
• Critical value(s): χ² = 3.841; reject H0 if the test statistic exceeds 3.841
• Test statistic: χ² = 54.29
• Decision: Reject at α = .05
• Conclusion: There is evidence of a relationship
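The Diet Pepsi and Diet Coke solution can be checked with a short Python sketch. The function name `chi_square_statistic` is an assumption made for illustration; the critical value 3.841 is taken from the slides rather than computed. Note that this sketch uses unrounded expected counts, so it gives about 54.15, whereas the slides' 54.29 comes from expected counts rounded to one decimal place. The decision is the same either way.

```python
def chi_square_statistic(observed):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    min_expected = float("inf")
    for i, row in enumerate(observed):
        for j, n_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n  # expected count
            min_expected = min(min_expected, e_ij)
            stat += (n_ij - e_ij) ** 2 / e_ij
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df, min_expected


# Rows: Diet Coke (No, Yes); columns: Diet Pepsi (No, Yes).
observed = [[84, 32],
            [48, 122]]

CRITICAL_VALUE = 3.841  # chi-square critical value for df = 1, alpha = .05 (from the slides)

stat, df, min_expected = chi_square_statistic(observed)
reject_h0 = stat > CRITICAL_VALUE
print(f"chi-square = {stat:.2f}, df = {df}, smallest expected count = {min_expected:.1f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```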

(33)


(34)

Siskel and Ebert

(observed count above, expected count below)

        |        Ebert         |
Siskel  |  Con    Mix    Pro   | Total
--------+----------------------+-------
Con     |   24      8     13   |    45
        | 11.8    8.4   24.8   |  45.0
--------+----------------------+-------
Mix     |    8     13     11   |    32
        |  8.4    6.0   17.6   |  32.0
--------+----------------------+-------
Pro     |   10      9     64   |    83
        | 21.8   15.6   45.6   |  83.0
--------+----------------------+-------
Total   |   42     30     88   |   160
        | 42.0   30.0   88.0   | 160.0

(35)

Yates’ Statistic

Method of testing for association for 2x2 tables when sample size is moderate (total observations between 6 and 25)

$$\chi^2 = \sum_{i}\sum_{j} \frac{\left(|O_{ij} - e_{ij}| - 0.5\right)^2}{e_{ij}}$$

(36)


Measures of association

• Relative Risk
• Odds Ratio

(37)

Relative Risk

• Ratio of the probability that the outcome characteristic is present for one group, relative to the other
• Sample proportions with characteristic from groups 1 and 2:

$$\hat{\pi}_1 = \frac{n_{11}}{n_{1.}} \qquad \hat{\pi}_2 = \frac{n_{21}}{n_{2.}}$$

(38)

Relative Risk

• Estimated Relative Risk:

$$RR = \frac{\hat{\pi}_1}{\hat{\pi}_2}$$

• 95% Confidence Interval for Population Relative Risk:

$$\left( RR \cdot e^{-1.96\sqrt{v}},\ RR \cdot e^{1.96\sqrt{v}} \right) \qquad e = 2.71828 \qquad v = \frac{1 - \hat{\pi}_1}{n_{11}} + \frac{1 - \hat{\pi}_2}{n_{21}}$$

(39)

Relative Risk: Interpretation

– Conclude that the probability that the outcome is present is higher (in the population) for group 1 if the entire interval is above 1
– Conclude that the probability that the outcome is present is lower (in the population) for group 1 if the entire interval is below 1
– Do not conclude that the probability of the outcome differs for the two groups if the interval contains 1

(40)

Example - Coccidioidomycosis and TNFα-antagonists

• Research Question: Risk of developing Coccidioidomycosis associated with arthritis therapy?
• Groups: Patients receiving tumor necrosis factor α (TNFα) versus patients not receiving TNFα (all patients arthritic)

        | COC    No COC | Total
--------+---------------+------
TNFα    |   7      240  |  247
Other   |   4      734  |  738
--------+---------------+------
Total   |  11      974  |  985

(41)

Example - Coccidioidomycosis and TNFα-antagonists

• Group 1: Patients on TNFα
• Group 2: Patients not on TNFα

$$\hat{\pi}_1 = \frac{7}{247} = .0283 \qquad \hat{\pi}_2 = \frac{4}{738} = .0054$$

$$RR = \frac{\hat{\pi}_1}{\hat{\pi}_2} = \frac{.0283}{.0054} = 5.24 \qquad v = \frac{1 - .0283}{7} + \frac{1 - .0054}{4} = .3874$$

$$95\%\ CI:\ \left( 5.24\, e^{-1.96\sqrt{.3874}},\ 5.24\, e^{1.96\sqrt{.3874}} \right) = (1.55,\ 17.76)$$
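The relative risk calculation can be sketched in Python as follows. The function name `relative_risk_ci` and its argument names are assumptions made for this illustration. The sketch carries full precision, so its results differ slightly from the slide values, which round the proportions to four decimals first.

```python
import math


def relative_risk_ci(n11, n1_total, n21, n2_total, z=1.96):
    """Estimated relative risk and its approximate 95% confidence interval.

    n11, n21: counts with the outcome present in groups 1 and 2.
    n1_total, n2_total: group totals.
    """
    p1 = n11 / n1_total
    p2 = n21 / n2_total
    rr = p1 / p2
    v = (1 - p1) / n11 + (1 - p2) / n21  # estimated variance of ln(RR)
    half_width = z * math.sqrt(v)
    return rr, rr * math.exp(-half_width), rr * math.exp(half_width)


# Group 1: 7 of 247 patients on TNF; group 2: 4 of 738 patients not on TNF.
rr, lower, upper = relative_risk_ci(7, 247, 4, 738)
print(f"RR = {rr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The whole interval lies above 1, which matches the interpretation rule given earlier: the probability of the outcome is concluded to be higher for group 1.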

(42)

Odds Ratio

• Odds of an event is the probability it occurs divided by the probability it does not occur
• Odds ratio is the odds of the event for group 1 divided by the odds of the event for group 2
• Sample odds of the outcome for each group:

$$odds_1 = \frac{n_{11}/n_{1.}}{n_{12}/n_{1.}} = \frac{n_{11}}{n_{12}} \qquad odds_2 = \frac{n_{21}}{n_{22}}$$

(43)

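Odds Ratio

• Estimated Odds Ratio:

$$OR = \frac{odds_1}{odds_2} = \frac{n_{11}/n_{12}}{n_{21}/n_{22}} = \frac{n_{11}\, n_{22}}{n_{12}\, n_{21}}$$

• 95% Confidence Interval for Population Odds Ratio:

$$\left( OR \cdot e^{-1.96\sqrt{v}},\ OR \cdot e^{1.96\sqrt{v}} \right) \qquad e = 2.71828 \qquad v = \frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}$$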

(44)

Odds Ratio: Interpretation

– Conclude that the probability that the outcome is present is higher (in the population) for group 1 if the entire interval is above 1
– Conclude that the probability that the outcome is present is lower (in the population) for group 1 if the entire interval is below 1
– Do not conclude that the probability of the outcome differs for the two groups if the interval contains 1

(45)

Example - NSAIDs and GBM

• Case-Control Study (Retrospective)
  – Cases: 137 self-reporting patients with Glioblastoma Multiforme (GBM)
  – Controls: 401 population-based individuals matched to cases with respect to demographic factors

                | GBM Present   GBM Absent | Total
----------------+--------------------------+------
NSAID User      |     32           138     |  170
NSAID Non-User  |    105           263     |  368
----------------+--------------------------+------
Total           |    137           401     |  538

(46)

Example - NSAIDs and GBM

$$OR = \frac{32(263)}{138(105)} = \frac{8416}{14490} = 0.58 \qquad v = \frac{1}{32} + \frac{1}{138} + \frac{1}{105} + \frac{1}{263} = 0.0518$$

$$95\%\ CI:\ \left( 0.58\, e^{-1.96\sqrt{0.0518}},\ 0.58\, e^{1.96\sqrt{0.0518}} \right) = (0.37,\ 0.91)$$
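The odds ratio and its interval for the NSAID example can be reproduced with a short Python sketch. The function name `odds_ratio_ci` and its argument order are assumptions made for this illustration.

```python
import math


def odds_ratio_ci(n11, n12, n21, n22, z=1.96):
    """Estimated odds ratio and its approximate 95% confidence interval.

    Cell layout: n11, n12 are outcome present/absent in group 1;
    n21, n22 are outcome present/absent in group 2.
    """
    odds_ratio = (n11 * n22) / (n12 * n21)
    v = 1 / n11 + 1 / n12 + 1 / n21 + 1 / n22  # estimated variance of ln(OR)
    half_width = z * math.sqrt(v)
    return odds_ratio, odds_ratio * math.exp(-half_width), odds_ratio * math.exp(half_width)


# Group 1: NSAID users (32 GBM present, 138 absent).
# Group 2: NSAID non-users (105 GBM present, 263 absent).
odds_ratio, lower, upper = odds_ratio_ci(32, 138, 105, 263)
print(f"OR = {odds_ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Here the entire interval falls below 1, so by the interpretation rule the outcome is concluded to be less likely among NSAID users.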

(47)

Absolute Risk

• Difference between the proportions with an outcome characteristic for 2 groups
• Sample proportions with characteristic from groups 1 and 2:

$$\hat{\pi}_1 = \frac{n_{11}}{n_{1.}} \qquad \hat{\pi}_2 = \frac{n_{21}}{n_{2.}}$$

(48)

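Absolute Risk

• Estimated Absolute Risk:

$$AR = \hat{\pi}_1 - \hat{\pi}_2$$

• 95% Confidence Interval for Population Absolute Risk:

$$AR \pm 1.96 \sqrt{\frac{\hat{\pi}_1 (1 - \hat{\pi}_1)}{n_{1.}} + \frac{\hat{\pi}_2 (1 - \hat{\pi}_2)}{n_{2.}}}$$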

(49)

Absolute Risk: Interpretation

– Conclude that the probability that the outcome is present is higher (in the population) for group 1 if the entire interval is positive
– Conclude that the probability that the outcome is present is lower (in the population) for group 1 if the entire interval is negative
– Do not conclude that the probability of the outcome differs for the two groups if the interval contains 0
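As a rough illustration, the absolute risk and its interval can be computed in Python using the counts from the Coccidioidomycosis example (7 of 247 patients on TNF, 4 of 738 not on TNF). The function name `absolute_risk_ci` is an assumption made for this sketch, and the interval uses the usual large-sample standard error for a difference of two proportions.

```python
import math


def absolute_risk_ci(n11, n1_total, n21, n2_total, z=1.96):
    """Estimated absolute risk (difference in proportions) with an approximate 95% CI."""
    p1 = n11 / n1_total
    p2 = n21 / n2_total
    ar = p1 - p2
    std_error = math.sqrt(p1 * (1 - p1) / n1_total + p2 * (1 - p2) / n2_total)
    return ar, ar - z * std_error, ar + z * std_error


ar, lower, upper = absolute_risk_ci(7, 247, 4, 738)
print(f"AR = {ar:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")

# Apply the interpretation rule: compare the interval with 0.
if lower > 0:
    print("Interval entirely positive: outcome more likely in group 1")
elif upper < 0:
    print("Interval entirely negative: outcome less likely in group 1")
else:
    print("Interval contains 0: no difference concluded")
```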

(50)

Example - Coccidioidomycosis and TNFα-antagonists

• Group 1: Patients on TNFα
• Group 2: Patients not on TNFα

$$\hat{\pi}_1 = \frac{7}{247} = .0283 \qquad \hat{\pi}_2 = \frac{4}{738} = .0054 \qquad AR = \hat{\pi}_1 - \hat{\pi}_2 = .0283 - .0054 = .0229$$

$$95\%\ CI:\ .0229 \pm 1.96 \sqrt{\frac{.0283(.9717)}{247} + \frac{.0054(.9946)}{738}} = .0229 \pm .0213 = (0.0016,\ 0.0442)$$

Interval is entirely positive, TNF is

(51)

Ordinal Explanatory and Response Variables

• Pearson’s Chi-square test can be used to test associations among ordinal variables, but more powerful methods exist
• When theories exist that the association is directional (positive or negative), measures exist to describe and test for these specific alternatives from independence:
  – Gamma

(52)

Concordant and Discordant Pairs

• Concordant Pairs - Pairs of individuals where one individual scores “higher” on both ordered variables than the other individual
• Discordant Pairs - Pairs of individuals where one individual scores “higher” on one ordered variable and the other individual scores “lower” on the other
• C = # Concordant Pairs, D = # Discordant Pairs
  – Under positive association, expect C > D
  – Under negative association, expect C < D

(53)

Example - Alcohol Use and Sick Days

• Alcohol Risk (Without Risk, Hardly any Risk, Some to Considerable Risk)
• Sick Days (0, 1-6, ≥7)
• Concordant Pairs - Pairs of respondents where one scores higher on both alcohol risk and sick days than the other
• Discordant Pairs - Pairs of respondents where one scores higher on alcohol risk and the other scores higher on sick days

(54)

Example - Alcohol Use and Sick Days

ALCOHOL * SICKDAYS Crosstabulation (Count)

                        |          SICKDAYS           |
ALCOHOL                 | 0 days   1-6 days   7+ days | Total
------------------------+-----------------------------+------
Without Risk            |   347      113        145   |  605
Hardly any Risk         |   154       63         56   |  273
Some-Considerable Risk  |    52       25         34   |  111
------------------------+-----------------------------+------
Total                   |   553      201        235   |  989

• Concordant Pairs: Each individual in a given cell is concordant with each individual in cells “Southeast” of theirs

(55)


(56)

Measures of Association

• Goodman and Kruskal’s Gamma:

$$\hat{\gamma} = \frac{C - D}{C + D} \qquad -1 \le \hat{\gamma} \le 1$$

• Kendall’s τb:

$$\hat{\tau}_b = \frac{2(C - D)}{\sqrt{\left(n^2 - \sum_i n_{i.}^2\right)\left(n^2 - \sum_j n_{.j}^2\right)}}$$

When there’s no association between the ordinal variables, the population based values of these measures are 0.

(57)

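Example - Alcohol Use and Sick Days

$$\hat{\gamma} = \frac{C - D}{C + D} = \frac{83164 - 73496}{83164 + 73496} = 0.0617$$

Symmetric Measures

                                     | Value | Asymp. Std. Error(a) | Approx. T(b) | Approx. Sig.
-------------------------------------+-------+----------------------+--------------+-------------
Ordinal by Ordinal: Kendall's tau-b  | .035  | .030                 | 1.187        | .235
Ordinal by Ordinal: Gamma            | .062  | .052                 | 1.187        | .235
N of Valid Cases                     | 989   |                      |              |

a. Not assuming the null hypothesis.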
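The pair counts and both measures for the alcohol use and sick days table can be reproduced with the Python sketch below. The function names are assumptions made for illustration. Kendall's tau-b is computed in its standard tie-corrected form, which agrees with the reported value of .035.

```python
import math


def concordant_discordant(table):
    """Count concordant (C) and discordant (D) pairs in an ordered r x c table.

    A cell pairs concordantly with every cell below and to its right
    ("southeast") and discordantly with every cell below and to its left.
    """
    rows, cols = len(table), len(table[0])
    c = d = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):
                for m in range(cols):
                    if m > j:
                        c += table[i][j] * table[k][m]
                    elif m < j:
                        d += table[i][j] * table[k][m]
    return c, d


def gamma_and_tau_b(table):
    """Goodman and Kruskal's gamma and Kendall's tau-b for an ordered table."""
    c, d = concordant_discordant(table)
    n = sum(sum(row) for row in table)
    row_sq = sum(sum(row) ** 2 for row in table)
    col_sq = sum(sum(col) ** 2 for col in zip(*table))
    gamma = (c - d) / (c + d)
    tau_b = 2 * (c - d) / math.sqrt((n * n - row_sq) * (n * n - col_sq))
    return gamma, tau_b


# Rows: Without Risk, Hardly any Risk, Some-Considerable Risk.
# Columns: 0 days, 1-6 days, 7+ days.
alcohol = [[347, 113, 145],
           [154, 63, 56],
           [52, 25, 34]]

C, D = concordant_discordant(alcohol)
gamma, tau_b = gamma_and_tau_b(alcohol)
print(f"C = {C}, D = {D}, gamma = {gamma:.4f}, tau-b = {tau_b:.3f}")
```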
