2x2 Contingency Tables (3)

Odds Ratio and the Chi-Squared Test of Independence

Odds Ratio

“It occurs as a parameter in the most important type of model for categorical data”

Odds of Success

$$\text{odds} = \frac{\pi}{1 - \pi}$$

• The odds are nonnegative.

• The odds exceed 1 when a “success” is more likely than a “failure”.

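Odds Ratio in a 2x2 Table

          A1        A2
B1        π₁        1 − π₁
B2        π₂        1 − π₂

$$\text{odds}_1 = \frac{\pi_1}{1 - \pi_1}, \qquad \text{odds}_2 = \frac{\pi_2}{1 - \pi_2}$$

Odds ratio:

$$\theta = \frac{\text{odds}_1}{\text{odds}_2} = \frac{\pi_1 / (1 - \pi_1)}{\pi_2 / (1 - \pi_2)}$$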

Values of θ farther from 1.0 in a given direction represent stronger association.
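As a small illustration of the definitions above, the sketch below computes the odds and an odds ratio from two success probabilities. The function names and the probabilities 0.75 and 0.5 are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def odds(p: float) -> float:
    """Odds of success for a success probability p, with 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def odds_ratio(p1: float, p2: float) -> float:
    """Odds ratio theta = odds(p1) / odds(p2); p2 must be greater than 0."""
    return odds(p1) / odds(p2)


# Hypothetical probabilities, not taken from the slides.
print(odds(0.75))             # 3.0
print(odds_ratio(0.75, 0.5))  # 3.0
```

An odds of 3.0 means a success is three times as likely as a failure; an odds ratio of 1.0 would mean the two rows have the same odds.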

Properties of OR

• The odds ratio does not change value when the table orientation reverses so that the rows become the columns and the columns become the rows.

• Thus, it is unnecessary to identify one classification as a response variable in order to estimate θ.

• By contrast, the relative risk requires this, and its value also depends on whether it is applied to the first or to the second outcome category.
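These properties can be checked numerically. The sketch below uses hypothetical counts (not from the slides) and helper names introduced only for this example: it transposes a 2x2 table and compares the odds ratio and the relative risk before and after.

```python
def odds_ratio(table):
    """Cross-product ratio n11*n22 / (n12*n21) of a 2x2 table of counts."""
    (n11, n12), (n21, n22) = table
    return (n11 * n22) / (n12 * n21)


def relative_risk(table, outcome=0):
    """Ratio of the two row proportions for the chosen outcome column (0 or 1)."""
    row1, row2 = table
    return (row1[outcome] / sum(row1)) / (row2[outcome] / sum(row2))


def transpose(table):
    """Swap the rows and the columns of the table."""
    return [list(col) for col in zip(*table)]


# Hypothetical counts: rows are groups, columns are outcome categories.
counts = [[30, 70], [10, 90]]

or_original = odds_ratio(counts)
or_transposed = odds_ratio(transpose(counts))
rr_first = relative_risk(counts, outcome=0)
rr_second = relative_risk(counts, outcome=1)

print(or_original, or_transposed)  # the same value twice
print(rr_first, rr_second)         # two different relative risks
```

The odds ratio is identical for both orientations, while the relative risk for the first outcome (about 3) is not simply the reciprocal of the relative risk for the second outcome (about 0.78).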

Both variables are response variables

The odds ratio is also called the cross-product ratio, because it equals the ratio of the products π11π22 and π12π21 of cell probabilities from diagonally opposite cells:

$$\theta = \frac{\pi_{11}\pi_{22}}{\pi_{12}\pi_{21}}$$

The sample odds ratio equals the ratio of the sample odds in the two rows,

$$\hat{\theta} = \frac{n_{11}/n_{12}}{n_{21}/n_{22}} = \frac{n_{11}n_{22}}{n_{12}n_{21}}$$

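Illustration: aspirin and heart attacks

$$\text{odds}_1 = \frac{n_{11}}{n_{12}} = \frac{189}{10{,}845} = 0.0174, \qquad \text{odds}_2 = \frac{n_{21}}{n_{22}} = \frac{104}{10{,}933} = 0.0095$$

$$\hat{\theta} = \frac{\text{odds}_1}{\text{odds}_2} = \frac{0.0174}{0.0095} = 1.832$$

This also equals the cross-product ratio (189 × 10,933)/(10,845 × 104).

The estimated odds of a heart attack were 83% higher for the placebo group than for the aspirin group.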
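The arithmetic of this illustration can be reproduced in a few lines. This is a minimal sketch: the variable names are made up for the example, and the first row is taken to be the placebo group because that is the group with the higher odds.

```python
# Counts from the aspirin illustration: (heart attack, no heart attack).
placebo = (189, 10845)
aspirin = (104, 10933)

odds_placebo = placebo[0] / placebo[1]
odds_aspirin = aspirin[0] / aspirin[1]

# Ratio of the two sample odds, and the equivalent cross-product ratio.
theta_hat = odds_placebo / odds_aspirin
cross_product = (placebo[0] * aspirin[1]) / (placebo[1] * aspirin[0])

print(round(odds_placebo, 4))  # 0.0174
print(round(odds_aspirin, 4))  # 0.0095
print(round(theta_hat, 3))     # 1.832
```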

Inference for the Odds Ratio and Log Odds Ratio

• Unless the sample size is very large, the sampling distribution of the sample odds ratio is highly skewed.

• Because of this skewness, statistical inference for the odds ratio uses an equivalent alternative measure: its natural logarithm, log(θ).

• Independence corresponds to log(θ) = 0; that is, θ = 1 is equivalent to a log(θ) of 0.

• The log odds ratio is symmetric about 0.

• That is, reversing the rows or reversing the columns changes its sign. For example, log(2.0) = 0.7 and log(0.5) = −0.7; these two values represent the same strength of association.

• Doubling a log odds ratio corresponds to squaring an odds ratio.

• The sampling distribution of log(θ̂) is much less skewed and is close to bell-shaped.

• It is approximately normal, with mean log(θ) and standard error

$$SE = \sqrt{\frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}}$$

• The SE decreases as the cell counts increase.
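The symmetry of the log odds ratio and the doubling rule are easy to verify numerically. The sketch below uses θ = 2.0, the same illustrative value as the log(2.0) example.

```python
import math

theta = 2.0  # illustrative odds ratio

log_theta = math.log(theta)
log_reversed = math.log(1.0 / theta)  # odds ratio after reversing the rows

print(round(log_theta, 1), round(log_reversed, 1))  # 0.7 -0.7

# Doubling the log odds ratio corresponds to squaring the odds ratio.
doubled = math.exp(2.0 * log_theta)
print(round(doubled, 6))  # 4.0
```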

Confidence Interval for log(θ)

$$\log\hat{\theta} \pm z_{\alpha/2}\,(SE)$$

Illustration: aspirin data

• log(1.832) = 0.605

• Standard error = $\sqrt{1/189 + 1/10{,}933 + 1/104 + 1/10{,}845} = 0.123$

• 95% CI for log(θ): 0.605 ± 1.96(0.123), that is, (0.365, 0.846)

• 95% CI for θ: [exp(0.365), exp(0.846)] = (e^0.365, e^0.846) = (1.44, 2.33)

• Because the confidence interval for θ does not contain 1, the odds of a heart attack appear to differ between the two groups. We estimate that the odds of a heart attack are at least 44% higher for subjects taking placebo than for subjects taking aspirin.
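The interval for the aspirin data can be reproduced with a short function. This is a sketch under stated assumptions: the function name is hypothetical, the standard error is the usual large-sample formula for log(θ̂), and z = 1.96 is hard-coded for a 95% interval.

```python
import math


def log_odds_ratio_ci(n11, n12, n21, n22, z=1.96):
    """Wald interval for log(theta) and for theta from 2x2 cell counts.

    z = 1.96 is the standard normal quantile for a 95% interval.
    """
    log_theta = math.log((n11 * n22) / (n12 * n21))
    se = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    lower, upper = log_theta - z * se, log_theta + z * se
    # Exponentiate the endpoints to get the interval for theta itself.
    return log_theta, se, (lower, upper), (math.exp(lower), math.exp(upper))


log_theta, se, ci_log, ci_theta = log_odds_ratio_ci(189, 10845, 104, 10933)
print(round(log_theta, 3), round(se, 3))     # 0.605 0.123
print(tuple(round(x, 2) for x in ci_theta))  # (1.44, 2.33)
```

The interval is built on the log scale, where the normal approximation works better, and only then transformed back to the odds ratio scale.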

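Notes

• If any n_ij = 0, the sample odds ratio equals 0 or ∞ (or is undefined when both entries in a row or column are zero). A commonly used amended estimator adds 0.5 to each cell count:

$$\tilde{\theta} = \frac{(n_{11} + 0.5)(n_{22} + 0.5)}{(n_{12} + 0.5)(n_{21} + 0.5)}$$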

Relationship between OR and RR

$$\text{OR} = \frac{p_1/(1 - p_1)}{p_2/(1 - p_2)} = \text{RR} \times \frac{1 - p_2}{1 - p_1}$$

If p1 and p2 are both close to zero, the OR is approximately equal to the RR.

This relationship between the odds ratio and the relative risk is useful. For some data sets direct estimation of the relative risk is not possible, yet one can estimate the odds ratio and use it to approximate the relative risk.
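The aspirin counts give a concrete case where both proportions are small, so the two measures nearly coincide. The sketch below is only an illustration; the variable names are made up for the example, and the identity OR = RR × (1 − p2)/(1 − p1) follows directly from the definitions of the odds and the relative risk.

```python
# Aspirin illustration counts: (heart attack, no heart attack) per group.
n11, n12 = 189, 10845  # placebo
n21, n22 = 104, 10933  # aspirin

p1 = n11 / (n11 + n12)  # sample proportion with a heart attack, placebo
p2 = n21 / (n21 + n22)  # sample proportion with a heart attack, aspirin

relative_risk = p1 / p2
odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))

# Identity linking the two measures: OR = RR * (1 - p2) / (1 - p1).
via_identity = relative_risk * (1 - p2) / (1 - p1)

print(round(relative_risk, 2), round(odds_ratio, 2))  # 1.82 1.83
```

Because p1 and p2 are both below 0.02, the factor (1 − p2)/(1 − p1) is very close to 1.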

Odds Ratio in Case-Control Studies

• Table 2.4 refers to a study that investigated the relationship between smoking and myocardial infarction (MI).

• The first column refers to 262 young and middle-aged women (age < 69) admitted to 30 coronary care units in northern Italy with acute MI during a 5-year period.

• Each case was matched with two control patients admitted to the same hospitals with other acute disorders.

• The controls fall in the second column.

• All subjects were classified according to whether they had ever been smokers.

• The “yes” group consists of women who were current smokers or ex-smokers, whereas the “no” group consists of women who never were smokers. We refer to this variable as smoking status.

• The study, which uses a retrospective design to look into the past, is called a case–control study.

• Such studies are common in health-related applications, for instance to ensure a sufficiently large sample of subjects having the disease studied.

• The response variable is MI and the explanatory variable is smoking status. We cannot estimate the proportion suffering MI among smokers (or among non-smokers), because each MI case was deliberately matched with 2 controls.

• When the sampling design is retrospective, we can construct conditional distributions for the explanatory variable, within levels of the fixed response.

• For women suffering MI, the proportion who were smokers was 172/262 = 0.656, whereas for women who had not suffered MI the proportion of smokers was 173/519 = 0.333.

• In Table 2.4, the sample odds ratio is [0.656/(1 − 0.656)]/[0.333/(1 − 0.333)] = (172 × 346)/(173 × 90) = 3.8.

• The estimated odds of ever being a smoker were about 2 for the MI cases (i.e., 0.656/0.344) and about 1/2 for the controls (i.e., 0.333/0.667), yielding an odds ratio of about 2/(1/2) = 4.

• For Table 2.4, we cannot estimate the relative risk of MI or the difference of proportions suffering MI.

• The column totals are fixed by design (binomial sampling within each column), and the two samples are dependent because each MI case is paired with 2 controls.
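The case-control calculation can be sketched as follows. The four cell counts are recovered from the figures quoted in the text (172 of 262 cases and 173 of 519 controls were smokers); the variable names are made up for the example.

```python
# Table 2.4 counts recovered from the text: columns are MI cases and controls.
smokers = {"mi": 172, "control": 173}
nonsmokers = {"mi": 262 - 172, "control": 519 - 173}  # 90 and 346

# Proportion of smokers within each column (the column totals are fixed by design).
prop_smoker_mi = smokers["mi"] / 262
prop_smoker_control = smokers["control"] / 519

odds_mi = prop_smoker_mi / (1 - prop_smoker_mi)
odds_control = prop_smoker_control / (1 - prop_smoker_control)

theta_hat = odds_mi / odds_control
cross_product = (smokers["mi"] * nonsmokers["control"]) / (
    smokers["control"] * nonsmokers["mi"]
)

print(round(prop_smoker_mi, 3), round(prop_smoker_control, 3))  # 0.656 0.333
print(round(theta_hat, 1))                                      # 3.8
```

Only the odds ratio is computed here: the proportions are conditional on MI status, so a relative risk of MI cannot be obtained from them.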

Types of Observational Study

Assignment!! Find out the different types of observational studies, with an explanation and an example of each!!

How do we measure the strength of association between 2 variables??

Correlation (linear relationship), depending on the data: Pearson or Spearman.

Pearson chi-squared statistic

Introduced by Karl Pearson in 1900.

Chi-Squared Test of Independence

• Measures the association between two variables.

• Pearson and Spearman correlations cannot be applied to data measured on a nominal scale.

• The chi-squared test is used for nominal data in contingency tables.

A contingency table is a two-way table showing the contingency between two variables where the variables have been classified into mutually exclusive categories and the cell entries are frequencies.

Test Statistics (Pearson chi-squared and likelihood-ratio chi-squared)

• The Pearson statistic X² is a score statistic. (This means that X² is based on a covariance matrix for the counts that is estimated under H0.)

• The Pearson X² and likelihood-ratio G² provide separate test statistics, but they share many properties and usually provide the same conclusions.

• As the sample size grows, the sampling distribution converges to chi-squared more quickly for X² than for G².

• The chi-squared approximation is often poor for G² when some expected frequencies are less than about 5.
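For reference, the two statistics are usually written as below. The notation is an assumption that follows the common textbook convention: n_ij are the observed counts, μ̂_ij the estimated expected counts under independence, n_i+ and n_+j the row and column totals, and n the grand total.

```latex
\hat{\mu}_{ij} = \frac{n_{i+}\, n_{+j}}{n}, \qquad
X^2 = \sum_{i,j} \frac{(n_{ij} - \hat{\mu}_{ij})^2}{\hat{\mu}_{ij}}, \qquad
G^2 = 2 \sum_{i,j} n_{ij} \log\!\left(\frac{n_{ij}}{\hat{\mu}_{ij}}\right)
```

For an I × J table, both statistics are compared with a chi-squared distribution with (I − 1)(J − 1) degrees of freedom.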

Computing Expected Values

                   Party Identification
             Democrat   Independent   Republican   Total
Females        762         327           468        1557
Males          484         239           477        1200
Total         1246         566           945        2757

Expected count for the Females/Democrat cell:

1. 1246 × 1557 = 1,940,022
2. 1,940,022 / 2757 = 703.7
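The same expected-value calculation can be carried out for every cell at once, and the Pearson and likelihood-ratio statistics follow directly. This is a minimal sketch in plain Python; the cell counts are entered so that they add up to the column totals 1246, 566, 945 and the grand total 2757.

```python
import math

# Observed counts, chosen to be consistent with the table margins.
observed = [
    [762, 327, 468],  # Females
    [484, 239, 477],  # Males
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson X^2 and likelihood-ratio G^2, summed over all cells.
x2 = sum((o - e) ** 2 / e
         for obs_row, exp_row in zip(observed, expected)
         for o, e in zip(obs_row, exp_row))
g2 = 2 * sum(o * math.log(o / e)
             for obs_row, exp_row in zip(observed, expected)
             for o, e in zip(obs_row, exp_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(expected[0][0], 1))  # 703.7
print(round(x2, 2), round(g2, 2), df)
```

With a sample this large, the two statistics are expected to be close to each other.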

Illustration: smoker and lung cancer data

                  Lung Cancer
               Yes      No      Total
Smoker         120      30       150
Non-smoker      40      50        90
Total          160      80       240

Hypotheses

H₀: There is no association between smoking and lung cancer

H₁: There is an association between smoking and lung cancer

Odds ratio

$$\hat{\theta} = \frac{120 \times 50}{40 \times 30} = 5$$
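The odds ratio and the Pearson statistic for the smoker and lung cancer counts can be checked by hand or with a short script. The sketch below uses plain Python and made-up variable names; the value 3.84 in the comment is the 0.05 critical value of the chi-squared distribution with 1 degree of freedom.

```python
observed = [
    [120, 30],  # Smoker: lung cancer yes, no
    [40, 50],   # Non-smoker: lung cancer yes, no
]

row_totals = [sum(row) for row in observed]        # [150, 90]
col_totals = [sum(col) for col in zip(*observed)]  # [160, 80]
n = sum(row_totals)                                # 240

# Sample odds ratio as the cross-product ratio.
theta_hat = (observed[0][0] * observed[1][1]) / (observed[0][1] * observed[1][0])

# Expected counts under H0 and the Pearson chi-squared statistic.
expected = [[r * c / n for c in col_totals] for r in row_totals]
x2 = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
         for i in range(2) for j in range(2))

print(theta_hat)     # 5.0
print(expected)      # [[100.0, 50.0], [60.0, 30.0]]
print(round(x2, 2))  # 32.0, far above the critical value 3.84 (df = 1)
```

Since X² = 32 greatly exceeds 3.84, these data give strong evidence against H₀.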

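SAS Syntax

data aspirin;
  /* non_smoker has 10 characters; the default length of 8 would truncate it */
  length smoking $ 10;
  input smoking $ cancer $ frec;
  cards;
smoker yes 120
smoker no 30
non_smoker yes 40
non_smoker no 50
;

proc freq data=aspirin order=data;
  /* show observed and expected counts only */
  tables smoking*cancer / nopercent nocol norow expected;
  /* exact tests for the odds ratio and the chi-squared statistics */
  exact or chisq;
  /* frec holds the cell counts */
  weight frec;
run;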

Output

Changing the orientation of the contingency table

Warning!!

If more than 20% of the cells have an expected count < 5, we cannot use the chi-squared test.

Two solutions:

1. Merge categories
2. Use Fisher's exact test

Merging Categories

                                        Income
Electrical power     300,000 to 1,000,000              > 1,000,000 to 2,000,000   Total
                     (includes > 300,000 to 750,000)
450 & 900 watt                 37                               11                 48
1300 & 3500 watt                2                               10                 12
Total                          39                               21                 60
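Both remedies can be checked on the merged table. The sketch below first computes the share of cells with an expected count below 5, then a two-sided Fisher exact p-value from the hypergeometric distribution. The function name and the tie tolerance are assumptions made for this example; in practice a statistics package would normally be used.

```python
from math import comb

observed = [
    [37, 11],  # 450 & 900 watt
    [2, 10],   # 1300 & 3500 watt
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts under independence and the share of cells below 5.
expected = [[r * c / n for c in col_totals] for r in row_totals]
small = sum(e < 5 for row in expected for e in row)
share_small = small / 4


def fisher_exact_two_sided(table):
    """Two-sided Fisher exact p-value for a 2x2 table.

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    total = comb(r1 + r2, c1)

    def prob(x):
        return comb(r1, x) * comb(r2, c1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # The small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-7))


p_value = fisher_exact_two_sided(observed)
print(share_small)  # 0.25
print(p_value)
```

With these counts one of the four expected counts (4.2) is still below 5, so 25% of the cells fall under the threshold even after merging, and the exact test is the safer choice here.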