2x2 Contingency Tables (3)
Odds Ratio and the Chi-Squared Test of Independence
Odds Ratio
[Diagram: association between an exposure and an outcome]
Odds Ratio
• most commonly used in case-control studies,
• can also be used in cross-sectional and cohort study designs (with some modifications and/or assumptions).
Odds Ratio
$$\text{OR} = \frac{\text{odds that an outcome will occur given a particular exposure}}{\text{odds of the outcome occurring in the absence of that exposure}}$$
Odds Ratio
Odds of success
$$\text{odds} = \frac{\pi}{1 - \pi}$$
• The odds are nonnegative.
• The odds exceed 1 when a “success” is more likely than a “failure”.
• When odds = 4.0, a success is four times as likely as a failure.
“It occurs as a parameter in the most important type of model for categorical data”
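A minimal Python sketch of the conversion between a success probability and its odds may help fix the definition. The function names are illustrative, not part of the lecture, and exact fractions are used only to keep the arithmetic free of rounding noise.

```python
from fractions import Fraction

def odds_from_probability(p):
    """Convert a success probability p (0 <= p < 1) to odds p / (1 - p)."""
    return p / (1 - p)

def probability_from_odds(odds):
    """Invert the conversion: probability = odds / (1 + odds)."""
    return odds / (1 + odds)

# Exact rational arithmetic avoids floating-point noise in the illustration.
p = Fraction(4, 5)                 # success probability 0.8
odds = odds_from_probability(p)    # a success is four times as likely as a failure
print(odds, probability_from_odds(odds))   # prints: 4 4/5
```

With a success probability of 0.8 the odds equal 4, matching the "four times as likely" reading above.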
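Odds Ratio in a 2x2 Table
          A1        A2
B1        π1        1 − π1
B2        π2        1 − π2
$$\text{odds}_1 = \frac{\pi_1}{1 - \pi_1}, \qquad \text{odds}_2 = \frac{\pi_2}{1 - \pi_2}$$
Odds ratio
$$\theta = \frac{\text{odds}_1}{\text{odds}_2} = \frac{\pi_1 / (1 - \pi_1)}{\pi_2 / (1 - \pi_2)}$$
Odds Ratio in a Cohort Study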
                 Develop Disease    Do Not Develop Disease
Exposed                 a                     b
Non-Exposed             c                     d
The odds that an exposed person develops disease: $a/b$
The odds that a non-exposed person develops disease: $c/d$
Odds Ratio: Cohort
• The odds ratio is the ratio of the odds of disease in the exposed to the odds of disease in the non-exposed.
$$\text{OR} = \frac{\text{odds that an exposed person develops the disease}}{\text{odds that a non-exposed person develops the disease}} = \frac{a/b}{c/d} = \frac{ad}{bc}$$
Odds Ratio in a Case-Control Study
                           Case    Control
History of Exposure          a        b
No History of Exposure       c        d
The odds that a case was exposed: $a/c$
The odds that a control was exposed: $b/d$
Odds Ratio: Case-Control
$$\text{OR} = \frac{\text{odds that a case was exposed}}{\text{odds that a control was exposed}} = \frac{a/c}{b/d} = \frac{ad}{bc}$$
The odds ratio (OR) is the ratio of the odds that a case was exposed to the odds that a control was exposed.
Properties of OR
• The odds ratio does not change value when the table orientation reverses so that the rows become the columns and the columns become the rows.
• Thus, it is unnecessary to identify one classification as a response variable in order to estimate θ.
• By contrast, the relative risk requires this, and its value also depends on whether it is applied to the first or to the second outcome category.
• The odds ratio can therefore be used even when both variables are response variables.
The odds ratio is also called the cross-product ratio, because it equals the ratio of the products $\pi_{11}\pi_{22}$ and $\pi_{12}\pi_{21}$ of cell probabilities from diagonally opposite cells.
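Illustration: aspirin and heart attack (row 1 = placebo, row 2 = aspirin)
$$\text{odds}_1 = \frac{n_{11}}{n_{12}} = \frac{189}{10845} = 0.0174$$
$$\text{odds}_2 = \frac{n_{21}}{n_{22}} = \frac{104}{10933} = 0.0095$$
$$\widehat{\text{OR}} = \frac{\text{odds}_1}{\text{odds}_2} = \frac{0.0174}{0.0095} = 1.832$$
This also equals the cross-product ratio (189 × 10,933)/(10,845 × 104).
The estimated odds were 83% higher for the placebo group.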
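The aspirin arithmetic can be re-checked with a short Python sketch. It uses the four counts quoted in the illustration. The helper names are illustrative, and the transpose step only demonstrates that swapping rows and columns leaves the sample odds ratio unchanged.

```python
# Counts from the aspirin illustration: rows = (placebo, aspirin),
# columns = (heart attack, no heart attack).
table = [[189, 10845],
         [104, 10933]]

def odds_ratio(t):
    """Sample odds ratio of a 2x2 table: (n11/n12) / (n21/n22)."""
    odds_row1 = t[0][0] / t[0][1]
    odds_row2 = t[1][0] / t[1][1]
    return odds_row1 / odds_row2

def cross_product_ratio(t):
    """Equivalent form: (n11 * n22) / (n12 * n21)."""
    return (t[0][0] * t[1][1]) / (t[0][1] * t[1][0])

def transpose(t):
    """Swap the roles of rows and columns."""
    return [[t[0][0], t[1][0]],
            [t[0][1], t[1][1]]]

or_hat = odds_ratio(table)
print(round(or_hat, 3))                        # 1.832
print(round(cross_product_ratio(table), 3))    # 1.832
print(round(odds_ratio(transpose(table)), 3))  # 1.832
```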
Inference for the Odds Ratio and the Log Odds Ratio
• Unless the sample size is very large, the sampling distribution of the OR is highly skewed.
• Because of this skewness, statistical inference for the odds ratio uses an alternative but equivalent measure: its natural logarithm, log(θ). Independence corresponds to log(θ) = 0.
• That is, θ = 1 is equivalent to a log odds ratio of 0.
• The log odds ratio is symmetric about 0.
• That is, reversing the order of the rows or of the columns changes its sign. For example, log(2.0) = 0.7 and log(0.5) = −0.7 represent the same strength of association.
• Doubling a log odds ratio corresponds to squaring an odds ratio.
• The sampling distribution of $\log\hat{\theta}$ is less skewed and closer to bell-shaped.
• It is approximately normal with mean log(θ) and standard error
$$SE = \sqrt{\frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}}$$
The SE decreases as the cell counts increase.
Confidence Interval for log(θ)
$$\log\hat{\theta} \pm z_{\alpha/2}\,(SE)$$
Illustration: aspirin data
• log(1.832) = 0.605
• Standard error = $\sqrt{1/189 + 1/10933 + 1/104 + 1/10845} = 0.123$
• 95% CI for log(θ): 0.605 ± 1.96(0.123) = (0.365, 0.846)
• 95% CI for θ: [exp(0.365), exp(0.846)] = $(e^{0.365}, e^{0.846})$ = (1.44, 2.33)
We estimate that the odds of heart attack are at least 44% higher for subjects taking placebo than for subjects taking aspirin.
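The interval for the aspirin data can be reproduced with the sketch below. It assumes the usual large-sample standard error of the log odds ratio, the square root of the sum of reciprocal cell counts, and z = 1.96 for a 95% interval. The function name and return layout are illustrative.

```python
import math

def log_odds_ratio_ci(n11, n12, n21, n22, z=1.96):
    """Wald interval for the odds ratio, built on the log scale.

    Returns (or_hat, se_log, lower, upper), where lower and upper are the
    exponentiated endpoints of log(or_hat) +/- z * se_log.
    """
    or_hat = (n11 * n22) / (n12 * n21)
    se_log = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    log_or = math.log(or_hat)
    lower = math.exp(log_or - z * se_log)
    upper = math.exp(log_or + z * se_log)
    return or_hat, se_log, lower, upper

or_hat, se_log, lower, upper = log_odds_ratio_ci(189, 10845, 104, 10933)
print(round(math.log(or_hat), 3), round(se_log, 3))  # 0.605 0.123
print(round(lower, 2), round(upper, 2))              # 1.44 2.33
```

Because the interval is built on the log scale and then exponentiated, it is not symmetric around 1.832.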
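Note
• If any cell count $n_{ij} = 0$, the sample odds ratio is 0 or ∞ (or undefined). The OR is then computed with the amended estimator
$$\tilde{\theta} = \frac{(n_{11} + 0.5)(n_{22} + 0.5)}{(n_{12} + 0.5)(n_{21} + 0.5)}$$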
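One common remedy for zero cells, assumed here, is to add 0.5 to every cell count before forming the cross-product ratio. The sketch below illustrates it on hypothetical counts that are not taken from the lecture data.

```python
def adjusted_odds_ratio(n11, n12, n21, n22, add=0.5):
    """Odds ratio after adding a small constant to every cell count."""
    return ((n11 + add) * (n22 + add)) / ((n12 + add) * (n21 + add))

# Hypothetical counts with an empty cell (not from the lecture data).
# The unadjusted estimate (10 * 15) / (0 * 5) would divide by zero.
print(round(adjusted_odds_ratio(10, 0, 5, 15), 2))   # 59.18
```

With large counts the adjustment barely moves the estimate, so it matters mainly for sparse tables.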
Relationship between OR and RR
$$\text{OR} = \frac{p_1/(1 - p_1)}{p_2/(1 - p_2)} = \text{RR} \times \frac{1 - p_2}{1 - p_1}$$
If p1 and p2 are both close to zero, the OR is approximately equal to the RR.
This relationship between the odds ratio and the relative risk is useful. For some data sets direct estimation of the relative risk is not possible, yet one can estimate the odds ratio and use it to approximate the relative risk.
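To see how close the two measures are when both proportions are small, the following sketch computes the relative risk and the odds ratio for the aspirin counts used earlier and checks the identity OR = RR × (1 − p2)/(1 − p1). The function name is illustrative.

```python
def risk_measures(n11, n12, n21, n22):
    """Return (p1, p2, relative_risk, odds_ratio) for a 2x2 table of counts."""
    p1 = n11 / (n11 + n12)
    p2 = n21 / (n21 + n22)
    rr = p1 / p2
    odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))
    return p1, p2, rr, odds_ratio

# Aspirin illustration: both heart-attack proportions are below 2%.
p1, p2, rr, odds_ratio = risk_measures(189, 10845, 104, 10933)
print(round(rr, 2), round(odds_ratio, 2))                   # 1.82 1.83
# Identity: OR = RR * (1 - p2) / (1 - p1)
print(abs(odds_ratio - rr * (1 - p2) / (1 - p1)) < 1e-12)   # True
```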
Odds Ratio in a Case-Control Study
• Table 2.4 refers to a study that investigated the relationship between smoking and myocardial infarction (MI).
• The first column refers to 262 young and middle-aged women (age < 69) admitted to 30 coronary care units in northern Italy with acute MI during a 5-year period.
• Each case was matched with two control patients admitted to the same hospitals with other acute disorders.
• The controls fall in the second column of the table.
• All subjects were classified according to whether they had ever been smokers.
• The “yes” group consists of women who were current smokers or ex-smokers, whereas the “no” group consists of women who never were smokers. We refer to this variable as smoking status.
• The study, which uses a retrospective design to look into the past, is called a case–control study.
• Such studies are common in health-related applications, for instance to ensure a sufficiently large sample of subjects having the disease studied.
We cannot estimate the proportion of MI sufferers among smokers (or non-smokers), because each MI case was paired with 2 controls.
For women suffering MI, the proportion who were smokers was 172/262 = 0.656, while for the controls it was 173/519 = 0.333.
(Table labels: MI status is the response variable; smoking status is the explanatory variable.)
When the sampling design is retrospective, we can construct conditional distributions for the explanatory variable, within levels of the fixed response.
• In Table 2.4, the sample odds ratio is [0.656/(1 −
0.656)]/[0.333/(1 − 0.333)] = (172 × 346)/(173 ×
90) = 3.8.
• The estimated odds of ever being a smoker were about 2 for the MI cases (i.e., 0.656/0.344) and about 1/2 for the controls (i.e., 0.333/0.667), yielding an odds ratio of about 2/(1/2) = 4.
• For Table 2.4, we cannot estimate the relative risk of MI or the difference of proportions suffering MI.
• Each column is a binomial sample; the samples are dependent because each MI case was paired with 2 controls.
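The sketch below illustrates why the odds ratio is still estimable here while a proportion with MI is not. It uses the counts implied by the text (172 of 262 cases and 173 of 519 controls were ever smokers). The tenfold increase in controls is a hypothetical redesign introduced only for illustration, and the function names are assumptions.

```python
# Counts implied by the text for Table 2.4:
# rows = (ever smoker, never smoker), columns = (MI cases, controls).
cases = (172, 90)        # 262 cases
controls = (173, 346)    # 519 controls, two per case

def odds_ratio(cases, controls):
    """Cross-product ratio (a * d) / (b * c) with a, c = cases and b, d = controls."""
    a, c = cases
    b, d = controls
    return (a * d) / (b * c)

def share_of_smokers_with_mi(cases, controls):
    """Naive 'proportion with MI among smokers'; depends on how many controls were drawn."""
    return cases[0] / (cases[0] + controls[0])

# Hypothetical redesign: ten times as many controls with the same smoking split.
more_controls = (controls[0] * 10, controls[1] * 10)

print(round(odds_ratio(cases, controls), 2), round(odds_ratio(cases, more_controls), 2))
# 3.82 3.82
print(round(share_of_smokers_with_mi(cases, controls), 3),
      round(share_of_smokers_with_mi(cases, more_controls), 3))
# 0.499 0.09
```

The odds ratio is unaffected by the number of controls per case, whereas the naive proportion is an artifact of the sampling design.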
How do we measure the strength of association between 2 variables?
• Correlation (linear relationship): Pearson, Spearman
• Nominal data? Pearson chi-squared statistic (1900)
Chi-Squared Test of Independence
• Assesses the association between two variables.
• The Pearson and Spearman correlations cannot be applied to data measured on a nominal scale.
• The chi-squared test is used for nominal data in a contingency table.
A contingency table is a two-way table showing the contingency between two variables where the variables have been classified into mutually exclusive categories.
Test Statistics (Pearson chi-squared and likelihood-ratio chi-squared)
• The Pearson statistic $X^2$ is a score statistic. (This means that $X^2$ is based on a covariance matrix for the counts that is estimated under H0.)
• The Pearson $X^2$ and likelihood-ratio $G^2$ provide separate test statistics, but they share many properties and usually provide the same conclusions.
• The convergence to the chi-squared distribution is quicker for $X^2$ than for $G^2$.
• The chi-squared approximation is often poor for $G^2$ when some expected frequencies are less than about 5.
Party Identification
            Democrat   Independent   Republican   Total
Females        762         327           468       1557
Males          484         239           477       1200
Total         1246         566           945       2757
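As a sketch of how the two statistics compare on the party identification data, the code below computes the expected counts under independence, the Pearson statistic, and the likelihood-ratio statistic. It uses the cell counts that agree with the column totals 1246, 566, 945 and the grand total 2757, which means a female row total of 1557 and 239 male Independents. The function names are illustrative.

```python
import math

observed = [[762, 327, 468],   # Females
            [484, 239, 477]]   # Males

def expected_counts(obs):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pearson_x2(obs, exp):
    """Pearson statistic: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e
               for o_row, e_row in zip(obs, exp)
               for o, e in zip(o_row, e_row))

def likelihood_ratio_g2(obs, exp):
    """Likelihood-ratio statistic: 2 * sum of observed * log(observed / expected)."""
    return 2 * sum(o * math.log(o / e)
                   for o_row, e_row in zip(obs, exp)
                   for o, e in zip(o_row, e_row)
                   if o > 0)

exp = expected_counts(observed)
x2 = pearson_x2(observed, exp)
g2 = likelihood_ratio_g2(observed, exp)
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(round(x2, 1), round(g2, 1), df)   # 30.1 30.0 2
```

The two statistics are nearly identical here, as expected when every expected count is large.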
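Computing Expected Values
Under independence, the expected count in cell (i, j) is (row i total × column j total) / n.
Illustration: smoker and lung cancer data
                Lung Cancer
                Yes     No     Total
Smoker          120     30      150
Non-Smoker       40     50       90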
Hypotheses
H0: There is no association between smoking habit and lung cancer
H1: There is an association between smoking habit and lung cancer
Odds Ratio
$$\widehat{\text{OR}} = \frac{120 \times 50}{40 \times 30} = 5$$
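Before turning to software output, the test for the smoker and lung cancer data can be worked by hand. The sketch below computes the expected counts under H0 and the Pearson statistic. It then obtains the p-value for 1 degree of freedom from the standard normal tail, so only the Python standard library is needed. The variable names are illustrative.

```python
import math

# Smoker / lung-cancer illustration: rows = (smoker, non-smoker),
# columns = (lung cancer yes, lung cancer no).
a, b = 120, 30
c, d = 40, 50

n = a + b + c + d
row_totals = (a + b, c + d)
col_totals = (a + c, b + d)

# Expected counts under H0 (independence): row total * column total / n.
expected = [[r * k / n for k in col_totals] for r in row_totals]

# Pearson statistic from its definition.
observed = [[a, b], [c, d]]
x2 = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
         for i in range(2) for j in range(2))

# Upper-tail probability of a chi-squared variable with 1 degree of freedom:
# P(Z^2 > x) = erfc(sqrt(x / 2)) for a standard normal Z.
p_value = math.erfc(math.sqrt(x2 / 2))

print(expected)         # [[100.0, 50.0], [60.0, 30.0]]
print(round(x2, 2))     # 32.0
print(p_value < 0.001)  # True
```

All expected counts are well above 5, so the chi-squared approximation is reasonable for this table.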
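SAS Syntax
/* Smoking by lung cancer counts, one row per cell of the 2x2 table */
data smoking_cancer;
   length smoking $ 10;   /* default $ length is 8, which would truncate "non_smoker" */
   input smoking $ cancer $ frec;
   cards;
smoker yes 120
smoker no 30
non_smoker yes 40
non_smoker no 50
;
run;
/* order=data keeps the table in input order (smoker first, "yes" first) */
proc freq data=smoking_cancer order=data;
   tables smoking*cancer / nopercent nocol norow expected;  /* counts and expected counts only */
   exact or chisq;   /* exact tests for the odds ratio and the chi-squared statistics */
   weight frec;      /* frec holds the cell counts */
run;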
Output
Warning!!
More than 20% of the cells with a value