2x2 Contingency Tables (3)
Odds Ratio and the Chi-Squared Test of Independence
Odds Ratio
“It occurs as a parameter in the most important type of model for categorical data”
Odds of Success
$\text{odds} = \pi / (1 - \pi)$
• The odds are nonnegative.
• The odds are greater than 1 when a "success" is more likely than a "failure".
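A minimal Python sketch of the odds computation, assuming the usual definition odds = π / (1 − π) for a success probability π. The function name `odds` and the two example probabilities are illustrative, not taken from the slides.

```python
def odds(p: float) -> float:
    """Return the odds p / (1 - p) for a success probability 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)

# Odds above 1: success is more likely than failure.
print(odds(0.75))
# Odds below 1: failure is more likely than success.
print(odds(0.20))
```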
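Odds Ratio in a 2x2 Table
        A1        A2
B1      π1        1 − π1
B2      π2        1 − π2
$\text{odds}_1 = \pi_1 / (1 - \pi_1)$
$\text{odds}_2 = \pi_2 / (1 - \pi_2)$
Odds ratio: $\theta = \dfrac{\text{odds}_1}{\text{odds}_2} = \dfrac{\pi_1 / (1 - \pi_1)}{\pi_2 / (1 - \pi_2)}$
Values of θ farther from 1.0 in a given direction represent stronger association.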
Properties of the OR
• The odds ratio does not change value when the table orientation reverses so that the rows become the columns and the columns become the rows.
• Thus, it is unnecessary to identify one classification as a response variable in order to estimate θ; both variables can be treated as response variables.
• By contrast, the relative risk requires this, and its value also depends on whether it is applied to the first or to the second outcome category.
The odds ratio is also called the cross-product ratio, because it equals the ratio of the products π11 π22 and π12 π21 of cell probabilities from diagonally opposite cells.
The sample odds ratio equals the ratio of the sample odds in the two rows:
$\hat\theta = \dfrac{n_{11}/n_{12}}{n_{21}/n_{22}} = \dfrac{n_{11}\, n_{22}}{n_{12}\, n_{21}}$
Illustration: aspirin and heart attack
$\text{odds}_1 = n_{11}/n_{12} = 189/10845 = 0.0174$
$\text{odds}_2 = n_{21}/n_{22} = 104/10933 = 0.0095$
$\hat\theta = \text{odds}_1 / \text{odds}_2 = 0.0174/0.0095 = 1.832$
This also equals the cross-product ratio (189 × 10,933)/(10,845 × 104).
The estimated odds of heart attack were 83% higher for the placebo group.
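The aspirin arithmetic can be checked with a short Python sketch. The function name `sample_odds_ratio` is illustrative; the four counts are the ones quoted in the slide, with the first row taken to be the placebo group.

```python
def sample_odds_ratio(n11, n12, n21, n22):
    """Ratio of the sample odds in row 1 to the sample odds in row 2."""
    odds_row1 = n11 / n12
    odds_row2 = n21 / n22
    return odds_row1 / odds_row2

# Counts from the slide: 189 vs 10845 (row 1), 104 vs 10933 (row 2).
theta_hat = sample_odds_ratio(189, 10845, 104, 10933)

# The same value written as a cross-product ratio.
cross_product = (189 * 10933) / (10845 * 104)

print(round(theta_hat, 3), round(cross_product, 3))
```

Working from the unrounded odds avoids the small rounding error that comes from dividing 0.0174 by 0.0095.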
Inference for the Odds Ratio and the Log Odds Ratio
• Unless the sample size is very large, the sampling distribution of the sample odds ratio is highly skewed.
• Because of this skewness, statistical inference for the odds ratio uses an alternative but equivalent measure, its natural logarithm, log(θ).
• Independence corresponds to log(θ) = 0; that is, θ = 1 is equivalent to a log odds ratio of 0.
• The log odds ratio is symmetric about 0.
• That is, reversing the order of the rows or of the columns changes its sign. For example, log(2.0) = 0.7 and log(0.5) = −0.7; these two values represent the same strength of association.
• Doubling a log odds ratio corresponds to squaring an odds ratio.
• The sampling distribution of the sample log odds ratio is less skewed and closer to bell-shaped.
• It is approximately normal with mean log(θ) and standard error
$SE = \sqrt{\dfrac{1}{n_{11}} + \dfrac{1}{n_{12}} + \dfrac{1}{n_{21}} + \dfrac{1}{n_{22}}}$
• The SE decreases as the cell counts increase.
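The symmetry and doubling properties of the log odds ratio can be verified numerically. This is a small Python sketch using θ = 2.0, the value from the example above; the variable names are illustrative.

```python
import math

theta = 2.0
log_theta = math.log(theta)

# Reversing the rows (or the columns) turns theta into 1/theta,
# which only flips the sign of the log odds ratio.
log_inverse = math.log(1.0 / theta)

# Doubling the log odds ratio squares the odds ratio itself.
doubled = math.exp(2.0 * log_theta)

print(round(log_theta, 3), round(log_inverse, 3), round(doubled, 3))
# prints: 0.693 -0.693 4.0
```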
Confidence Interval for log(θ)
$\log \hat\theta \pm z_{\alpha/2}\,(SE)$
Illustration: aspirin data
• log(1.832) = 0.605
• Standard error = $\sqrt{1/189 + 1/10933 + 1/104 + 1/10845} = 0.123$
• 95% CI for log(θ): 0.605 ± 1.96(0.123) → (0.365, 0.846)
• 95% CI for θ: [exp(0.365), exp(0.846)] = (1.44, 2.33)
• Because the confidence interval for θ does not contain 1.0, the odds of heart attack appear to differ between the two groups. We estimate that the odds of heart attack are at least 44% higher for subjects taking placebo than for subjects taking aspirin.
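The interval for the aspirin data can be reproduced with the following Python sketch. It assumes the usual large-sample standard error of the log odds ratio, the square root of the sum of reciprocal cell counts; the function name `log_odds_ratio_ci` and its return layout are illustrative.

```python
import math

def log_odds_ratio_ci(n11, n12, n21, n22, z=1.96):
    """Wald interval for log(theta) and the back-transformed interval for theta."""
    log_or = math.log((n11 * n22) / (n12 * n21))
    # Assumed large-sample standard error of the sample log odds ratio.
    se = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    lower, upper = log_or - z * se, log_or + z * se
    return log_or, se, (lower, upper), (math.exp(lower), math.exp(upper))

log_or, se, ci_log, ci_theta = log_odds_ratio_ci(189, 10845, 104, 10933)
print(round(log_or, 3), round(se, 3))
print([round(x, 3) for x in ci_log])
print([round(x, 2) for x in ci_theta])
```

Exponentiating the endpoints, rather than building an interval directly on θ, keeps the interval inside the positive range and reflects the skewness of the sample odds ratio.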
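Note
• If any cell count n_ij = 0, the sample odds ratio is 0 or ∞; in that case the odds ratio is computed with the amended estimator
$\tilde\theta = \dfrac{(n_{11} + 0.5)(n_{22} + 0.5)}{(n_{12} + 0.5)(n_{21} + 0.5)}$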
Relationship between the OR and the RR
$\text{odds ratio} = \dfrac{p_1/(1 - p_1)}{p_2/(1 - p_2)} = \text{relative risk} \times \dfrac{1 - p_2}{1 - p_1}$
When p1 and p2 are both close to zero, the OR is approximately equal to the RR.
This relationship between the odds ratio and the relative risk is useful. For some data sets direct estimation of the relative risk is not possible, yet one can estimate the odds ratio and use it to approximate the relative risk.
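As a numerical illustration, the Python sketch below compares the two measures on the aspirin counts used earlier in these slides, where both risks are small. The helper name `risks` is illustrative, and the row totals are obtained by adding the two counts in each row.

```python
def risks(n11, n12, n21, n22):
    """Sample proportions of the first outcome in row 1 and in row 2."""
    p1 = n11 / (n11 + n12)
    p2 = n21 / (n21 + n22)
    return p1, p2

p1, p2 = risks(189, 10845, 104, 10933)
relative_risk = p1 / p2
odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))

# With both risks below 2%, the two measures nearly coincide.
print(round(relative_risk, 3), round(odds_ratio, 3))
# prints: 1.818 1.832
```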
Odds Ratio in Case-Control Studies
• Table 2.4 refers to a study that investigated the relationship between smoking and myocardial infarction (MI).
• The first column refers to 262 young and middle-aged women (age < 69) admitted to 30 coronary care units in northern Italy with acute MI during a 5-year period.
• Each case was matched with two control patients admitted to the same hospitals with other acute disorders.
• The controls fall in the second column.
• All subjects were classified according to whether they had ever been smokers.
• The "yes" group consists of women who were current smokers or ex-smokers, whereas the "no" group consists of women who never were smokers. We refer to this variable as smoking status.
• The study, which uses a retrospective design to look into the past, is called a case-control study.
• Such studies are common in health-related applications, for instance to ensure a sufficiently large sample of subjects having the disease studied.
We cannot estimate the proportion of MI patients among smokers (or among non-smokers), because each MI case was paired with 2 controls.
Here MI is the response variable and smoking status is the explanatory variable.
When the sampling design is retrospective, we can construct conditional distributions for the explanatory variable, within levels of the fixed response.
For women with MI, the proportion who were smokers was 172/262 = 0.656, whereas for women without MI the proportion of smokers was 173/519 = 0.333.
• In Table 2.4, the sample odds ratio is [0.656/(1 − 0.656)]/[0.333/(1 − 0.333)] = (172 × 346)/(173 × 90) = 3.8.
• The estimated odds of ever being a smoker were about 2 for the MI cases (i.e., 0.656/0.344) and about 1/2 for the controls (i.e., 0.333/0.667), yielding an odds ratio of about 2/(1/2) = 4.
• For Table 2.4, we cannot estimate the relative risk of MI or the difference of proportions suffering MI.
• The binomial samples are the columns, and they are dependent because each MI case is paired with 2 controls.
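The case-control calculation can be sketched in Python as follows. The counts are the Table 2.4 values quoted above (172 and 173 ever-smokers, hence 90 and 346 never-smokers); the dictionary layout and function names are illustrative.

```python
# Table 2.4 counts: smoking status within MI cases and within controls.
cases = {"yes": 172, "no": 90}       # 262 women with MI
controls = {"yes": 173, "no": 346}   # 519 matched controls

def proportion_yes(group):
    """Proportion of ever-smokers within one column of the table."""
    return group["yes"] / (group["yes"] + group["no"])

def odds_yes(group):
    """Odds of being an ever-smoker within one column of the table."""
    return group["yes"] / group["no"]

p_cases = proportion_yes(cases)
p_controls = proportion_yes(controls)
odds_ratio = odds_yes(cases) / odds_yes(controls)

print(round(p_cases, 3), round(p_controls, 3), round(odds_ratio, 2))
```

Only the conditional distribution of smoking given MI status is estimated here, which is why the odds ratio is available while the relative risk of MI is not.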
Types of Observational Study
Assignment: Find out the various types of observational studies, with an explanation and an example of each.
How do we measure the strength of the relationship between 2 variables?
Correlation
• Linear relationship
• Data: Pearson, Spearman
1900: Pearson chi-squared statistic (Karl Pearson)
Chi-Squared Test of Independence
• Assesses the association between two variables.
• Pearson and Spearman correlations cannot be applied to data measured on a nominal scale.
• The chi-squared test is used for nominal data in a contingency table.
A contingency table is a two-way table showing the contingency between two variables where the variables have been classified into mutually exclusive categories and the cell entries are frequencies.
Test Statistics (Pearson chi-squared and likelihood-ratio chi-squared)
• Pearson statistic X2 is a score statistic. (This means that X2 is based on a covariance
matrix for the counts that is estimated under H0.)
• The Pearson X2 and likelihood-ratio G2 provide separate test statistics, but they share many properties and usually provide the same conclusions.
• The convergence to the chi-squared distribution is quicker for X2 than G2.
• The chi-squared approximation is often poor for G2 when some expected frequencies are less than about 5.
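For reference, the two statistics can be written as below. This uses the standard definitions for a two-way table with observed counts $n_{ij}$ and estimated expected frequencies $\hat\mu_{ij}$ under independence; the notation is an assumption, since the slides do not show the formulas.

```latex
\hat\mu_{ij} = \frac{n_{i+}\, n_{+j}}{n}, \qquad
X^2 = \sum_{i,j} \frac{(n_{ij} - \hat\mu_{ij})^2}{\hat\mu_{ij}}, \qquad
G^2 = 2 \sum_{i,j} n_{ij} \log\!\left(\frac{n_{ij}}{\hat\mu_{ij}}\right)
```

For an I × J table, both statistics are referred to a chi-squared distribution with (I − 1)(J − 1) degrees of freedom.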
Computing Expected Values
                   Party Identification
          Democrat   Independent   Republican   Total
Females      762         327          468        1557
Males        484         239          477        1200
Total       1246         566          945        2757
Expected frequency for the (Females, Democrat) cell:
1. 1246 × 1557 = 1,940,022
2. 1,940,022 / 2757 = 703.67 ≈ 703.7
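The same row-total × column-total / n rule gives every expected count at once. The Python sketch below assumes cell counts of (762, 327, 468) for females and (484, 239, 477) for males, chosen because their margins add up to 1246, 566, 945 and 2757 and reproduce the product 1,940,022 in step 1; the function name `expected_counts` is illustrative.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Assumed counts: rows are Females, Males;
# columns are Democrat, Independent, Republican.
party = [[762, 327, 468],
         [484, 239, 477]]

expected = expected_counts(party)
for row in expected:
    print([round(x, 1) for x in row])
```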
Illustration: smoker and lung cancer data
                 Lung Cancer
               Yes     No    Total
Smoker         120     30     150
Non-smoker      40     50      90
Total          160     80     240
Hypotheses
H0: There is no association between smoking habit and lung cancer
H1: There is an association between smoking habit and lung cancer
Odds Ratio
$\hat\theta = \dfrac{120 \times 50}{40 \times 30} = 5$
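SAS Syntax
data aspirin;
  length smoking $ 10;  /* avoid truncating "non_smoker" to the default 8 characters */
  input smoking $ cancer $ frec;  /* frec holds the cell count */
  cards;
smoker yes 120
smoker no 30
non_smoker yes 40
non_smoker no 50
;
proc freq data=aspirin order=data;
  tables smoking*cancer / nopercent nocol norow expected;
  exact or chisq;  /* exact inference for the odds ratio and chi-squared tests */
  weight frec;
run;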
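For readers without SAS, roughly the same quantities can be sketched in plain Python. This is not a reproduction of the PROC FREQ output: it computes only the Pearson X2, the likelihood-ratio G2, the sample odds ratio, and a large-sample p-value that is valid for 1 degree of freedom. The function names `chi_squared_tests` and `p_value_df1` are illustrative.

```python
import math

def chi_squared_tests(table):
    """Return (X2, G2, df) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            x2 += (observed - expected) ** 2 / expected
            if observed > 0:  # a zero cell contributes 0 to G2
                g2 += 2.0 * observed * math.log(observed / expected)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return x2, g2, df

def p_value_df1(stat):
    """Upper-tail chi-squared probability, valid for df = 1 only."""
    return math.erfc(math.sqrt(stat / 2.0))

# Rows: smoker, non-smoker; columns: lung cancer yes, no.
smoking = [[120, 30],
           [40, 50]]
x2, g2, df = chi_squared_tests(smoking)
odds_ratio = (120 * 50) / (30 * 40)
print(round(x2, 2), round(g2, 2), df, odds_ratio, p_value_df1(x2))
```

Both statistics are far in the upper tail of the chi-squared distribution with 1 degree of freedom, so the data give strong evidence against H0.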
Output
Reversing the orientation of the contingency table
Warning!
If more than 20% of the cells have expected counts < 5, we cannot use the chi-squared test.
Two solutions:
1. Combine categories
2. Use Fisher's exact test
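As an illustration of the second solution, here is a minimal Python sketch of a two-sided Fisher exact test for a 2x2 table. It conditions on the margins and sums the hypergeometric probabilities of all tables no more likely than the observed one, which is one common definition of the two-sided p-value. The function name and the small example table are hypothetical and do not come from the slides.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # given the fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum over tables that are no more probable than the observed table;
    # the small tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical small table whose expected counts are all below 5.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```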
Combining Categories
Electrical Power