• Tidak ada hasil yang ditemukan

Sesi 3. Population, Parameter, Statistics and the Probability Application on Statistics Distribution

N/A
N/A
Protected

Academic year: 2018

Membagikan "Sesi 3. Population, Parameter, Statistics and the Probability Application on Statistics Distribution"

Copied!
72
0
0

Teks penuh

(1)

Basi c Probabi l i ty

Prof. dr. Siswanto Agus Wilopo, M.Sc., Sc.D.

Department of Biostatistics, Epidemiology and

Population Health

Faculty of Medicine

(2)

Learni ng Obj ecti ves

In this lecture, you learn:

Basic probability concepts and

definitions

Conditional probability

To use Bayes’ Theorem to revise

probabilities

(3)

Important Terms

Probability

– the chance that an

uncertain event will occur (always

between 0 and 1)

Event

– Each possible outcome of a

variable

Simple Event

– an event that can be

described by a single characteristic

Sample Space

– the collection of all

(4)

Assessi ng Probabi l i ty

There are three approaches to assessing the

probability of an uncertain event:

1.

a priori

classical probability

2.

empirical classical probability

3.

subjective probability

an individual judgment or opinion about

the probability of occurrence

outcomes

probabilit

observed

(5)

Sample Space

The

Sample Space

is the collection of all

possible events

e.g. All 6 faces of a die:

(6)

Events

Simple event

An outcome from a sample space with one

characteristic

e.g., A red card from a deck of cards

Complement of an event A (denoted A’)

All outcomes that are not part of event A

e.g., All cards that are not diamonds

Joint event

Involves two or more characteristics simultaneously

(7)

Vi sual i zi ng Events

Contingency Tables

Tree Diagrams

Red 2 24 26

Black 2 24 26

Total 4 48 52

Ace Not Ace Total

Full Deck

of 52 Cards

Sample

Space

Sample

Space

2

24

(8)

Vi sual i zi ng Events

Venn Diagrams

Let A = aces

Let B = red cards

A

B

A

B = ace and red

(9)

Mutual l y Excl usi ve Events

Mutually exclusive

events

Events that cannot occur together

example:

A = queen of diamonds; B = queen of clubs

(10)

Col l ecti vel y Exhausti ve Events

Collectively exhaustive

events

One of the events must occur

The set of events covers the entire sample

space

example:

A = aces; B = black cards;

C = diamonds; D = hearts

Events A, B, C and D are collectively

exhaustive (but not mutually exclusive – an

ace may also be a heart)

(11)

Probabi l i ty

Probability is the numerical

measure of the likelihood that an

event will occur

The probability of any event must

be between 0 and 1, inclusively

The sum of the probabilities of all

mutually exclusive and

collectively exhaustive events is 1

Certain

Impossible

0.5

1

0

0

P(A)

1

For any event A

1

P(C)

P(B)

P(A)

(12)

Computi ng Joi nt and

Margi nal Probabi l i ti es

The probability of a joint event, A and B:

Computing a marginal (or simple)

probability:

Where B

1

, B

2

, …, B

k

are k mutually exclusive and collectively

exhaustive events

(13)

Joi nt Probabi l i ty Exampl e

P(

Red

and

Ace

)

Black

Color

Type

Red

Total

Ace

2

2

4

Non-Ace

24

24

48

Total

26

26

52

52

2

cards

of

number

total

ace

and

red

are

that

cards

of

number

(14)

Margi nal Probabi l i ty Exampl e

P(

Ace

)

Black

Color

Type

Red

Total

Ace

2

2

4

Non-Ace

24

24

48

(15)

P(A

1

and B

2

)

P(A

1

)

Total

Event

Joi nt Probabi l i ti es Usi ng

Conti ngency Tabl e

P(A

2

and B

1

)

P(A

1

and B

1

)

Event

Total

1

Joint Probabilities

Marginal (Simple) Probabilities

A

1

A

2

B

1

B

2

P(B

1

)

P(B

2

)

(16)

General Addi ti on Rul e

P(A or B) = P(A) + P(B) - P(A and B)

General Addition Rule:

If A and B are mutually exclusive

, then

P(A and B) = 0, so the rule can be simplified:

P(A or B) = P(A) + P(B)

(17)

General Addi ti on Rul e Exampl e

P(

Red

or

Ace

) = P(

Red

) +P(

Ace

) - P(

Red

and

Ace)

=

26

/52 +

4

/52 -

2

/52 = 28/52

Don’t count

the two red

aces twice!

Black

Color

Type

Red

Total

Ace

2

2

4

Non-Ace

24

24

48

(18)

Computi ng Condi ti onal

Probabi l i ti es

A

conditional probability

is the probability of one

event, given that another event has occurred:

P(B)

P(A) = marginal probability of A

P(B) = marginal probability of B

The conditional

probability of A given

that B has occurred

The conditional

(19)

What is the probability that a car has a

CD player, given that it has AC ?

i.e., we want to find

P(CD | AC)

Condi ti onal Probabi l i ty Exampl e

(20)

Condi ti onal Probabi l i ty Exampl e

No CD

CD

Total

AC

0.2

0.5

0.7

No AC

0.2

0.1

0.3

Total

0.4

0.6

1.0

Of the cars on a used car lot,

70%

have air conditioning

(AC) and

40%

have a CD player (CD).

20%

of the cars have both

.

0.2857

0.7

0.2

P(AC)

AC)

and

P(CD

AC)

|

P(CD

(21)

Condi ti onal Probabi l i ty Exampl e

No CD

CD

Total

AC

0.2

0.5

0.7

No AC

0.2

0.1

0.3

Total

0.4

0.6

1.0

Given AC

, we only consider the top row (70% of the cars). Of these,

20% have a CD player. 20% of 70% is about 28.57%.

0.2857

0.7

0.2

P(AC)

AC)

and

P(CD

AC)

|

P(CD

(22)

Usi ng Deci si on Trees

P(AC and CD) = 0.2

P(AC and CD’) = 0.5

P(AC

and CD

) = 0.1

P(AC

and CD) = 0.2

7

.

5

.

3

.

2

.

1

.

All

Cars

7

.

2

.

(23)

Usi ng Deci si on Trees

P(CD and AC) = 0.2

P(CD and AC’) = 0.2

P(CD

and AC

) = 0.1

P(CD

and AC) = 0.5

4

.

2

.

6

.

5

.

1

.

All

Cars

4

.

2

.

Given CD or

no CD:

(24)

Stati sti cal Independence

Two events are

independent

if and

only if:

Events A and B are independent when the

probability of one event is not affected by the

other event

P(A)

B)

|

(25)

Mul ti pl i cati on Rul es

Multiplication rule for two events A

and B:

P(B)

B)

|

P(A

B)

and

P(A

P(A)

B)

|

P(A

Note:

If A and B are independent

, then

and the multiplication rule simplifies to

P(B)

P(A)

B)

and

(26)

Bayes’ Theorem

where:

B

i

= i

th

event of k mutually exclusive and

collectively exhaustive events

A = new event that might impact P(B

i

)

(27)

Margi nal Probabi l i ty

Marginal probability for event A:

Where B

1

, B

2

, …, B

k

are k mutually exclusive and

collectively exhaustive events

)

P(B

)

B

|

P(A

)

P(B

)

B

|

P(A

)

P(B

)

B

|

P(A

(28)

Practi ce probl em

If HIV has a prevalence of 3% in

Jayapura, and a particular HIV test

has a false positive rate of .001 and a

false negative rate of .01, what is the

probability that a random person

(29)

Answer

______________

1.0

P (+, test +)=.0297

P(+, test -)=.003

P(-, test +)=.00097

P(-, test -) = .96903

P(test +)=.0297+.00097=.03067

Marginal probability of carrying

the virus.

Joint probability of being + and

testing +

P(+&test+)

P(+)*P(test+)

.0297

.03*.03067 (=.00092)

Marginal probability of testing

positive

Conditional probability: the

probability of testing + given that

a person is +

P(+)=.03

P(-)=.97

P(test +)=.99

P(test - )= .01

P(test +) = .001

(30)

Law of total probabi l i ty

)P(HIV-)

/HIV

P(test

)

)P(HIV

/HIV

P(test

)

P(test

.97)

(

001

.

)

03

(.

99

.

)

P(test

(31)

Law of total probabi l i ty

Formal Rule: Marginal probability for event

A=

exclusive)

(mutually

(32)

Exampl e 2

A 54-year old woman has an abnormal

mammogram; what is the chance that

she has breast cancer?

Mammogram sensitivity=0.90 and

(33)

Exampl e: Mammography

______________

1.0

P(test +)=.90

P(BC+)=.003

P(BC-)=.997

P(test -) = .10

P(test +) = .11

P (+, test +)=.0027

P(+, test -)=.0003

P(-, test +)=.10967

P(-, test -) = .88733

P(test -) = .89

Marginal probabilities of breast cancer….(prevalence

among all 54-year olds)

sensitivity

specificity

(34)
(35)

Bayes’ Rul e: deri vati on

)

(

)

&

(

)

/

(

B

P

B

A

P

B

A

P

Definition:

Let A and B be two events with P(B)

0.

The conditional probability of A given B

is:

(36)

Bayes’ Rul e: deri vati on

can be re-arranged to:

)

and, since also:

(37)

Bayes’ Rul e:

Total Probability”

(38)

Bayes’ Rul e:

Why do we care??

Why is Bayes’ Rule useful??

It turns out that sometimes it is very

(39)

In-Cl ass Exerci se

If HIV has a prevalence of 3% in Jayapura,

and a particular HIV test has a false positive

rate of .001 and a false negative rate of .01,

what is the probability that a random

person who tests positive is actually

(40)

Answer: usi ng probabi l i ty tree

______________ 1.0

P(test +)=.99

P(+)=.03

P(-, test +)=.00097

P(-, test -) = .96903 P(test -) = .999

A positive test places one on either of the two “test +” branches.

But only the top branch also fulfills the event “true infection.”

Therefore, the probability of being infected is the probability of being on the top

branch given that you are on one of the two circled branches above.

(41)
(42)

In-cl ass exerci se

An insurance company believes that drivers can be

divided into two classes—those that are of high risk

and those that are of low risk. Their statistics show

that a high-risk driver will have an accident at some

time within a year with probability .4, but this

probability is only .1 for low risk drivers.

a)

Assuming that 20% of the drivers are high-risk, what is the

probability that a new policy holder will have an accident

within a year of purchasing a policy?

b)

If a new policy holder has an accident within a year of

(43)

Answer to (a)

Assuming that 20% of the drivers are of

high-risk, what is the probability that a new

policy holder will have an accident within a

year of purchasing a policy?

Use law of total probability:

P(accident)=

(44)

Answer to (b)

If a new policy holder has an accident within a year of

purchasing a policy, what is the probability that he is a high-risk

type driver?

P(high-risk/accident)=

P(accident/high risk)*P(high risk)/P(accident)

=.40(.20)/.16 = 50%

Or use tree:

P(accident/LR)=.1

______________

1.0 P( no acc/HR)=.6

P(accident/HR)=.4 P(high risk)=.20

P(accident, high risk)=.08

P(no accident, high risk)=.12) P(accident, low risk)=.08 P(low risk)=.80

P( no

accident/LR)=.9

P(no accident, low risk)=.72

(45)

Condi ti onal Probabi l i ty for

Epi demi ol ogy:

(46)

The Ri sk Rati o and the Odds Rati o as

condi ti onal probabi l i ty

In epidemiology, the association between a

risk factor or protective factor (exposure)

and a disease may be evaluated by the “risk

ratio” (RR) or the “odds ratio” (OR).

Both are measures of “relative risk”—the

(47)

Odds and Ri sk (probabi l i ty)

Definitions:

Risk = P(A) = cumulative probability (you specify the time

period!)

For example, what’s the probability that a person with a high

sugar intake develops diabetes in 1 year, 5 years, or over a

lifetime?

Odds = P(A)/P(~A)

For example, “the odds are 3 to 1 against a horse” means that

the horse has a 25% probability of winning.

(48)

Odds vs. Ri sk=probabi l i ty

If the risk is…

Then the odds are…

½ (50%)

¾ (75%)

1/ 10 (10%)

1/ 100 (1%)

Note: An odds is always higher than its corresponding probability,

unless the probability is 100%.

1:1

3:1

1:9

(49)

Introducti on to the 2x2 Tabl e

Exposure (E)

No Exposure (~E)

Disease (D)

a

b

a+b = P(D)

No Disease (~D)

c

d

c+d = P(~D)

a+c = P(E)

b+d = P(~E)

Marginal probability of

disease

(50)

Cohort Studi es

Target

population

Exposed

Not

Exposed

Disease-free

cohort

Disease

Disease-free

Disease

Disease-free

(51)

Exposure (E)

No Exposure (~E)

risk to the exposed

risk to the unexposed

(52)

400

400

1100

2600

0

.

2

3000

/

400

400

/

1500

RR

Hypotheti cal Data

Normal BP

Congestive

Heart Failure

No CHF

(53)

The odds rati o…

This risk ratio seems like the perfect measure

of relative risk. Why not stop here? Why

introduce the more complicated odds ratio??

We cannot calculate a risk ratio from a

case-control study. Case-case-control studies are a

popular study design in epidemiology,

because they are useful for studying rare

diseases.

In a case-control study, we sample

(54)

Target

population

Exposed in

past

Not exposed

Exposed

Not Exposed

Case-Control Studi es

Disease

(Cases)

No Disease

(55)

Hep C +

Hep C

-Cases: Liver

cancer

90

10

Controls

30

70

The Odds Ratio (OR)

What are P(D/E) and P(D/~E) here?

We can’t tell, because, by design, we have fixed the

proportion of liver cancer cases in this sample at 50% simply

by selecting half controls and half cases.

All these data give us is: P(E/D) and P(E/~D).

(56)

Hep C +

Hep C

-Cases: Liver

cancer

90

10

Controls

30

70

The Odds Ratio (OR)

Luckily, P(E/D) [=the quantity you have]

(

/

)

(

/

(

)

)

(

)

D

P

E

P

E

D

P

D

E

P

by Bayes’ Rule…

Unfortunately, our sampling scheme precludes calculation of the marginals: P(E) and P(D), but turns out we

don’t need these if we use an odds ratio because the marginals cancel out!

(57)

=

Odds of exposure in the controls Odds of exposure in the cases

Bayes’ Rule

Odds of disease in the unexposed Odds of disease in the exposed

What we want!

(58)

d

b

c

a

d

c

b

a

bc

ad

OR

Exposure (E)

No Exposure

(~E)

Disease (D)

a

b

No Disease (~D)

c

d

The Odds Ratio (OR)

Odds of disease

for the exposed

Odds of exposure

for the controls

Odds of exposure

for the cases.

Odds of disease

for

(59)

The rare di sease assumpti on

(60)

Exampl e

OR = 90*70/10*30 =21.0

Note: This indicates that those with Hep C infection have a 21-fold

increase in their

odds

of developing liver cancer (not in their

risk!).

The odds ratio will always be bigger than the corresponding risk

ratio if RR >1 and smaller if RR <1 (the harmful or protective

effect always appears larger)

The magnitude of the inflation depends on the prevalence of the

disease.

Hep C +

Hep C

-Cases: Liver

cancer

(61)

The odds rati o vs. the ri sk rati o

1.0 (null)

Odds ratio

Risk ratio

Risk ratio

Odds ratio

Odds ratio

Risk ratio

Risk ratio

Odds ratio

Rare Outcome

Common Outcome

(62)

In-Cl ass Exerci se:

1. Suppose the following data were collected on

a random sample of subjects (the researchers

did not sample on exposure or disease status).

Calculate the odds ratio and risk ratio for the association

between cell phone usage and neck pain.

Neck pain

No Neck Pain

Own a cell phone

143

209

Don’t own a cell

phone

22

69

(63)

Answer

OR = (69*143)/(22*209) = 2.15

RR = (143/352)/(22/91) = 1.68

Neck pain

No Neck

Pain

Own a cell

phone

143

209

Don’t own a

cell phone

(64)

In-Cl ass Exerci se:

2. Suppose the following data were

collected on a random sample of subjects

(the researchers did not sample on

exposure or disease status).

Calculate the odds ratio and risk ratio for the association

between cell phone usage and brain tumor.

Brain tumor

No brain tumor

Own a cell phone

5

347

Don’t own a cell

phone

3

88

(65)

Answer

OR = (5*88)/(3*347) = .42267

RR = (5/352)/(3/91) = .43087

Brain tumor

No brain tumor

Own a cell phone

5

347

Don’t own a cell

phone

(66)

Counti ng Rul es

Rules for counting the number of

possible outcomes

Counting Rule 1

:

If any one of k different mutually exclusive and

collectively exhaustive events can occur on each

of n trials, the number of possible outcomes is

equal to

(67)

Counti ng Rul es

Counting Rule 2

:

If there are k

1

events on the first trial, k

2

events

on the second trial, … and k

n

events on the n

th

trial, the number of possible outcomes is

Example:

You want to go to a park, eat at a restaurant, and see a

movie. There are 3 parks, 4 restaurants, and 6 movie

choices. How many different possible combinations are

there?

Answer: (3)(4)(6) = 72 different possibilities

(k

1

)(k

2

)…(k

n

)

(68)

Counti ng Rul es

Counting Rule 3

:

The number of ways that n items can be arranged

in order is

Example:

Your restaurant has five menu choices for lunch. How many

ways can you order them on your menu?

Answer: 5! = (5)(4)(3)(2)(1) = 120 different possibilities

n! = (n)(n – 1)…(1)

(69)

Counti ng Rul es

Counting Rule 4

:

Permutations

: The number of ways of arranging

X objects selected from n objects in order is

Example:

Your restaurant has five menu choices, and three are

selected for daily specials. How many different ways can

the specials menu be ordered?

Answer:

different possibilities

(70)

Counti ng Rul es

Counting Rule 5

:

Combinations

: The number of ways of selecting

X objects from n objects, irrespective of order, is

Example:

Your restaurant has five menu choices, and three are

selected for daily specials. How many different special

combinations are there, ignoring the order in which they are

selected?

Answer:

different possibilities

(71)

Chapter Summary

Discussed basic probability concepts

Sample spaces and events,

contingency tables, simple

probability, and joint probability

Examined basic probability rules

General addition rule, addition rule

(72)

Chapter Summary

Defined conditional probability

Statistical independence,

marginal probability, decision

trees, and the multiplication

rule

Discussed Bayes’ theorem

Referensi

Dokumen terkait

Keuntungan (kerugian) aktuarial program imbalan pasti g. Pajak penghasilan terkait dengan penghasilan komprehensif lain h. Selisih kuasi reorganisasi 4).. 21. Selisih

Program Frontpage XP merupakan Editor dari HTML (Hyper Text Markup Language) yang juga dasar dari pembuatan Website ini, didalam Html terdapat berbagai macam tag-tag dan atribut

Alienation in Haruki Murakami’s Blind Willow, Sleeping Woman.. Universitas Pendidikan Indonesia | repository.upi.edu | perpustakaan.upi.edu

- Otonomi Daerah, Pemerintahan Umum dan Administrasi Keuangan Urusan Pemerintahan. SATUAN KERJA

[r]

Skripsi ini berjudul “ANALISIS HUKUM TERHADAP PERUSAKAN TERUMBU KARANG DI TINJAU DARI PASAL 73 UNDANG-UNDANG NOMOR 1 TAHUN 2014 TENTANG PENGELOLAAN WILAYAH

Mutchler (dalam Santosa dan Wedari, 2007), menyatakan bahwa auditor lebih sering mengeluarkan modifikasi opini audit going concern pada perusahaan yang lebih kecil,

Hasil dari studi perbandingan tersebut diperoleh gambaran mengenai konsep kubur di Situs Bumi Rongsok, yaitu khusus konsep kubur batu berundak-undak di situs ini belum