Modeling Uncertainties
Probability
• Uncertainty is represented using probability
• Probability:
• Frequentist view
• Long-run frequency of events
• Subjective view
• Representation of our subjective degree of belief
Events
• An event is a distinction about some state of the world
• Example:
• Whether the next person entering the room is a heavy smoker
• Whether it will be raining tonight
• Who our next head of department will be
• Etc
Clarity test
• When we identify an event, we have in mind what we mean. But will other people know precisely what we mean?
• Even we may not have a precise definition of what we have in mind
• To avoid ambiguity, every event should pass the clarity test
• Clarity test: a check to ensure that we are absolutely clear and precise about the definition of every event we are dealing with in a decision problem
• The clarity test is conducted by submitting our definition of each event to a clairvoyant
• A clairvoyant is a hypothetical being who is:
• Competent and trustworthy
• Knows the outcome of any past and future event
• Knows the value of any physically defined quantity both in the past and future
• Has infinite computational (mental) power and is able to perform any reasoning and computation instantly and without any effort
Clarity test (cont’d)
• Passing the clarity test:
• If and only if the clairvoyant can tell its outcome without any further judgment
• Example:
• The next person entering this room is a heavy smoker
• What is a heavy smoker?
• The next person entering this room is a graduate
• What is a graduate?
Possibility tree
• Single event tree
• Example: event “the next person entering this room is a businessman”
• Suppose B represents a businessman and B’ otherwise
Possibility tree
• Two-event trees
• Simultaneously consider several events
• Example: event “the next person entering this room is a businessman” and event “the next person entering this room is a graduate” can be jointly considered
Possibility tree
• An uncertain variable is drawn as a circle, called a chance node
• The decision maker has no control over the outcome of this variable
• Each outcome of the uncertain variable is drawn as a branch
• The number of branches equals the number of possible outcomes
• The number of nodes equals the number of variables involved
Possibility tree
• Two uncertain variables
• Whether the two variables are related can only be seen from the values of their conditional probabilities
• Prior/marginal probability: P(I succeeds), P(I fails)
• Conditional probability: P(II succeeds|I succeeds), P(II fails|I succeeds), P(II succeeds|I fails), P(II fails|I fails)
• Joint probability: P(II succeeds and I succeeds), P(II fails and I succeeds), P(II succeeds and I fails), P(II fails and I fails)
Consider the following case…
• A container holds 10 balls: 5 red and 5 blue.
• One ball is drawn twice, under the following scenarios:
• Scenario 1: after the first draw, the ball is returned to the container
• Scenario 2: after the first draw, the ball is not returned to the container
• For each scenario, what is the probability of drawing a red ball first and a blue ball second?
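The two scenarios of the ball-drawing case can be worked out with exact fractions. The sketch below is one possible way to do it; the function name `red_then_blue` is an illustrative choice, not part of the slides.

```python
from fractions import Fraction

def red_then_blue(red: int, blue: int, replace: bool) -> Fraction:
    """Probability of drawing a red ball first and a blue ball second."""
    total = red + blue
    p_red_first = Fraction(red, total)
    # With replacement the container is unchanged for the second draw;
    # without replacement one (red) ball is missing.
    total_second = total if replace else total - 1
    p_blue_second = Fraction(blue, total_second)
    return p_red_first * p_blue_second

scenario_1 = red_then_blue(5, 5, replace=True)    # ball returned
scenario_2 = red_then_blue(5, 5, replace=False)   # ball not returned
print(scenario_1, float(scenario_1))
print(scenario_2, float(scenario_2))
```

The results are 1/4 with replacement and 5/18 (about 0.278) without, because removing a red ball raises the share of blue balls in the second draw.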
Probabilistic dependency or relevance
• Suppose
• A is a variable with n possible outcomes ai, i=1,…,n
• B is a variable with m possible outcomes bj, j=1,…,m
• Variable A is probabilistically dependent on variable B if p(A|bj, ξ) ≠ p(A|bk, ξ) for some j ≠ k
• Variable A is probabilistically independent of variable B if p(A|bj, ξ) = p(A|bk, ξ) for all j ≠ k
• Hence, if A is independent of B, then p(A|B, ξ) = p(A|ξ)
• Therefore, if the variables are independent of each other, knowing the outcome of one variable gives no information about the outcome of the other.
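The definition can be checked mechanically: A is independent of B when the distribution p(A|bj) is the same for every outcome bj. The sketch below applies this to the second draw in the 5-red, 5-blue ball case; the helper `is_independent` and the dictionary layout are assumptions made for illustration.

```python
from fractions import Fraction

def is_independent(cpt: dict) -> bool:
    """cpt maps each outcome b_j of B to the distribution p(A | b_j).
    A is independent of B when every row is the same distribution."""
    rows = list(cpt.values())
    return all(row == rows[0] for row in rows[1:])

# p(second draw | first draw) for a container of 5 red and 5 blue balls.
with_replacement = {
    "first red":  {"red": Fraction(5, 10), "blue": Fraction(5, 10)},
    "first blue": {"red": Fraction(5, 10), "blue": Fraction(5, 10)},
}
without_replacement = {
    "first red":  {"red": Fraction(4, 9), "blue": Fraction(5, 9)},
    "first blue": {"red": Fraction(5, 9), "blue": Fraction(4, 9)},
}
print(is_independent(with_replacement))     # True
print(is_independent(without_replacement))  # False
```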
Marginal and conditional probabilities
• In general, given information about the outcome of some events, we may revise our probabilities of other events
• We do this through the use of conditional probabilities
• The probability of an event X given specific outcomes of another event Y is called the conditional probability of X given Y
• The conditional probability of event X given event Y and other background information ξ is denoted by p(X|Y, ξ) and is given by
$$p(X \mid Y, \xi) = \frac{p(X, Y \mid \xi)}{p(Y \mid \xi)} \quad \text{for } p(Y \mid \xi) > 0$$
Factorization rule for joint probability
• p(X, Y|ξ) = p(X|Y, ξ) p(Y|ξ) = p(Y|X, ξ) p(X|ξ)
Bayes’ Theorem
• Given two uncertain events X and Y, suppose the probabilities p(X|ξ) and p(Y|X, ξ) are known. Then
$$p(X \mid Y, \xi) = \frac{p(Y \mid X, \xi)\, p(X \mid \xi)}{p(Y \mid \xi)}$$
where
$$p(Y \mid \xi) = \sum_{X} p(Y \mid X, \xi)\, p(X \mid \xi)$$
What if the order changes?
• Flipping the tree
• The joint probabilities stay the same
• Compute the marginal probabilities in the new tree
• Compute the conditional probabilities
• Use Bayes’ theorem
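The flipping steps above can be sketched as a small function. This is a minimal illustration, not a prescribed implementation: the name `flip_tree`, the dictionary layout, and the numbers in the usage example are all hypothetical.

```python
def flip_tree(prior, likelihood):
    """Flip a two-event tree X -> Y into Y -> X.

    prior[x] = p(x); likelihood[x][y] = p(y | x).
    Returns (marginal of Y, posterior) with posterior[y][x] = p(x | y).
    """
    # Step 1: the joint probabilities are the same in both trees.
    joint = {(x, y): prior[x] * p
             for x in prior for y, p in likelihood[x].items()}
    # Step 2: marginal of Y = sum of the joint over X.
    marginal = {}
    for (x, y), p in joint.items():
        marginal[y] = marginal.get(y, 0.0) + p
    # Step 3: conditional of X given Y (Bayes' theorem).
    posterior = {y: {} for y in marginal}
    for (x, y), p in joint.items():
        posterior[y][x] = p / marginal[y]
    return marginal, posterior

# Hypothetical numbers, for illustration only.
prior = {"x1": 0.3, "x2": 0.7}
likelihood = {"x1": {"y1": 0.9, "y2": 0.1},
              "x2": {"y1": 0.2, "y2": 0.8}}
marginal, posterior = flip_tree(prior, likelihood)
print(marginal)
print(posterior["y1"])
```

The sketch assumes every outcome of Y has nonzero marginal probability; otherwise the division in step 3 is undefined, matching the p(Y|ξ) > 0 condition in the definition of conditional probability.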
Example
• Suppose laptop batteries of brand X are supplied by 3 suppliers, A, B, and C, with shares of 50%, 30%, and 20% respectively.
• Based on a field survey, the performance of the 3 suppliers is as follows:
• Supplier A: 0.5% of its products on the market are defective
• Supplier B: 1% of its products on the market are defective
• Supplier C: 0.8% of its products on the market are defective
• Suppose a failed battery is found in the field. What is the probability that it came from Supplier A?
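One way to work the supplier example is to apply Bayes’ theorem directly to the given shares and defect rates. The variable names below are illustrative.

```python
share = {"A": 0.50, "B": 0.30, "C": 0.20}           # p(supplier)
defect_rate = {"A": 0.005, "B": 0.010, "C": 0.008}  # p(defect | supplier)

# Joint p(supplier, defect), then the marginal p(defect).
joint = {s: share[s] * defect_rate[s] for s in share}
p_defect = sum(joint.values())

# Bayes' theorem: p(supplier | defect).
posterior = {s: joint[s] / p_defect for s in share}
for s, p in posterior.items():
    print(f"p({s} | defect) = {p:.3f}")
```

The marginal defect probability is 0.0071, so p(A | defect) = 0.0025 / 0.0071, about 0.352. Although A supplies half of all batteries, B is the more likely source of a failed battery because of its higher defect rate.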
Conditional independence or relevance
• Suppose we are given 2 events, A and B, and they are found to be not independent
• Introduce event C with 2 outcomes, c1 and c2
• If C=c1 is true, we have p(A|B, c1, ξ)=p(A|c1, ξ)
• If C=c2 is true, we have p(A|B, c2, ξ)=p(A|c2, ξ)
• Then we say that event A is conditionally independent of event B given event C
• Definition (Conditional Independence): given 3 distinct events A, B, and C, if p(A|B, ck, ξ)=p(A|ck, ξ) for all k, that is, for every realization of C the conditional probability table (CPT) for A given B and C repeats across the outcomes of B, then we say that A and B are conditionally independent given C, denoted by A ⊥ B | C
Conditional independence (cont’d)
• If A ⊥ B | C, then p(A|B, C, ξ)=p(A|C, ξ)
• Example:
Given the following conditional probabilities:
p(a1|b1, c1)= 0.9   p(a2|b1, c1)= 0.1
p(a1|b2, c1)= 0.9   p(a2|b2, c1)= 0.1
p(a1|b1, c2)= 0.8   p(a2|b1, c2)= 0.2
p(a1|b2, c2)= 0.8   p(a2|b2, c2)= 0.2
we conclude that A ⊥ B | C
• Note that A is not (marginally) independent of B unless, with more information, we can show that p(a1|b1) = p(a1|b2)
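The conclusion of the example can be verified by checking that, for each outcome of C, the value p(a1|b, c) does not change with b. The sketch below encodes the table from the example; the function name and the tolerance parameter are assumptions for illustration. Because A has only two outcomes, checking a1 is sufficient.

```python
# p(a1 | b, c) from the example; p(a2 | b, c) = 1 - p(a1 | b, c).
p_a1 = {
    ("b1", "c1"): 0.9, ("b2", "c1"): 0.9,
    ("b1", "c2"): 0.8, ("b2", "c2"): 0.8,
}

def conditionally_independent(cpt, tol=1e-12):
    """True if, for every c, p(a1 | b, c) is the same for all b."""
    by_c = {}
    for (b, c), p in cpt.items():
        by_c.setdefault(c, []).append(p)
    return all(max(ps) - min(ps) <= tol for ps in by_c.values())

print(conditionally_independent(p_a1))  # True
```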
Joint probability distribution from conditional probability distributions
• Recall, by factorization rule, the joint probability for A, B, and C is p(A, B, C|ξ)= p(A| B, C, ξ)p(B|C, ξ)p(C| ξ)
• If A is independent of B given C, then since p(A| B, C, ξ) = p(A|C, ξ) we have
p(A, B, C|ξ)= p(A| C, ξ)p(B|C, ξ)p(C| ξ)
Application of conditional probability
Direct conditioning: Relevance of smoking to lung cancer
• Suppose:
S: A person is a heavy smoker, defined as having smoked at least two packs of cigarettes per day for a period of at least 10 years during a lifetime
L: A person has lung cancer according to the standard medical definition
• A doctor not associated with lung cancer treatment assigned the following probabilities:
Relevance of smoking to lung cancer (cont’d)
• A lung cancer specialist remarked: “The probability p(L1|S1, ξ) = 0.1 is too low”
• When asked to explain why, he said:
“Because in all these years as a lung cancer specialist, whenever I visited my lung cancer ward, it is always full of smokers.”
• What’s wrong with the above statement?
• The answer can be found by flipping the tree:
Relevance of smoking to lung cancer (cont’d)
• What the specialist referred to as “high” is actually the probability of a person being a smoker given that he has lung cancer, i.e., p(S1|L1, ξ) = 0.769
• He has confused p(S1|L1, ξ) with p(L1|S1, ξ)
• Notice that p(L1|S1, ξ) << p(S1|L1, ξ)
• Hence even a highly trained professional can fall victim to wrong reasoning
Let’s Make a Deal Game Show
Rules:
• Consider the TV game show where the contestant is shown on stage three boxes, one of which contains a valuable prize; the other two are empty
• The rules of the game are that the contestant first chooses one of the boxes. Then the game show host, who knows the location of the prize, opens one of the remaining two boxes, making sure to open an empty one.
• The contestant then gets to decide whether to stick with his initial selection or switch to the remaining unopened box.
• If the prize is in the box that he finally chooses, he wins the prize
Question:
If at the start of the game the contestant chose box A and the host opened box B, should the contestant keep box A or switch to box C?
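The question can be treated as a Bayes’ theorem exercise, with the host opening box B as the evidence. The sketch below rests on two assumptions that the rules do not state: the prize is equally likely to be in any box, and when boxes B and C are both empty the host opens either with probability 1/2.

```python
from fractions import Fraction

boxes = ["A", "B", "C"]
prior = {box: Fraction(1, 3) for box in boxes}  # assumed uniform prior

def p_host_opens_b(prize: str) -> Fraction:
    """p(host opens B | prize location), given the contestant chose A.
    Assumption: if both B and C are empty, the host picks one at random."""
    if prize == "A":
        return Fraction(1, 2)
    if prize == "B":
        return Fraction(0)   # the host never reveals the prize
    return Fraction(1)       # prize in C: B is the only empty box left

joint = {box: prior[box] * p_host_opens_b(box) for box in boxes}
p_evidence = sum(joint.values())
posterior = {box: joint[box] / p_evidence for box in boxes}
print(posterior)
```

Under these assumptions the posterior is 1/3 for box A and 2/3 for box C, so switching doubles the chance of winning.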
Updating probabilities based on new evidence or information
Recall:
• p(A|ξ) is the probability of event A based on our subjective assessment of the likelihood of the event using any information we have → prior probability
• If new information E has arrived, then the probability of A is updated using Bayes’ Theorem:
$$p(A \mid E) = \frac{p(E \mid A)\, p(A)}{p(E)}$$
where p(E|A) is called the likelihood function for the evidence E and
$$p(E) = \sum_{j} p(A_j)\, p(E \mid A_j)$$
Example: weather forecast
• Suppose, the prior probability that it will rain tonight (R1) is 0.6 and it will not rain (R2) with probability 0.4
• Suppose we are using information from the weather forecast whose performance is as follows
• If the weather station announces that it will rain tonight, what probability should you assign to the outcome it will indeed rain tonight?
Weather forecast (cont’d)
Try to use Bayes’ theorem!
Another example
• In the city, there are only two taxicab companies, the Blue and the Green. The Blue company operates 90% of all cabs in the city and the Green company operates the rest. One dark evening, a pedestrian is killed by a hit-and-run taxicab.
• There is one witness to the accident. In court, the witness’ ability to distinguish cab colors in the dark is questioned so he is tested under conditions similar to those in which the accident occurred. If he is shown a green cab, he says it is green 80% of the time and blue 20% of the time. If he is shown a blue cab, he says it is blue 80% of the time and green 20% of the time.
• The judge believes that the test accurately represents the witness’ performance at the time of the accident.
• Construct the probability tree representing the judge’s state of information!
• If the witness says “The cab involved in the accident was green,” what probability should the judge assign to the cab involved in the accident being green?
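A sketch of the calculation for the taxicab example, using the company shares as the prior and the witness test results as the likelihoods. The variable names are illustrative.

```python
from fractions import Fraction

p_green = Fraction(1, 10)   # Green operates the remaining 10% of cabs
p_blue = Fraction(9, 10)
p_says_green_given_green = Fraction(8, 10)
p_says_green_given_blue = Fraction(2, 10)

# Marginal probability that the witness says "green".
p_says_green = (p_says_green_given_green * p_green
                + p_says_green_given_blue * p_blue)

# Bayes' theorem: p(cab is green | witness says green).
p_green_given_says_green = p_says_green_given_green * p_green / p_says_green
print(p_green_given_says_green, float(p_green_given_says_green))
```

The result is 4/13, about 0.31. Even with a witness who is right 80% of the time, the cab is still more likely to be blue, because blue cabs are nine times more common.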
Application of Bayes’ Theorem in Diagnosis and Testing
• In diagnosis, we are interested in the probability of an event that can be either “supported” or “ruled out” based on one or more tests.
• Example: a person is suffering from disease X, the transmission system of a vehicle is faulty, etc.
• A diagnostic result can be:
Example
• In trying to determine if a person is suffering from TB, the prevailing risk of contracting TB in a certain community can be used as the pre-test probability
• Four possible situations:
Example (cont’d)
• Given the test results,
Sensitivity: true-positive rate, p(+ve|D)
Specificity: true-negative rate, p(-ve|-D)
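Sensitivity and specificity combine with the pre-test probability through Bayes’ theorem to give the post-test probability. The sketch below is one possible formulation; the function name and the numbers in the usage line are hypothetical and are not taken from the slides.

```python
def post_test_probability(pre_test: float, sensitivity: float,
                          specificity: float, positive: bool = True) -> float:
    """p(D | test result), starting from the pre-test probability p(D)."""
    if positive:
        true_pos = sensitivity * pre_test                # p(+ve, D)
        false_pos = (1 - specificity) * (1 - pre_test)   # p(+ve, -D)
        return true_pos / (true_pos + false_pos)
    false_neg = (1 - sensitivity) * pre_test             # p(-ve, D)
    true_neg = specificity * (1 - pre_test)              # p(-ve, -D)
    return false_neg / (false_neg + true_neg)

# Hypothetical values, for illustration only.
print(post_test_probability(pre_test=0.01, sensitivity=0.95, specificity=0.90))
```

With a low pre-test probability, even a fairly accurate test leaves the post-test probability after a positive result well below one half, since false positives from the large healthy group outnumber true positives.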
Example problem 1
• Okta is preparing for an interview at a well-known company. Initially, Okta estimates the probability of getting the job at 50%. Okta also gathers information from friends who have interviewed at the company and concludes that among the friends who were accepted, 95% felt that their interview went well, while among the friends who were not accepted, 75% also felt that their interview went well.
• If, after the interview, Okta feels that it went well, what is the probability that Okta will be accepted by the company?
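A sketch of the Bayes’ theorem calculation for this problem, taking the 50% estimate as the prior and the friends’ reports as the likelihoods. The variable names are illustrative.

```python
from fractions import Fraction

p_accept = Fraction(50, 100)             # prior p(accepted)
p_good_given_accept = Fraction(95, 100)  # p(feels interview went well | accepted)
p_good_given_reject = Fraction(75, 100)  # p(feels interview went well | rejected)

# Marginal probability of feeling that the interview went well.
p_good = (p_good_given_accept * p_accept
          + p_good_given_reject * (1 - p_accept))

# Bayes' theorem: p(accepted | feels interview went well).
p_accept_given_good = p_good_given_accept * p_accept / p_good
print(p_accept_given_good, float(p_accept_given_good))
```

The posterior is 19/34, about 0.56. The feeling of success is weak evidence because most rejected candidates report it too.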
Example problem 2
• Several countries have introduced regulations on driving under the influence of alcohol. To enforce these rules, the police in such a country can perform a “quick test” on drivers to determine the alcohol level in the driver’s blood. The test has an accuracy of 80%. The quick test result falls into one of two categories: high alcohol content and low alcohol content. If the driver is suspected of having high alcohol content, the driver is referred to a laboratory for a more accurate alcohol-level examination. However, because of the time lag between the two tests, the laboratory result has an error of 5%.
• Let p be the probability that a driver examined by the police has high alcohol content. Then:
• Draw the possibility tree that describes this situation, complete with the probability on every branch!
• What is the probability that a driver who passes the quick test actually has high alcohol content in the blood?
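The last question can be sketched under explicit assumptions, since the problem leaves some details open. Here “accuracy 80%” is read as p(test high | truly high) = p(test low | truly low) = 0.8, and “passes the quick test” is read as the test reporting low alcohol content. The problem wording admits both a joint and a conditional reading, so the function returns both. The value p = 1/10 is hypothetical, as the problem keeps p symbolic.

```python
from fractions import Fraction

# Assumed reading: p(test high | high) = p(test low | low) = 0.8
ACCURACY = Fraction(80, 100)

def pass_but_high(p: Fraction):
    """Return (joint, conditional) for a driver who passes but is truly high.

    joint       = p(truly high and test low)
    conditional = p(truly high | test low)
    """
    joint = p * (1 - ACCURACY)               # high, but test says low
    p_pass = joint + (1 - p) * ACCURACY      # p(test low)
    return joint, joint / p_pass

# p = 1/10 is a hypothetical value for illustration.
joint, conditional = pass_but_high(Fraction(1, 10))
print(joint, conditional)
```

In symbols, under these assumptions the joint probability is 0.2p and the conditional probability is 0.2p / (0.2p + 0.8(1 − p)).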
Thank You
Acknowledgment
Assoc. Prof. Poh Kim Leng - NUS