Probability and Statistics
Monte Carlo Radiation Analysis
Notice: This document is prepared and distributed for educational purposes only.
Properties of Probability
• To an event Ek, we assign a probability pk, also denoted by P(Ek), the “probability of the event Ek”. The quantity pk must satisfy the properties given in Fig. 1:
• 0 ≤ pk ≤ 1
• If Ek is certain to occur, pk = 1.
• If Ek is certain not to occur, pk = 0.
• If events Ej and Ek are mutually exclusive, then P(Ej and Ek) = 0; and P(Ej or Ek) = pj + pk.
• If events Ek (k = 1, 2, …, N) are mutually exclusive and exhaustive (one of the N events is assured to occur), then
∑k pk = 1
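As a quick numerical illustration (a minimal sketch, not part of the original slides), these properties can be checked for a fair six-sided die in Python:

```python
# Sketch: the probability axioms checked for a fair six-sided die.
probs = {k: 1 / 6 for k in range(1, 7)}   # p_k for the events E_1..E_6

# 0 <= p_k <= 1 for every event
assert all(0.0 <= p <= 1.0 for p in probs.values())

# Mutually exclusive events: P(E_1 or E_2) = p_1 + p_2
print(f"P(roll 1 or 2) = {probs[1] + probs[2]:.4f}")

# Mutually exclusive and exhaustive events: the p_k sum to 1
print(f"sum of p_k     = {sum(probs.values()):.4f}")
```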
Elements of Probability Theory
• A random variable is a symbol that represents a quantity that cannot be specified without the use of probability laws.
- It is denoted by a capital letter X, with the lower-case x for a fixed value of the random variable.
- A real value xi is assigned to an event Ei.
- X has a discrete distribution if it can take only a countable (finite or infinite) number of distinct values [xi, i = 1, 2, …], with
∑i pi = 1.
Expectation Value: Discrete Random Variable
• An expectation value for the random variable X is defined by
E(X) ≡ μ = ∑i xi pi (1)
• Define a unique, real-valued function of x, g(x), which is also a random variable. The expected value of g(x) is
E[g(x)] = ∑i g(xi) pi (2)
• For a linear combination of random variables,
E[a g(x) + b h(x)] = a E[g(x)] + b E[h(x)] (3)
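The following sketch (not part of the original slides) evaluates Eqs. (1)-(3) for a fair die, with g(x) = x² and h(x) = x as example functions:

```python
# Sketch: expectation values for a discrete random variable (a fair die).
xs = [1, 2, 3, 4, 5, 6]
ps = [1 / 6] * 6

E_x = sum(x * p for x, p in zip(xs, ps))       # Eq. (1): E(X) = 3.5
E_g = sum(x**2 * p for x, p in zip(xs, ps))    # Eq. (2) with g(x) = x^2

# Eq. (3): linearity of the mean, with a = 2 and b = 3
a, b = 2.0, 3.0
lhs = sum((a * x**2 + b * x) * p for x, p in zip(xs, ps))
rhs = a * E_g + b * E_x
print(E_x, E_g, lhs, rhs)   # lhs equals rhs up to round-off
```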
Variance: Discrete Random Variable
• A variance for the random variable X is defined by
var(X) ≡ σ² = E[(X − μ)²] = ∑i (xi − μ)² pi (4)
• The standard deviation of the random variable X is defined by
σ = [var(X)]^(1/2) (5)
• It is straightforward to show the following:
var(X) = E(X²) − [E(X)]² = E(X²) − μ² (6)
and, likewise, var[g(x)] = E[g(x)²] − {E[g(x)]}².
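A sketch (not from the slides) verifying that the defining form (4) and the shortcut (6) agree for the fair die:

```python
# Sketch: variance and standard deviation of a discrete random variable.
xs = [1, 2, 3, 4, 5, 6]
ps = [1 / 6] * 6

mu = sum(x * p for x, p in zip(xs, ps))                  # E(X)
var_def = sum((x - mu)**2 * p for x, p in zip(xs, ps))   # Eq. (4)
E_x2 = sum(x**2 * p for x, p in zip(xs, ps))
var_alt = E_x2 - mu**2                                   # Eq. (6)
sigma = var_def**0.5                                     # Eq. (5)
print(var_def, var_alt, sigma)   # var_def equals var_alt
```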
Linear/Non-linear Statistic
• The mean of a linear combination of random variables equals the linear combination of the means, because the mean is a linear statistic:
E[a g(x) + b h(x)] = a E[g(x)] + b E[h(x)] (3)
• The variance is not a linear statistic:
var[a g(x) + b h(x)] = a² var[g(x)] + b² var[h(x)] + 2ab cov[g(x), h(x)] (7)
where cov is the covariance, defined in Eq. (12) below.
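A sketch (not part of the slides) checking Eqs. (3) and (7) empirically with correlated normal samples; the correlation makes the naive “linear” variance formula fail until the covariance term is restored:

```python
# Sketch: the mean is linear, the variance is not.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)   # y is correlated with x
a, b = 2.0, 3.0
s = a * x + b * y

print(np.mean(s), a * np.mean(x) + b * np.mean(y))      # Eq. (3): agree
print(np.var(s), a**2 * np.var(x) + b**2 * np.var(y))   # disagree: cov missing
print(np.var(s),                                        # Eq. (7): agree
      a**2 * np.var(x) + b**2 * np.var(y)
      + 2 * a * b * np.cov(x, y, bias=True)[0, 1])
```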
Product of Random Variables
• The average value of the product of two random variables is
E(xy) = ∑i ∑j xi yj pij (8)
• If x and y are independent,
pij = fi gj (9)
• Then
E(xy) = (∑i xi fi)(∑j yj gj) = E(x) E(y) (10)
• Eq. (7) is reduced to Eq. (11) if g(x) and h(x) are independent:
var[a g(x) + b h(x)] = a² var[g(x)] + b² var[h(x)] (11)
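A sketch (not in the slides) checking Eqs. (8)-(10) on a small discrete joint distribution built as pij = fi gj:

```python
# Sketch: E(xy) = E(x) E(y) when x and y are independent.
xs, fs = [0, 1], [0.3, 0.7]            # values and probabilities of x
ys, gs = [1, 2, 3], [0.2, 0.5, 0.3]    # values and probabilities of y

# Eq. (8), with the independent joint probability p_ij = f_i g_j of Eq. (9)
E_xy = sum(x * y * f * g for x, f in zip(xs, fs) for y, g in zip(ys, gs))
E_x = sum(x * f for x, f in zip(xs, fs))
E_y = sum(y * g for y, g in zip(ys, gs))
print(E_xy, E_x * E_y)   # equal, as in Eq. (10)
```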
Covariance and Correlation Coefficient
• The covariance is defined by
cov(x,y) = E{[x − E(x)][y − E(y)]} = E(xy) − E(x)E(y) (12)
- If x and y are independent, cov(x,y) = 0 (independence is a sufficient condition for zero covariance, but not a necessary one).
- It is possible to have cov(x,y) = 0 even if x and y are not independent.
- The covariance can be negative.
• The correlation coefficient ρ is a convenient measure of the degree to which two random variables are correlated (or anti-correlated):
ρ(x,y) = cov(x,y) / (σx σy) (13)
where −1 ≤ ρ(x,y) ≤ 1; ρ = 1 when y = ax + b and ρ = −1 when y = −ax + b (a > 0).
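A sketch (not part of the slides) computing Eqs. (12)-(13) for three cases, including the perfectly correlated and anti-correlated limits:

```python
# Sketch: covariance and correlation coefficient.
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100_000)
cases = [3.0 * x + 1.0,              # y = ax + b, a > 0  -> rho = +1
         -3.0 * x + 1.0,             # y = -ax + b        -> rho = -1
         rng.normal(size=x.size)]    # independent of x   -> rho ~ 0
for y in cases:
    cov = np.mean(x * y) - np.mean(x) * np.mean(y)   # Eq. (12)
    rho = cov / (np.std(x) * np.std(y))              # Eq. (13)
    print(f"cov = {cov:+.4f}, rho = {rho:+.4f}")
```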
Continuous Random Variable
• A random variable has a continuous distribution if X can take any value between the limits x1 and x2: [x1 ≤ X ≤ x2].
Prob(x1 ≤ X ≤ x2) = ∫_{x1}^{x2} p(x) dx
where p(x) is the probability density of X = x and ∫ p(x) dx = 1.
• The probability of getting exactly a specific value is zero, because there are an infinite number of possible values to choose from:
P(X = xi) = 0
Expected Value and Variance for Continuous pdf’s
• For a continuous pdf, f(x),
E(X) ≡ μ = ∫ x f(x) dx
var(X) ≡ σ² = ∫ (x − μ)² f(x) dx
• For a real-valued function g(x),
E[g(x)] = ∫ g(x) f(x) dx
var[g(x)] = ∫ [g(x) − E(g)]² f(x) dx
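A sketch (not from the slides) evaluating these integrals numerically for the exponential pdf f(x) = e^(−x), whose mean and variance are both 1:

```python
# Sketch: E(X), var(X) and E[g(X)] for a continuous pdf by quadrature.
import numpy as np

x = np.linspace(0.0, 50.0, 200_001)   # truncate the infinite upper limit
f = np.exp(-x)                        # exponential pdf, normalized to 1

# np.trapezoid requires NumPy >= 2.0; use np.trapz on older versions
mu = np.trapezoid(x * f, x)                # E(X)    -> 1
var = np.trapezoid((x - mu)**2 * f, x)     # var(X)  -> 1
E_g = np.trapezoid(x**2 * f, x)            # E[g(X)] with g(x) = x^2 -> 2
print(mu, var, E_g)
```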
Probability Density Function
• Let f(x) be a probability density function (pdf).
• f(x)dx = the probability that the random variable X is in the interval (x, x+dx):
f(x)dx = P(x≤X≤x+dx) .
• Since f(x)dx is unitless, f(x) has units of the inverse of the random variable.
Probability Density Function (cont.)
• The probability of finding the random variable somewhere in the finite interval [a, b] is then
P(a ≤ X ≤ b) = ∫_{a}^{b} f(x) dx,
which is the area under the curve f(x) from x = a to x = b.
• Since f(x) is a probability density, it must be positive for all values of the random variable x, or
f(x) ≥ 0, -∞ < x < ∞
• Furthermore, the probability of finding the random variable somewhere on the real axis must be unity:
∫_{−∞}^{+∞} f(x) dx = 1
Cumulative Distribution Function
• The cumulative distribution function (cdf) F(x) gives the probability that the random variable is less than or equal to x:
F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(x′) dx′
• Since F(x) is the indefinite integral of f(x), f(x) = F′(x).
- By contrast, the definite integral of f(x) in [a, b] gives F(b) − F(a) = P(a ≤ X ≤ b).
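As a concrete worked example (not from the original slides), take the exponential distribution with rate λ > 0:
F(x) = 1 − e^(−λx) for x ≥ 0, so f(x) = F′(x) = λ e^(−λx),
and the definite integral over [a, b] gives
P(a ≤ X ≤ b) = F(b) − F(a) = e^(−λa) − e^(−λb).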
Cumulative Distribution Function (cont.)
• Since f(x) ≥ 0 and the integral of f(x) is unity, F(x) obeys the following conditions:
- F(x) is monotonically increasing.
- F(−∞) = 0
- F(+∞) = 1
[Figure: examples of a monotonically increasing function (it keeps increasing or remains constant), a monotonically decreasing function (it keeps decreasing or remains constant), a non-monotonic function, and a strictly monotonic F(x), for which f(x) ≠ 0.]
Expected Values in MC Simulation
• In many cases, the true mean is not known, and the purpose of the Monte Carlo simulation is to estimate the true mean.
• The estimates of the MC simulation are denoted by x̄ and s², hoping that x̄ → μ and s² → σ² as the number of samples grows.
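A minimal sketch of this idea (not part of the slides): the sample mean x̄ of N histories estimates μ, with the sample standard deviation giving an error estimate that shrinks like 1/√N. The exponential distribution and its known mean are assumptions made here only so the convergence can be watched:

```python
# Sketch: Monte Carlo estimation of a true mean from random samples.
import numpy as np

rng = np.random.default_rng(3)
mu_true = 1.0   # known here only so we can watch the estimate converge
for N in (100, 10_000, 1_000_000):
    x = rng.exponential(scale=mu_true, size=N)
    x_bar = x.mean()        # MC estimate of the true mean mu
    s = x.std(ddof=1)       # sample standard deviation
    print(f"N={N:>9,}  x_bar={x_bar:.4f} +/- {s / np.sqrt(N):.4f}")
```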
Relationship of Discrete and Continuous pdf’s
• For a discrete pdf of random variable x,
E(x) = ∑i xi pi
• Take the limit N → ∞; the sum then passes over to an integral of the continuous pdf:
E(x) = ∫ x f(x) dx
Composite Event, Joint Probability
• Consider an experiment that consists of two parts and each part leads to the occurrence of specified events.
• Define events arising from the first part of the experiment by Fi with the probability fi, and events arising from the second part by Gj with the probability gj.
• A composite event = the combination of events Fi and Gj, denoted by the ordered pair Eij = (Fi, Gj).
• joint probability pij = P(Eij)
≡ the probability that the first part of the experiment led to event Fi and the second part led to the event Gj.
= the probability that the composite event Eij occurred.
Joint Probability
• The joint probability pij can be factored into the product of a marginal probability and a conditional probability:
pij = p(Fi)∙p(Gj|Fi) (1)
where p(Fi) = the marginal probability, or the probability that event Fi occurs regardless of event Gj; and p(Gj|Fi) = the conditional probability, or the probability that event Gj occurs given that the event Fi occurs.
• The marginal probability for event Fi to occur is simply the probability that the event Fi occurs, or p(Fi) = fi.
Marginal Probability
[Figure: matrix of the joint probabilities pij, whose row and column sums give the marginal probabilities.]
How do we get the conditional probability from the matrix data?
Conditional Probability
• Assume that there are J mutually exclusive and exhaustive events Gj, j = 1, 2, …, J. The following identity is evident:
∑j pij = p(Fi) (2)
• Using Eq. (2), we can obtain Eq. (3):
pij = p(Fi)∙[pij / p(Fi)] (3)
Conditional Probability (cont.)
• By comparing Eq.’s (1) and (3), we obtain the expression for the conditional probability p(Gj|Fi):
p(Gj|Fi) = pij / p(Fi) (4)
• All joint, marginal and conditional probabilities satisfy the properties in Fig. 1.
• If events Fi and Gj are independent, that is, the probability of one occurring does not affect the probability of the other occurring, then
pij ≡ P(Fi) × P(Gj) = fi × gj (5)
and thus p(Gj|Fi) = gj (6) from Eq. (4).
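The sketch below (not part of the slides) carries out Eqs. (2) and (4) on a small matrix of joint probabilities, answering the question posed above about extracting conditional probabilities from matrix data:

```python
# Sketch: marginal and conditional probabilities from a joint-probability matrix.
import numpy as np

p = np.array([[0.10, 0.20, 0.10],    # p_ij = P(F_i and G_j); rows are F_i,
              [0.05, 0.25, 0.30]])   # columns are G_j; all entries sum to 1

p_F = p.sum(axis=1)        # Eq. (2): marginal p(F_i) = sum_j p_ij
cond = p / p_F[:, None]    # Eq. (4): p(G_j|F_i) = p_ij / p(F_i)

print(p_F)                 # marginal probabilities of F_1, F_2
print(cond)                # conditional probabilities p(G_j|F_i)
print(cond.sum(axis=1))    # each row sums to 1, as required by Fig. 1
```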
Bivariate Probability Distribution
(continuous variables)
• The probability that the first random variable falls within (x, x+dx) and the second falls within (y, y+dy) defines the bivariate pdf, f(x,y):
f(x,y) dx dy = Prob(x ≤ x′ ≤ x+dx, y ≤ y′ ≤ y+dy)
• Marginal probability density function:
m(x) = ∫ f(x,y) dy
m(x)dx = probability that x′ is in dx about x, irrespective of y′.
Bivariate Probability Distribution (cont.)
• Conditional probability density function:
c(y|x) = f(x,y) / m(x), with m(x) ≠ 0
c(y|x)dy = probability that y′ is in dy about y, given that x′ = x.
• The constraint m(x) ≠ 0 simply means that x′ and y′ are not mutually exclusive: there is some probability that both occur together.
• If x and y are independent random variables, then m(x) = f(x) and c(y|x) = g(y).
Consequently, f(x,y) = f(x)g(y)
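A sketch (not from the slides) evaluating the marginal and conditional densities on a grid, using the bivariate pdf f(x,y) = x + y on the unit square as an assumed example (it integrates to 1 there):

```python
# Sketch: marginal m(x) and conditional c(y|x) for a bivariate pdf.
import numpy as np

x = np.linspace(0.0, 1.0, 1001)
y = np.linspace(0.0, 1.0, 1001)
X, Y = np.meshgrid(x, y, indexing="ij")
f = X + Y                          # bivariate pdf on the unit square

# np.trapezoid requires NumPy >= 2.0; use np.trapz on older versions
m = np.trapezoid(f, y, axis=1)     # m(x) = int f(x,y) dy = x + 1/2
c = f / m[:, None]                 # c(y|x) = f(x,y) / m(x)

i = 500                            # grid index of x = 0.5
print(m[i])                        # m(0.5) = 1.0
print(np.trapezoid(c[i], y))       # conditional pdf integrates to 1
```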
Bivariate Cumulative Distribution Function
• Cumulative distribution function (cdf) is defined as
F(x,y) = Prob(x′ ≤ x, y′ ≤ y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f(u,v) dv du
• The probability that the random doublet (x′,y′) falls within a finite region R of the xy-plane is
Prob[(x′,y′) ∈ R] = ∫∫_R f(x,y) dx dy