3.3 Economic Risk Assessment

3.3.2 Probability Distribution of Variables

Most civil engineering problems are treated as quantitative measures using the familiar deterministic formulations. However, nothing in real life is predetermined. For example, suppose the drawings specify a reinforced concrete column with a section of 500 mm x 500 mm.

When you measure the built column, its dimensions will be close to this number but rarely exactly equal to it. Some deviation is possible, and the code allows for it within a tolerance.

The theory of random variables enables the analyst to furnish useful substitutes for less precise qualitative characteristics.

Variables whose specific values cannot be predicted with certainty before an experiment can be represented by probabilistic models and distributions.

3.3.2.1 Normal Distribution

The normal distribution assumes only that any value within some range of values is possible, and it is used to represent many phenomena. In decision-making it is used to model expectations about the inflation rate or the future price of oil. It is widely used for metering equipment, where it represents the distribution of measurement errors. In reservoir studies it is used to describe soil permeability, the spaces between the grains, and saturation, as well as some economic data.

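Equation:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-0.5\,(x-\mu)^2/\sigma^2} \qquad (3.12)$$

where σ is the standard deviation, and μ is the arithmetic mean.

Mean:

$$\bar{x} = \frac{\sum x_i}{n} = \frac{\sum n_c\, c_m}{n} \qquad (3.13)$$

where:

$\bar{x}$ = arithmetic mean of sample data

$x_i$ = each individual value in sample

$n$ = number of values in sample

$c_m$ = class mark

$n_c$ = number of values in class

Standard deviation is given by

$$\sigma_x = \left[\frac{1}{n}\sum x_i^2 - \bar{x}^2\right]^{0.5} \qquad (3.14)$$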
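As a minimal sketch of Equations (3.12) to (3.14), the following computes the mean and standard deviation of a small sample and evaluates the normal density. The cube strengths are hypothetical values chosen for illustration, and the standard deviation uses the population form (division by n), which is an assumption about the intended formula.

```python
import math

# Hypothetical cube crushing strengths in N/mm^2 (illustrative values only).
strengths = [30.0, 32.0, 34.0, 36.0, 38.0]

n = len(strengths)
mean = sum(strengths) / n  # arithmetic mean, Equation (3.13)

# Population standard deviation: square root of
# (mean of the squares minus the square of the mean).
std = math.sqrt(sum(x * x for x in strengths) / n - mean ** 2)


def normal_pdf(x, mu, sigma):
    """Normal probability density, Equation (3.12)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


print(f"mean = {mean:.2f}, std = {std:.3f}")
print(f"density at the mean = {normal_pdf(mean, mean, std):.4f}")
```

The density is highest at the mean and falls off symmetrically on both sides, which is the property the characteristics listed later rely on.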

The normal distribution is the most commonly used probability distribution because it yields information about a set of measurements without requiring any knowledge of how the phenomenon of interest arose or whether some values are more likely than others. Its sole assumption is that any possible value within some range may be assigned some non-zero probability. Unlike many of the other distributions discussed below, a normally distributed variable is always indifferent to the passage of time. Measurements of the output of one and the same process are therefore ideal candidates for the application of this tool. For example, the normal distribution has been found to be the best probability curve for representing concrete strength from laboratory tests performed on concrete in most countries of the world. (As soon as there is reason to believe that the assumption that all outcomes are possible does not apply, other distributions should be considered, as will be seen below.)

The most significant characteristics of the Normal distribution for present purposes are:

• the distribution is symmetric around the average; more precisely, the arithmetic mean divides the curve into two equal halves

• the arithmetic mean, the median, and the mode (the value that is most likely to occur) coincide

• the area under the curve equals 1, and the random variable (here, the strength results of the concrete cubes) can take values from -∞ to +∞, so the curve represents all the potential possibilities of the concrete's strength.

As a result, each curve depends on the values of the arithmetic mean and the standard deviation, and any change in either parameter changes the shape of the probability distribution. Therefore, the standard normal distribution is used to determine areas under a curve. Given the arithmetic mean and the standard deviation, the variable x is converted to another variable, z, which is obtained from the following equation:

$$z = \frac{x - \mu}{\sigma} \qquad (3.15)$$

Table 3.11 gives the area under the curve between the mean and a given value of z from the above equation. The first column gives z to one decimal place, and the first row gives the second decimal digit. From the table, the area under the curve between the mean and z = 1.64 is 0.4495. The area in the tail beyond z = 1.64 is therefore 0.5 - 0.4495, which is equal to about 0.0505.

In other words, the probability that the variable lies more than 1.64 standard deviations away from the mean on one side is about 5%, as shown in Figure 3.7.
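The table lookup can be cross-checked numerically. The sketch below uses the error function from Python's standard library to compute the area between the mean and z; the function name `area_mean_to_z` is a hypothetical helper, not something defined in the text.

```python
import math


def area_mean_to_z(z):
    """Area under the standard normal curve between the mean (z = 0) and z."""
    return 0.5 * math.erf(z / math.sqrt(2.0))


z = 1.64
area = area_mean_to_z(z)   # corresponds to the tabulated value for z = 1.64
tail = 0.5 - area          # area in the tail beyond z

print(round(area, 4), round(tail, 4))  # 0.4495 0.0505
```

The same function gives 0.3413 for z = 1.0 and 0.4772 for z = 2.0, the areas within one and two standard deviations on one side of the mean.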


Figure 3.7 Normal distribution curve.

Figure 3.8 The division of areas under the normal distribution.

Table 3.11 The area under the curve of normal distribution

z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0   0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
0.1   0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
0.2   0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
0.3   0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
0.4   0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
0.5   0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
0.6   0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
0.7   0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
0.8   0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
0.9   0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
1.0   0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
1.1   0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
1.2   0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
1.3   0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
1.4   0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
1.5   0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
1.6   0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
1.7   0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633


Figure 3.8 shows the areas under the curve within one or more standard deviations on either side of the arithmetic mean.

The area under the curve from the arithmetic mean to one standard deviation away from it is equal to 34.13%, while the area from the arithmetic mean to two standard deviations away from it is equal to 47.72%.

3.3.2.2 Log Normal Distribution

This distribution is used when a phenomenon cannot take a negative value. It is therefore used to represent the size of an aquifer or reservoir properties such as the permeability of the soil. It also represents real estate prices, among other quantities.

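Equation:

$$f(x) = \frac{x^{-1}}{\sigma\sqrt{2\pi}}\, e^{-(\ln x - \mu)^2 / 2\sigma^2} \qquad (3.16)$$

where μ and σ are the mean and standard deviation of ln x.

Figure 3.9 Lognormal distribution.

Mean:

$$\ln \bar{x}_G = \frac{\sum \ln x}{n} \qquad (3.17)$$

Standard Deviation:

$$\ln \sigma = \left[\frac{1}{n}\sum (\ln x)^2 - (\ln \bar{x}_G)^2\right]^{0.5} \qquad (3.18)$$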
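A short sketch of how the lognormal parameters might be estimated from data: take the natural logarithm of each reading and apply the mean and standard deviation formulas to the logs. The permeability readings are hypothetical, and the population form (division by n) is assumed for the standard deviation.

```python
import math

# Hypothetical permeability readings (illustrative values only).
samples = [10.0, 100.0, 1000.0]

logs = [math.log(x) for x in samples]
n = len(logs)

mean_log = sum(logs) / n  # mean of ln x, Equation (3.17)
std_log = math.sqrt(sum(v * v for v in logs) / n - mean_log ** 2)  # Equation (3.18)

# Exponentiating the mean of the logs gives the geometric mean of the data.
geometric_mean = math.exp(mean_log)


def lognormal_pdf(x, mu, sigma):
    """Lognormal density for x > 0, with mu and sigma taken on the log scale."""
    return math.exp(-((math.log(x) - mu) ** 2) / (2.0 * sigma ** 2)) / (
        x * sigma * math.sqrt(2.0 * math.pi)
    )


print(f"geometric mean = {geometric_mean:.1f}, std of ln x = {std_log:.3f}")
```

Because the density is defined only for positive x, the model cannot assign probability to negative values, which is the reason given above for choosing it.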

3.3.2.3 Binomial Distribution

This distribution is used in the following situations:

• determination of geological hazards;

• calculation of machine performance in relation to its cost and the cost of spare parts;

• determination of the appropriate number of pumps with the appropriate pipeline size with the required fluid capacity and the number of additional machines;

• determination of the number of generators according to the requirement of the project and to determine the number of additional generators in the case of an emergency or malfunction in any machine.

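To understand the nature of this distribution, consider the following:

Equation:

$$f(x) = \frac{n!}{x!\,(n-x)!}\, f^{x}\,(1-f)^{n-x} \qquad (3.19)$$

where n is the number of trials, x is the number of successes, and f is the probability of success in a single trial.

Mean:

$$\bar{x} = n f \qquad (3.20)$$

Standard Deviation:

$$\sigma_x = \left[\,n f (1-f)\,\right]^{0.5} \qquad (3.21)$$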

Example 1:

When tossing a coin, the probability of heads is P = 0.50. What is the probability of getting heads exactly twice when the coin is tossed 8 times?

$$f(2) = \frac{8!}{2!\,6!}\,(0.5)^{2}\,(0.5)^{8-2} = 0.109$$

This means that when a coin is tossed 8 times, the probability that heads will appear exactly twice is 0.109.

Example 2:

Assuming that a single drilled well has a probability of 0.7 of finding oil, what is the probability that we find oil in exactly 25 wells when we drill 30 wells?

$$f(25) = \frac{30!}{25!\,(30-25)!}\,(0.7)^{25}\,(1-0.7)^{30-25} = 0.0464$$

Therefore, although the likelihood of success of an individual well is 0.7, the probability that exactly 25 of the 30 wells drilled are successful is 0.0464.
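The calculation in Example 2 can be sketched with the standard library. The function name `binomial_pmf` is a hypothetical helper; it implements Equation (3.19) directly, and `math.comb` requires Python 3.8 or later.

```python
import math


def binomial_pmf(x, n, f):
    """Probability of exactly x successes in n trials, Equation (3.19)."""
    return math.comb(n, x) * f ** x * (1.0 - f) ** (n - x)


# Example 2: 30 wells, each with a 0.7 chance of finding oil.
p_25_of_30 = binomial_pmf(25, 30, 0.7)

mean_successes = 30 * 0.7                   # Equation (3.20)
std_successes = math.sqrt(30 * 0.7 * 0.3)   # Equation (3.21)

print(round(p_25_of_30, 4))  # 0.0464
print(mean_successes, round(std_successes, 2))
```

The expected number of successful wells is 21, so a result of 25 lies well above the mean, which is consistent with its low probability.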

Example 3:

Assess the reliability of a system that requires 10,000 kW to meet system demand. Each generator is rated 95% reliable (5% failure rate).

The company is comparing three alternatives: two 5,000 kW generators (2-5000), three 5,000 kW generators (3-5000), and three 4,000 kW generators (3-4000).
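The average reliability figures listed for these alternatives in Table 3.12 can be reproduced with a short binomial calculation. This is a sketch under stated assumptions: generator outages are treated as independent, capacity above the 10,000 kW demand earns no extra credit, and average reliability is taken as the expected fraction of demand that is met. The function name `average_reliability` is hypothetical.

```python
import math


def average_reliability(units, unit_kw, demand_kw=10_000, availability=0.95):
    """Expected fraction of demand met, assuming independent generator outages."""
    expected = 0.0
    for working in range(units + 1):
        # Binomial probability that exactly `working` generators are available.
        prob = (
            math.comb(units, working)
            * availability ** working
            * (1.0 - availability) ** (units - working)
        )
        # Supply beyond the demand is not credited.
        supplied = min(working * unit_kw, demand_kw)
        expected += prob * supplied / demand_kw
    return expected


for units, unit_kw in [(2, 5000), (3, 5000), (3, 4000)]:
    print(f"{units}-{unit_kw}: {average_reliability(units, unit_kw):.4f}")
```

Under these assumptions the three alternatives evaluate to 0.9500, 0.9963, and 0.9685, so the spare unit in the 3-5000 alternative gives the highest average reliability.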

Table 3.12 Alternatives for Example 3

Alternative   Available capacity (kW)   Probability
2-5000        10,000                    0.9025
              5,000                     0.0950
              0                         0.0025
              Total                     1.000
              Avg. reliability          0.9500
3-5000        10,000                    0.9928
              5,000                     0.0071
              0                         0.0001
              Total                     1.000
              Avg. reliability          0.9963
3-4000        12,000                    0.8574
              8,000                     0.1354
              4,000                     0.0071
              0                         0.0001
              Total                     1.000
              Avg. reliability          0.9685

Figure 3.10 Binomial distribution.

When we compare the normal and lognormal distributions with the binomial distribution and look at the shape of each of the three curves, we find that the normal and lognormal curves are continuous, unlike the binomial distribution shown in Figure 3.10. Therefore, the curve of the normal distribution is called a probability density function (PDF). PDF curves are used to describe natural phenomena or material properties that can take any value. For example, if you measure the heights of the people in the building that you are in, you may find that the lowest is 120.5 cm and the highest is 180.4 cm, and a person's height can be any number between those two. In the last example, however, the number of successful wells can only be a whole number. If we calculate the probability of success for 20 wells, we cannot speak of the probability for 20.5 wells, because you cannot drill half a well.

Therefore, in this case the probability distribution will be called the Probability Mass Function (PMF). This is very important when choosing the suitable distribution, which should match the natural phenomena for these variables. When defining the probability distributions for steel strength, oil price, or population, one should use the probability density function (PDF).

3.3.2.4 Poisson Distribution

This distribution describes the number of times an event occurs within a specific time period or unit, such as the number of times the phone rings per minute or the number of errors per page of a document. It is used in transport studies, in deciding upon the number of fuel stations needed to fuel cars, and in the design study for telephone lines.

Figure 3.11 Poisson distribution.

Mean:

$$m = \lambda \qquad (3.22)$$

Standard Deviation:

$$\sigma = \sqrt{\lambda} \qquad (3.23)$$
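The parameters of this distribution can be checked numerically. The sketch below uses the standard Poisson probability mass function, P(k) = λ^k e^(-λ) / k!, which the text does not state explicitly, together with a hypothetical rate of 3 events per period. It confirms that the mean equals λ and that the variance also equals λ, so the standard deviation is the square root of λ.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events in a period with average rate lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


lam = 3.0  # hypothetical average number of events per period

# Terms beyond k = 60 are negligible for this rate.
ks = range(60)
total = sum(poisson_pmf(k, lam) for k in ks)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
variance = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)

print(round(total, 6), round(mean, 6), round(math.sqrt(variance), 4))
```

Only whole numbers of events receive probability, so the function is evaluated at integer k only.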

It is a probability mass function, as shown in Figure 3.11.

Exponential Distribution

This distribution represents the time period between occurrences of unanticipated events. For example, the time between electronic failures in equipment follows this distribution. It is the counterpart of the Poisson distribution: the Poisson distribution counts the events within a period, while the exponential distribution describes the time between them. It can be used to describe the time periods to be expected between machine failures, and there are now extensive studies that use this model to determine the appropriate interval for equipment maintenance, based on the mean time between failures (MTBF).

Figure 3.12 The exponential distribution.

Probability Density Function:

$$f_T(t) = \lambda e^{-\lambda t} \qquad (3.24)$$

Mean:

$$m_T = 1/\lambda \qquad (3.25)$$

Standard Deviation:

$$\sigma = 1/\lambda \qquad (3.26)$$
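As an illustrative sketch, the density in Equation (3.24) can be integrated from 0 to t to give the probability that a failure occurs within time t, namely 1 - e^(-λt). The MTBF value below is hypothetical, and the function names are assumptions made for this example.

```python
import math

mtbf_hours = 2000.0       # hypothetical mean time between failures
lam = 1.0 / mtbf_hours    # failure rate, from Equation (3.25)


def exponential_pdf(t, lam):
    """Exponential density, Equation (3.24)."""
    return lam * math.exp(-lam * t)


def prob_failure_within(t, lam):
    """Probability that a failure occurs within time t (integral of the density)."""
    return 1.0 - math.exp(-lam * t)


print(round(prob_failure_within(500.0, lam), 4))       # 0.2212
print(round(prob_failure_within(mtbf_hours, lam), 4))  # 0.6321
```

Under this model, about 63% of failures occur before the MTBF is reached, which is one reason maintenance intervals are usually set well below the MTBF.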

3.3.2.5 Weibull Distribution (Rayleigh Distribution)

The Weibull distribution is used to describe phenomena such as wind speed. For example, it is used in the production of machine parts when planning quality control criteria for the stress that wind exposure creates in metals. This distribution is complicated and is therefore not recommended when building a very large model of an entire problem using Monte Carlo simulation.

Figure 3.13 Weibull distribution.

3.3.2.6 Gamma Distribution

This distribution is used to represent a large number of events and transactions, for example in inventory control, in economic theories, and in the theory of insurance risk. It is also used in environmental studies of pollution concentration, in studies of crude oil and gas condensate, and in studies of the treatment of oil in an aquifer.

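Equation:

$$f(x) = \frac{x^{\alpha-1}\, e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)} \qquad (3.27)$$

Figure 3.14 Gamma distribution.

Mean:

$$\bar{x} = \alpha\beta \qquad (3.28)$$

Standard Deviation:

$$\sigma_x = \sqrt{\alpha\beta^{2}} \qquad (3.29)$$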

The different shapes of gamma distribution are presented in Figure (3.14).
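A numerical check of the gamma parameters is sketched below. It integrates the density with a simple midpoint rule and compares the result with a mean of αβ and a standard deviation of β√α. The values of α and β are hypothetical, and the integration limits and step size are assumptions chosen so that the neglected tail is negligible.

```python
import math


def gamma_pdf(x, alpha, beta):
    """Gamma density with shape alpha and scale beta, Equation (3.27)."""
    return x ** (alpha - 1.0) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))


alpha, beta = 2.0, 3.0  # hypothetical shape and scale

# Midpoint rule on (0, 150); the density beyond 150 is negligible here.
step = 0.01
grid = [(i + 0.5) * step for i in range(15000)]

area = sum(gamma_pdf(x, alpha, beta) for x in grid) * step
mean = sum(x * gamma_pdf(x, alpha, beta) for x in grid) * step
variance = sum((x - mean) ** 2 * gamma_pdf(x, alpha, beta) for x in grid) * step

print(round(area, 3), round(mean, 3), round(math.sqrt(variance), 3))
```

Changing α alters the shape of the curve, while β only stretches it along the x axis, which accounts for the family of shapes in the figure.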

3.3.2.7 Logistic Distribution

This distribution is used frequently to describe the population growth rate in a given period of time. It may also be used to represent interactions between chemicals.

$$f(x; \mu, s) = \frac{e^{-(x-\mu)/s}}{s\left(1 + e^{-(x-\mu)/s}\right)^{2}} \qquad (3.30)$$

Figure 3.15 Logistic distribution.

3.3.2.8 Extreme Value (Gumbel Distribution)

This distribution is used to describe the maximum value that an event reaches within a period of time. It is therefore used for floods, earthquakes, and rainfall, and it is also used to calculate the loads on aircraft and to study the fracture resistance of some materials.

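$$f(z) = a \exp\left[a(z-u) - e^{a(z-u)}\right] \qquad (3.31)$$

$$F(z) = 1 - \exp\left(-e^{a(z-u)}\right) \qquad (3.32)$$

where $-\infty \le z \le \infty$.

$$m = u - \frac{0.577}{a} \qquad (3.33)$$

$$\sigma = \frac{\pi}{a\sqrt{6}} \qquad (3.34)$$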
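The mean and standard deviation in Equations (3.33) and (3.34) can be checked against the density in Equation (3.31) by numerical integration. This is a sketch only: the parameter values a and u are hypothetical, and the integration range and step are assumptions chosen so that the neglected tails are negligible.

```python
import math


def gumbel_pdf(z, a, u):
    """Extreme value density, Equation (3.31)."""
    return a * math.exp(a * (z - u) - math.exp(a * (z - u)))


a, u = 0.5, 10.0  # hypothetical scale and location parameters

# Midpoint rule on (-60, 30); the density outside this range is negligible here.
step = 0.01
grid = [-60.0 + (i + 0.5) * step for i in range(9000)]

area = sum(gumbel_pdf(z, a, u) for z in grid) * step
mean = sum(z * gumbel_pdf(z, a, u) for z in grid) * step
variance = sum((z - mean) ** 2 * gumbel_pdf(z, a, u) for z in grid) * step

print(round(mean, 3), round(u - 0.5772 / a, 3))
print(round(math.sqrt(variance), 3), round(math.pi / (a * math.sqrt(6.0)), 3))
```

The two numbers on each printed line should agree closely, the first from integration and the second from the closed-form expression.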

3.3.2.9 Pareto Distribution

This distribution is commonly used in the analysis of various aspects of per capita income, changes in stock prices, variations in the population of a city, the patterns of staffing numbers in a company, and even error rates in data communications circuits. It also represents certain kinds of changes in the distribution of natural resources.

Figure 3.16 Pareto distribution (horizontal axis: x / sigma).

It uses the Pareto principle, based on the idea that by doing 20% of the work, 80% of the advantage of doing the entire job can be generated. Or, in terms of quality improvement, a large majority of problems (80%) are produced by a few key causes (20%).

From the document Construction Management for Industrial Projects (pages 77-92)