SEOUL NATIONAL UNIVERSITY
School of Mechanical & Aerospace Engineering
446.358
Engineering Probability
9 Properties of Expectation
Recall

E[X] = ∑_x x p(x) : discrete random variable

E[X] = ∫_{−∞}^{∞} x f(x) dx : continuous random variable
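As a quick numerical sketch (not part of the original notes), both definitions can be evaluated directly; the fair die and the Uniform(0, 2) density below are arbitrary illustrative choices.

```python
import numpy as np

# Discrete case: a fair six-sided die, p(x) = 1/6 for x = 1, ..., 6
values = np.arange(1, 7)
pmf = np.full(6, 1 / 6)
e_discrete = np.sum(values * pmf)        # sum_x x p(x) = 3.5

# Continuous case: X ~ Uniform(0, 2), f(x) = 1/2 on [0, 2], 0 elsewhere
xs = np.linspace(0.0, 2.0, 100_001)
fx = np.full_like(xs, 0.5)
e_continuous = np.trapz(xs * fx, xs)     # numerical integral of x f(x) dx ≈ 1.0

print(e_discrete, e_continuous)
```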
• If P{a ≤ X ≤ b} = 1, then a ≤ E[X] ≤ b.

Proof. (Discrete random variable)
E[X] = ∑_{x: p(x)>0} x p(x) ≥ ∑_{x: p(x)>0} a p(x) = a ∑_{x: p(x)>0} p(x) = a

Similarly, E[X] ≤ b.

Proposition
•If X & Y have a joint probability mass function p(x, y), then E[g(X, Y)] = X
y
X
x
g(x, y) p(x, y)
•If X & Y have a joint probability density function f(x, y), then E[g(X, Y)] =
Z ∞
−∞
Z ∞
−∞
g(x, y) f(x, y)dxdy
• Whenever E[X] & E[Y] are finite,
E[X + Y] = E[X] + E[Y]

• For random variables X & Y such that X ≥ Y,
E[X − Y] ≥ 0 ⇒ E[X] ≥ E[Y]

• If E[X_i] is finite for all i = 1, 2, ..., n, then
E[X_1 + ... + X_n] = E[X_1] + ... + E[X_n].
Example.
X_1, ..., X_n : i.i.d. random variables having distribution function F and expected value µ.
Such a sequence of random variables is said to constitute a sample from the distribution F.

X̄ := ∑_{i=1}^{n} X_i / n : sample mean

Then
E[X̄] = E[ ∑_{i=1}^{n} X_i / n ] = (1/n) E[ ∑ X_i ] = (1/n) ∑ E[X_i] = µ

∴ The expected value of the sample mean = the mean of the distribution.
When µ is unknown, the sample mean is often used in statistics to estimate it.
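A small simulation sketch (the exponential distribution and sample size are illustrative choices): averaging the sample means of many independent samples from a distribution with µ = 2 recovers µ, as E[X̄] = µ predicts.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, n, trials = 2.0, 10, 100_000

# Each row is one sample X_1, ..., X_n from an exponential distribution with mean mu
samples = rng.exponential(scale=mu, size=(trials, n))
sample_means = samples.mean(axis=1)      # X̄ for each trial

# The average of the sample means should be close to mu
print(sample_means.mean())               # ≈ 2.0
```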
Example.
A_1, ..., A_n : some events.

Indicator variables: X_i = 1 if A_i occurs, 0 otherwise.

X := ∑_{i=1}^{n} X_i : the number of the events A_i that occur

Y := 1 if X ≥ 1, 0 otherwise.

⇒ X ≥ Y ⇒ E[X] ≥ E[Y], where
E[X] = ∑_{i=1}^{n} E[X_i] = ∑_{i=1}^{n} P(A_i)
E[Y] = P{at least one of the A_i occurs} = P( ∪_{i=1}^{n} A_i )

∴ P( ∪_{i=1}^{n} A_i ) ≤ ∑_{i=1}^{n} P(A_i) : Boole's inequality
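A simulation sketch of Boole's inequality; the events A_i (standard normals with a shared common shock exceeding a threshold) are an arbitrary illustrative choice, deliberately dependent so the inequality is not trivially an equality.

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 5, 200_000

# Events A_i = {Z_i > 1.5}, where the Z_i are standard normals sharing a common shock
common = rng.normal(size=(trials, 1))
z = 0.6 * common + 0.8 * rng.normal(size=(trials, n))
indicators = z > 1.5                          # X_i = 1 if A_i occurs

p_union = np.mean(indicators.any(axis=1))     # P(at least one A_i occurs)
sum_p = indicators.mean(axis=0).sum()         # sum over i of P(A_i)

print(p_union, sum_p, p_union <= sum_p)       # Boole: P(union of A_i) <= sum of P(A_i)
```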
Example. [A random walk in the plane]
Starting from a given point in the plane, take a sequence of steps of fixed length, each in a completely random direction, with the angle uniformly distributed over (0, 2π).
What is the expected square of the distance from the starting point after n steps?

Solution.
Let (X_i, Y_i) denote the change in position at the i-th step. Then
X_i = cos θ_i
Y_i = sin θ_i
θ_i, i = 1, ..., n : independent uniform (0, 2π) random variables.

After n steps, the position = ( ∑_{i=1}^{n} X_i, ∑_{i=1}^{n} Y_i )

We are interested in
D² = ( ∑ X_i )² + ( ∑ Y_i )²
   = ∑ (X_i² + Y_i²) + ∑∑_{i≠j} (X_i X_j + Y_i Y_j)
   = n + ∑∑_{i≠j} (cos θ_i cos θ_j + sin θ_i sin θ_j)

θ_i & θ_j (i ≠ j) are independent, and
E[cos θ_i] = (1/2π) ∫_0^{2π} cos u du = 0
E[sin θ_i] = (1/2π) ∫_0^{2π} sin u du = 0

∴ E[D²] = n.
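A Monte Carlo sketch of the result, assuming unit step length as in the derivation: the average of D² over many simulated walks should be close to n.

```python
import numpy as np

rng = np.random.default_rng(2)
n, walks = 20, 100_000

# n independent angles per walk, uniform on (0, 2*pi)
theta = rng.uniform(0.0, 2.0 * np.pi, size=(walks, n))

# Position after n unit-length steps, and squared distance from the start
x = np.cos(theta).sum(axis=1)
y = np.sin(theta).sum(axis=1)
d2 = x**2 + y**2

print(d2.mean())   # ≈ n = 20
```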
Example. [Analyzing the quick-sort algorithm]
We are given a set of n distinct values x_1, ..., x_n, and we want to sort them in increasing order.

Quick-Sort Algorithm
• n = 2 ⇒ compare the two values & put them in the appropriate order.
• n > 2 ⇒ one of the elements is chosen at random (say, x_i), then all the other values are compared with x_i:

{ values smaller than x_i }   x_i   { values larger than x_i }

This step is repeated on each of the brackets until all the values have been sorted.
Example.
5, 9, 3, 10, 11, 14, 8, 4, 17, 6. Choose one of the values at random.
Suppose that 10 is chosen:
    {5 9 3 8 4 6}  10  {11 14 17}
Say 6 is chosen from the left bracket and 11 from the right bracket:
    {5 3 4}  6  {9 8}  10  11  {14 17}
Continuing (say 4 is chosen next, and so on):
    {3} 4 {5}  6  8  9  10  11  14  17
X : the number of comparisons it takes the algorithm to sort n distinct numbers.
E[X] : a measure of the effectiveness of this algorithm.
E[X] = ?

Let 1 stand for the smallest value to be sorted, 2 for the next smallest, and so on.
For 1 ≤ i < j ≤ n, let
I(i, j) = 1 if i & j are ever directly compared, 0 otherwise.

Then,
X = ∑_{i=1}^{n−1} ∑_{j=i+1}^{n} I(i, j)

E[X] = E[ ∑_{i=1}^{n−1} ∑_{j=i+1}^{n} I(i, j) ]
     = ∑_{i=1}^{n−1} ∑_{j=i+1}^{n} E[I(i, j)]
     = ∑∑ P{i & j are ever compared}
Initially, i, i+1, ..., j−1, j will be in the same bracket, and they will remain in the same bracket if the number chosen for the first comparison is not between i & j.
If one of i+1, ..., j−1 is chosen for the comparison, i will go into the left bracket and j into the right bracket → i & j will never be compared.
If i or j is chosen, then there will be a direct comparison between i & j.
The probability that i or j is chosen among i, i+1, ..., j−1, j is 2/(j−i+1), so
P{i & j are ever compared} = 2/(j−i+1)

E[X] = ∑_{i=1}^{n−1} ∑_{j=i+1}^{n} 2/(j−i+1)

When n is large,
∑_{j=i+1}^{n} 2/(j−i+1) ≈ ∫_{i+1}^{n} 2/(x−i+1) dx
= 2 log(x−i+1) |_{i+1}^{n}
= 2 log(n−i+1) − 2 log 2
≈ 2 log(n−i+1)

∴ E[X] ≈ 2 ∑_{i=1}^{n−1} log(n−i+1)
       ≈ 2 ∫_{1}^{n−1} log(n−x+1) dx
       = 2 ∫_{2}^{n} log(y) dy
       = 2(y log y − y) |_{2}^{n}
       ≈ 2n log n.
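A simulation sketch of this analysis (the recursive partition and comparison counter below are illustrative, not the notes' own code): the random-pivot sort makes one comparison per non-pivot element at each level, and its average count matches the double sum ∑∑ 2/(j−i+1), with 2n log n as the large-n approximation.

```python
import random
import math

def quicksort_comparisons(values):
    """Sort distinct `values` with a random pivot; return (sorted list, number of comparisons)."""
    if len(values) <= 1:
        return list(values), 0
    pivot = random.choice(values)
    rest = [v for v in values if v != pivot]       # values are assumed distinct
    smaller = [v for v in rest if v < pivot]
    larger = [v for v in rest if v > pivot]
    left, c_left = quicksort_comparisons(smaller)
    right, c_right = quicksort_comparisons(larger)
    # every element of `rest` is compared with the pivot once at this level
    return left + [pivot] + right, len(rest) + c_left + c_right

n, trials = 500, 200
avg = sum(quicksort_comparisons(random.sample(range(10**6), n))[1]
          for _ in range(trials)) / trials

exact = sum(2.0 / (j - i + 1) for i in range(1, n) for j in range(i + 1, n + 1))
print(avg, exact, 2 * n * math.log(n))   # avg ≈ exact double sum; 2n log n is the large-n approximation
```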
9.1 Covariance

Proposition
If X & Y are independent, then for any functions h & g,
E[g(X)h(Y)] = E[g(X)] E[h(Y)]

Proof.
Suppose that X & Y are jointly continuous with joint density f(x, y).
E[g(X)h(Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x)h(y) f(x, y) dx dy
            = ∫∫ g(x)h(y) f_X(x) f_Y(y) dx dy
            = ∫ g(x) f_X(x) dx ∫ h(y) f_Y(y) dy
            = E[h(Y)] E[g(X)]

Definition: The covariance between X & Y is
Cov(X, Y) = E[(X − E[X])(Y − E[Y])]

Cov(X, Y) = E[XY − E[X]Y − E[Y]X + E[X]E[Y]]
          = E[XY] − E[X]E[Y]

If X & Y are independent, then Cov(X, Y) = 0.
(The converse does not hold; see the counterexample in the text, page 328.)
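A numerical sketch of the definition (the particular random pairs are illustrative choices): the sample version of E[XY] − E[X]E[Y] is close to Var(X) when Y = X + noise, and close to 0 when the pair is generated independently.

```python
import numpy as np

rng = np.random.default_rng(3)
m = 200_000

# Dependent pair: Y = X + noise, so Cov(X, Y) should be about Var(X) = 1
x = rng.normal(size=m)
y = x + 0.5 * rng.normal(size=m)
cov_xy = np.mean(x * y) - x.mean() * y.mean()    # E[XY] − E[X]E[Y]

# Independent pair: covariance should be about 0
u = rng.normal(size=m)
v = rng.normal(size=m)
cov_uv = np.mean(u * v) - u.mean() * v.mean()

print(cov_xy, cov_uv)   # ≈ 1.0 and ≈ 0.0
```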
9.1.1 Properties of Covariance
(i) Cov(X, Y) = Cov(Y, X)
(ii) Cov(X, X) = Var(X)
(iii) Cov(aX, Y) = a Cov(X, Y)
(iv) Cov( ∑_{i=1}^{n} X_i, ∑_{j=1}^{m} Y_j ) = ∑_i ∑_j Cov(X_i, Y_j)

Proof of (iv).
Let µ_i = E[X_i], ν_j = E[Y_j]. Then
E[ ∑_i X_i ] = ∑_i µ_i ,   E[ ∑_j Y_j ] = ∑_j ν_j

Cov( ∑_i X_i, ∑_j Y_j ) = E[ ( ∑_i X_i − ∑_i µ_i )( ∑_j Y_j − ∑_j ν_j ) ]
= E[ ∑_i (X_i − µ_i) ∑_j (Y_j − ν_j) ]
= E[ ∑_i ∑_j (X_i − µ_i)(Y_j − ν_j) ]
= ∑_i ∑_j E[(X_i − µ_i)(Y_j − ν_j)].
•From (ii) & (iv), Var
ÃXn
i=1
Xi
!
= Cov
Xn
i=1
Xi, Xn
j=1
Xj
= Xn
i=1
Xn
j=1
Cov (Xi,Xj)
= Xn
i=1
Var (Xi) + X X
i6=jCov (Xi,Xj)
∴ Var à n
X
i=1
Xi
!
= Xn
i=1
Var (Xi) + 2X X
i<jCov (Xi,Xj)
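A quick numerical check of the two-variable case, Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y), with an illustrative correlated pair:

```python
import numpy as np

rng = np.random.default_rng(4)
m = 200_000

x = rng.normal(size=m)
y = 0.7 * x + rng.normal(size=m)        # deliberately correlated with x

lhs = np.var(x + y)                                    # Var(X + Y)
rhs = np.var(x) + np.var(y) + 2 * np.cov(x, y)[0, 1]   # Var(X) + Var(Y) + 2 Cov(X, Y)

print(lhs, rhs)   # the two sides agree up to sampling error (≈ 3.89)
```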
• If X_1, ..., X_n are pairwise independent (i.e. X_i & X_j are independent for i ≠ j), then
Var( ∑_{i=1}^{n} X_i ) = ∑_{i=1}^{n} Var(X_i)   ... (∗)
Example.
X_1, ..., X_n : i.i.d. random variables with expected value µ and variance σ².

X̄ = (1/n) ∑_{i=1}^{n} X_i : sample mean
X_i − X̄ (i = 1, ..., n) : deviations
S² = ∑_{i=1}^{n} (X_i − X̄)² / (n−1) : sample variance

Then,
Var(X̄) = (1/n)² ∑_{i=1}^{n} Var(X_i) = σ²/n   by (∗)

(n−1)S² = ∑_{i=1}^{n} (X_i − µ + µ − X̄)²
        = ∑_i (X_i − µ)² + ∑_i (X̄ − µ)² − 2(X̄ − µ) ∑_i (X_i − µ)
        = ∑_{i=1}^{n} (X_i − µ)² − n(X̄ − µ)².

Take expectations:
(n−1)E[S²] = ∑_{i=1}^{n} E[(X_i − µ)²] − n E[(X̄ − µ)²]
           = nσ² − n Var(X̄)
           = (n−1)σ²

∴ E[S²] = σ²
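A simulation sketch of E[S²] = σ² (the normal samples are an illustrative choice): averaging the (n−1)-denominator sample variance over many samples recovers σ², while dividing by n instead is biased low.

```python
import numpy as np

rng = np.random.default_rng(5)
sigma2, n, trials = 4.0, 5, 200_000

# Many small samples from a Normal(0, sigma^2) distribution
samples = rng.normal(scale=np.sqrt(sigma2), size=(trials, n))

s2_unbiased = samples.var(axis=1, ddof=1)   # divide by n − 1 (the sample variance S²)
s2_biased = samples.var(axis=1, ddof=0)     # divide by n instead

print(s2_unbiased.mean(), s2_biased.mean())  # ≈ 4.0 and ≈ (n−1)/n · 4.0 = 3.2
```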
Example.
The variance of a binomial random variable X with parameters n & p.

Solution.
X : the number of successes in n independent trials, each with success probability p.
X = X_1 + ... + X_n, where
X_i = 1 if the i-th trial is a success (Bernoulli), 0 otherwise.

Var(X) = Var(X_1) + ... + Var(X_n)   by (∗)
Var(X_i) = E[X_i²] − (E[X_i])²
         = E[X_i] − (E[X_i])²   (since X_i² = X_i)
         = p − p²

∴ Var(X) = np(1 − p).
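A quick numerical check of Var(X) = np(1 − p), with illustrative parameters, built from Bernoulli indicators as in the solution:

```python
import numpy as np

rng = np.random.default_rng(6)
n, p, trials = 20, 0.3, 200_000

# X = X_1 + ... + X_n with the X_i i.i.d. Bernoulli(p)
x = rng.binomial(1, p, size=(trials, n)).sum(axis=1)

print(x.var(), n * p * (1 - p))   # both ≈ 4.2
```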
Definition
The correlation of two random variables X & Y is
ρ(X, Y) = Cov(X, Y) / √(Var(X) Var(Y))   (as long as Var(X) Var(Y) > 0)

• −1 ≤ ρ(X, Y) ≤ 1

Proof.
Suppose that X & Y have variances σ_x² & σ_y². Then

0 ≤ Var( X/σ_x + Y/σ_y ) = Var(X)/σ_x² + Var(Y)/σ_y² + 2 Cov(X, Y)/(σ_x σ_y) = 2[1 + ρ(X, Y)]
∴ ρ(X, Y) ≥ −1

0 ≤ Var( X/σ_x − Y/σ_y ) = Var(X)/σ_x² + Var(Y)/σ_y² − 2 Cov(X, Y)/(σ_x σ_y) = 2[1 − ρ(X, Y)]
∴ ρ(X, Y) ≤ 1
Remarks
Var(Z) = 0 ⇒ Z is constant with probability 1 (to be proved in Chapter 8).
ρ(X, Y) = 1 ⇔ Y = a + bX with b = σ_y/σ_x > 0
ρ(X, Y) = −1 ⇔ Y = a + bX with b = −σ_y/σ_x < 0
Remarks
The correlation coefficient is a measure of the degree of linearity between X & Y:
• ρ(X, Y) near −1 or +1 indicates a high degree of linearity between X & Y.
• ρ(X, Y) < 0 : Y tends to decrease when X increases.
• ρ(X, Y) = 0 : X & Y are uncorrelated.
• ρ(X, Y) > 0 : Y tends to increase when X does.
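An illustrative sketch of how ρ behaves (the linear and independent constructions below are arbitrary choices): ρ equals ±1 for exact linear relationships, with the sign of b, and is near 0 for independent variables.

```python
import numpy as np

def rho(a, b):
    """Correlation coefficient Cov(a, b) / sqrt(Var(a) Var(b)) from samples."""
    return np.cov(a, b)[0, 1] / np.sqrt(np.var(a, ddof=1) * np.var(b, ddof=1))

rng = np.random.default_rng(7)
x = rng.normal(size=100_000)

print(rho(x, 3.0 + 2.0 * x))             # b > 0        ⇒ ρ = +1
print(rho(x, 1.0 - 0.5 * x))             # b < 0        ⇒ ρ = −1
print(rho(x, rng.normal(size=100_000)))  # independent  ⇒ ρ ≈ 0
```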
Example.
I_A, I_B : indicator variables for the events A & B. Then
E[I_A] = P(A),  E[I_B] = P(B),  E[I_A I_B] = P(AB)

So,
Cov(I_A, I_B) = P(AB) − P(A)P(B)
              = P(B)[P(A|B) − P(A)]

Hence I_A & I_B are
positively correlated ⇔ P(A|B) > P(A)
uncorrelated          ⇔ P(A|B) = P(A)
negatively correlated ⇔ P(A|B) < P(A)
Example.
X_1, ..., X_n : i.i.d. with variance σ². Then,

Cov(X_i − X̄, X̄) = Cov(X_i, X̄) − Cov(X̄, X̄)
                 = Cov( X_i, (1/n) ∑_j X_j ) − Var(X̄)
                 = (1/n) ∑_j Cov(X_i, X_j) − σ²/n
                 = σ²/n − σ²/n = 0
Remarks
X̄ and X_i − X̄ are uncorrelated, but in general they are not independent.
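A simulation sketch of this example (the exponential samples are an illustrative choice): the empirical covariance between X_1 − X̄ and X̄ is near zero, even though the two quantities are clearly not independent.

```python
import numpy as np

rng = np.random.default_rng(8)
n, trials = 5, 200_000

# Many i.i.d. samples; an exponential is used so the deviations are clearly non-normal
samples = rng.exponential(scale=1.0, size=(trials, n))
xbar = samples.mean(axis=1)
dev1 = samples[:, 0] - xbar               # X_1 − X̄ for each sample

print(np.cov(dev1, xbar)[0, 1])           # ≈ 0: uncorrelated, though not independent
```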