RANDOM EVENTS AND THEIR PROBABILITIES
2.2. Polynomial scheme. The polynomial distribution
We can write
$$\frac{P(n,k+1)}{P(n,k)} = \frac{n!\,k!\,(n-k)!\,p^{k+1}q^{n-k-1}}{(k+1)!\,(n-k-1)!\,n!\,p^{k}q^{n-k}} = \frac{(n-k)\,p}{(k+1)\,q}.$$
Further, the inequality $P(n,k+1) > P(n,k)$ is equivalent to the inequality $(n+1)p > k+1$. Hence, if $(n+1)p > k+1$ (or $np - q > k$), then the probability $P(n,k)$ increases with the transition from $k$ to $k+1$; conversely, if $(n+1)p < k+1$ (or $np - q < k$), then $P(n,k)$ decreases with the transition from $k$ to $k+1$. If $(n+1)p = k+1$ (or $np - q = k$), then $P(n,k+1) = P(n,k)$.
Definition. The value $k^*$ of $k$ at which the probability $P(n,k)$, as a function of $k$, takes its greatest value,
$$P(n,k^*) = \max_{0 \le k \le n} P(n,k),$$
is called the most likely number of successes.
From the definition and the above arguments we obtain the following statement:
If $(n+1)p$ is not an integer, then $k^* = [(n+1)p]$, where $[a]$ is the integer part of the number $a$.
If $(n+1)p$ is an integer, then there are two most likely numbers of successes:
$$k^* = (n+1)p - 1 \quad \text{and} \quad k^* = (n+1)p.$$
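The rule above can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `binomial_pmf`, `most_likely_successes` and `brute_force_modes` are illustrative assumptions, and exact rational arithmetic (`fractions.Fraction`) is used so that ties between two modes are detected reliably.

```python
from fractions import Fraction
from math import comb, floor


def binomial_pmf(n, k, p):
    """P(n, k) = C(n, k) p^k q^(n-k); exact when p is a Fraction."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)


def most_likely_successes(n, p):
    """Most likely number(s) of successes, following the stated rule."""
    m = (n + 1) * p
    if m == floor(m):
        # (n+1)p is an integer: both (n+1)p - 1 and (n+1)p are modes
        # (values outside 0..n are discarded for the edge cases p = 0, p = 1).
        return [k for k in (int(m) - 1, int(m)) if 0 <= k <= n]
    # Otherwise the single mode is the integer part of (n+1)p.
    return [floor(m)]


def brute_force_modes(n, p):
    """All k in 0..n at which P(n, k) attains its maximum (direct search)."""
    probs = [binomial_pmf(n, k, p) for k in range(n + 1)]
    best = max(probs)
    return [k for k, value in enumerate(probs) if value == best]
```

For instance, with $n = 9$ and $p = 1/2$ the quantity $(n+1)p = 5$ is an integer, so both 4 and 5 are most likely; with $n = 10$ and $p = 1/2$ the only mode is 5.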
In conclusion, we note that for a sequence of $n$ independent Bernoulli trials with probability of success $p$, the probabilities of the events
a) no success occurred,
b) at least one success occurred,
c) there were at least $k_1$ and at most $k_2$ successes
can be found by the following formulas (prove!):
a) $P(n,0) = q^n = (1-p)^n$,
b) $\displaystyle\sum_{k=1}^{n} P(n,k) = 1 - P(n,0) = 1 - q^n = 1 - (1-p)^n$, (2)
c) $\displaystyle\sum_{k=k_1}^{k_2} P(n,k) = \sum_{k=k_1}^{k_2} C_n^k\, p^k q^{n-k}$.
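A small numerical sketch of formulas a) to c) may help when checking them. It is only an illustration under the assumptions stated in the comments; the function names are hypothetical and do not come from the text.

```python
from math import comb


def binomial_pmf(n, k, p):
    """P(n, k) = C(n, k) p^k (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def prob_no_success(n, p):
    """Formula a): P(n, 0) = (1 - p)^n."""
    return (1 - p)**n


def prob_at_least_one(n, p):
    """Formula b): 1 - (1 - p)^n."""
    return 1 - (1 - p)**n


def prob_between(n, k1, k2, p):
    """Formula c): sum of P(n, k) over k1 <= k <= k2."""
    return sum(binomial_pmf(n, k, p) for k in range(k1, k2 + 1))
```

Summing `binomial_pmf` over $k = 1, \dots, n$ should agree with `prob_at_least_one` up to floating-point rounding, which is exactly the content of formula (2).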
Then (corresponding to a sequence of n independent experiments) the sample space is:
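$$\Omega = \{\omega = (\omega_1, \omega_2, \dots, \omega_n):\ \omega_i \in \Omega_0,\ i = 1, 2, \dots, n\}, \qquad \Omega_0 = \{a_1, \dots, a_r\}.$$
We denote by $\nu_i(\omega)$ the number of outcomes equal to $a_i$ in the sequence $\omega = (\omega_1, \omega_2, \dots, \omega_n)$. In other words, $\nu_i(\omega)$ is the number of occurrences of the outcome $A_i$ in $n$ trials:
$$\nu_i(\omega) = \sum_{j=1}^{n} I_{\{\omega_j = a_i\}}(\omega), \qquad (3)$$
where $I_A(\omega)$ is the indicator of an event $A$:
$$I_A(\omega) = \begin{cases} 1, & \omega \in A; \\ 0, & \omega \notin A. \end{cases}$$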
Let us now define the probabilities of the elementary events $\omega \in \Omega$ by the formula
$$P(\omega) = p_1^{\nu_1(\omega)}\, p_2^{\nu_2(\omega)} \cdots p_r^{\nu_r(\omega)}, \qquad (4)$$
where $p_i > 0$, $p_1 + p_2 + \dots + p_r = 1$.
Let us show that the definition of probability by formula (4) is correct, i.e.
$$P(\Omega) = \sum_{\omega \in \Omega} P(\omega) = 1.$$
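Indeed,
$$\sum_{\omega \in \Omega} P(\omega) = \sum_{\omega \in \Omega} p_1^{\nu_1(\omega)} p_2^{\nu_2(\omega)} \cdots p_r^{\nu_r(\omega)} = \sum_{\substack{n_1 \ge 0, \dots, n_r \ge 0 \\ n_1 + \dots + n_r = n}} \ \sum_{\{\omega:\ \nu_1(\omega) = n_1, \dots, \nu_r(\omega) = n_r\}} p_1^{n_1} p_2^{n_2} \cdots p_r^{n_r}$$
$$= \sum_{\substack{n_1 \ge 0, \dots, n_r \ge 0 \\ n_1 + \dots + n_r = n}} C_n(n_1, n_2, \dots, n_r)\, p_1^{n_1} p_2^{n_2} \cdots p_r^{n_r},$$
where $C_n(n_1, n_2, \dots, n_r)$ is the total number of elementary events $\omega = (\omega_1, \omega_2, \dots, \omega_n) \in \Omega$ with $n_1$ elements $a_1$, $n_2$ elements $a_2$, ..., $n_r$ elements $a_r$.
Then, by formula (16) from § 1,
$$C_n(n_1, n_2, \dots, n_r) = \frac{n!}{n_1!\, n_2! \cdots n_r!}.$$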
Therefore
$$\sum_{\omega \in \Omega} P(\omega) = \sum_{\substack{n_1 \ge 0, \dots, n_r \ge 0 \\ n_1 + \dots + n_r = n}} \frac{n!}{n_1!\, n_2! \cdots n_r!}\, p_1^{n_1} p_2^{n_2} \cdots p_r^{n_r} = (p_1 + p_2 + \dots + p_r)^n = 1,$$
and this proves the correctness of the definition (4).
Let the event $A_{n_1, n_2, \dots, n_r}$ mean that, as a result of $n$ independent trials, the event $A_1$ appeared $n_1$ times, ..., the event $A_r$ appeared $n_r$ times:
$$A_{n_1, n_2, \dots, n_r} = \{\omega \in \Omega:\ \nu_1(\omega) = n_1,\ \nu_2(\omega) = n_2,\ \dots,\ \nu_r(\omega) = n_r\}.$$
The probability of this event is equal to
$$P(A_{n_1, n_2, \dots, n_r}) = P(n; n_1, n_2, \dots, n_r) = \frac{n!}{n_1!\, n_2! \cdots n_r!}\, p_1^{n_1} p_2^{n_2} \cdots p_r^{n_r}. \qquad (5)$$
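Formula (5) and the normalization shown above can be verified directly for small $n$ and $r$. The sketch below is an illustration only; `multinomial_pmf` and `compositions` are hypothetical helper names, and exact fractions are assumed for the probabilities $p_i$ so that the sum equals 1 exactly.

```python
from fractions import Fraction
from math import comb, factorial, prod


def multinomial_pmf(counts, probs):
    """Formula (5): n! / (n_1! ... n_r!) * p_1^n_1 ... p_r^n_r."""
    n = sum(counts)
    coefficient = factorial(n)
    for n_i in counts:
        coefficient //= factorial(n_i)  # each intermediate quotient is an integer
    return coefficient * prod(p**n_i for p, n_i in zip(probs, counts))


def compositions(n, r):
    """All tuples (n_1, ..., n_r) of non-negative integers with sum n."""
    if r == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, r - 1):
            yield (first,) + rest


# Example with r = 3 outcomes and n = 4 trials (values chosen for illustration).
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
total = sum(multinomial_pmf(c, probs) for c in compositions(4, 3))
```

Here `total` is the sum of $P(4; n_1, n_2, n_3)$ over all admissible $(n_1, n_2, n_3)$; by the argument above it equals $(p_1 + p_2 + p_3)^4 = 1$. With $r = 2$ the same function reduces to the binomial probabilities.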
The set of probabilities $\{P(n; n_1, n_2, \dots, n_r)\}$ is called a polynomial (or multinomial) distribution, and the constructed probability model is called a polynomial scheme. It is clear that the binomial scheme is a special case of a polynomial scheme.
If $n = 1$, i.e. only one experiment takes place, then the event $\{\nu_i = 1,\ \nu_j = 0,\ j \ne i\}$ means that only the event $A_i$ occurs as a result of the experiment, and by formula (5) the probability of this event is $P(A_i) = p_i$.
Thus, for a polynomial scheme, the probabilities $p_1, p_2, \dots, p_r$ have the meaning of the probabilities of appearance (in one experiment) of the events $A_1, A_2, \dots, A_r$, respectively.
Similarly, for $n = 2$ and $i \ne j$, the probability that $A_i$ occurs in the first experiment and $A_j$ in the second is $p_i p_j = P(A_i) P(A_j)$, etc., which means that the events $A_1, A_2, \dots, A_r$ relating to different experiments are independent.
2.3. Hypergeometric and multidimensional hypergeometric distributions
Let the general population $\Omega_0$ contain $n_1$ elements of the first kind $a_1, a_2, \dots, a_{n_1}$ and $n_2$ elements of the second kind $b_1, b_2, \dots, b_{n_2}$, in total $n_1 + n_2 = n$ elements:
$$\Omega_0 = \{a_1, a_2, \dots, a_{n_1};\ b_1, b_2, \dots, b_{n_2}\}, \qquad |\Omega_0| = n_1 + n_2 = n.$$
The question is: if a random sample of size $k$ is taken from this general population without replacement, what is the probability that it contains exactly $k_1$ elements of the first kind and exactly $k_2 = k - k_1$ elements of the second kind? (It is clear that $k \le n$, $k_i \le \min(n_i, k)$, $i = 1, 2$.)
This problem can be formulated differently: from an urn containing $n_1$ white and $n_2 = n - n_1$ black balls, $k$ balls are randomly selected. What is the probability that exactly $k_1$ white and $k_2 = k - k_1$ black balls will be among them?
The space of elementary events corresponding to this problem can be described, for example, as follows:
$$\Omega = \{\omega = (\omega_1, \omega_2, \dots, \omega_k):\ \omega_i \in \Omega_0,\ \omega_i \ne \omega_j\ (i \ne j),\ i, j = 1, 2, \dots, k\}.$$
Then $|\Omega| = (n)_k$, where $(n)_k = n(n-1)\cdots(n-k+1)$, and the number of elements of $\Omega$ with $k_1$ elements of the first kind and $k_2$ elements of the second kind is equal to $C_k^{k_1} (n_1)_{k_1} (n_2)_{k_2}$. Then the required probability, according to the classical definition, is
$$P_{n,n_1}(k, k_1) = \frac{C_k^{k_1} (n_1)_{k_1} (n_2)_{k_2}}{(n)_k} = \frac{C_{n_1}^{k_1} C_{n_2}^{k_2}}{C_n^k} = \frac{C_{n_1}^{k_1} C_{n-n_1}^{k-k_1}}{C_n^k}. \qquad (6)$$
The set of probabilities $\{P_{n,n_1}(k, k_1)\}$ is called a hypergeometric distribution.
This distribution could also be derived in another way: from the $n$ balls in the urn we can select $k$ balls in $C_n^k$ ways, and from $n_1$ white and $n_2 = n - n_1$ black balls we can select $k_1$ white and $k_2 = k - k_1$ black balls in $C_{n_1}^{k_1} C_{n-n_1}^{k-k_1}$ ways (because any set of black balls can be combined with any set of white balls).
Using the properties of the binomial coefficients, we see that the probabilities $P_{n,n_1}(k, k_1)$ can also be calculated by the following formula:
$$P_{n,n_1}(k, k_1) = \frac{C_k^{k_1} C_{n-k}^{n_1-k_1}}{C_n^{n_1}}. \qquad (6')$$
In formulas (6) and (6′), as we have already noted, $k_1 = 0, 1, 2, \dots, \min(n_1, k)$.
Since $C_m^s = 0$ for $s > m$, for all $k_1 > n_1$ or $k_1 > k$ the probabilities defined by formulas (6), (6′) are equal to zero. Considering this, we can assume in formulas (6), (6′) that $k_1$ varies from 0 to $k$.
The numbers $P_{n,n_1}(k, k_1)$ form a probability distribution, therefore
$$\sum_{k_1=0}^{k} P_{n,n_1}(k, k_1) = 1,$$
and, summing (6) over $k_1$ from 0 to $k$, we obtain the following property of the binomial coefficients:
$$\sum_{k_1=0}^{k} C_{n_1}^{k_1} C_{n-n_1}^{k-k_1} = C_n^k. \qquad (7)$$
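As a sanity check, formulas (6), (6′) and identity (7) can be compared numerically for small parameters. The sketch below is illustrative; the function names are assumptions, and `math.comb` is relied upon to return 0 when the lower index exceeds the upper one, in agreement with the convention $C_m^s = 0$ for $s > m$.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(n, n1, k, k1):
    """Formula (6): C(n1, k1) * C(n - n1, k - k1) / C(n, k)."""
    return Fraction(comb(n1, k1) * comb(n - n1, k - k1), comb(n, k))


def hypergeom_pmf_alt(n, n1, k, k1):
    """Formula (6'): C(k, k1) * C(n - k, n1 - k1) / C(n, n1)."""
    if k1 > n1:
        # math.comb rejects negative arguments, so this zero case is handled explicitly.
        return Fraction(0)
    return Fraction(comb(k, k1) * comb(n - k, n1 - k1), comb(n, n1))


def vandermonde_sum(n, n1, k):
    """Left-hand side of identity (7)."""
    return sum(comb(n1, k1) * comb(n - n1, k - k1) for k1 in range(k + 1))
```

For example, with $n = 10$, $n_1 = 3$, $k = 5$ both formulas give the same probabilities for every $k_1 = 0, \dots, 5$, and `vandermonde_sum(10, 3, 5)` equals $C_{10}^{5} = 252$.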
If now the population $\Omega_0$ of size $n$ contains $n_1$ elements of the 1st type $a_1, a_2, \dots, a_{n_1}$; $n_2$ elements of the 2nd type $b_1, b_2, \dots, b_{n_2}$; ...; $n_r$ elements of type $r$ $c_1, c_2, \dots, c_{n_r}$ ($r \ge 2$, $n_1 + n_2 + \dots + n_r = n$), then, repeating the above arguments, we obtain: the probability that a sample of size $k$, selected at random without replacement from the general population, contains exactly $k_1$ elements of the 1st type, $k_2$ elements of the 2nd type, …, $k_r$ elements of type $r$, is equal to
$$P_{n,n_1,\dots,n_r}(k, k_1, \dots, k_r) = \frac{C_{n_1}^{k_1} C_{n_2}^{k_2} \cdots C_{n_r}^{k_r}}{C_n^k}. \qquad (8)$$
The set of probabilities $\{P_{n,n_1,\dots,n_r}(k, k_1, \dots, k_r)\}$ is called a multidimensional hypergeometric distribution.
The structure of the hypergeometric (especially the multidimensional hypergeometric) distribution is very complex. For example, the probability $P_{n,n_1}(k, k_1)$ contains nine factorials. Therefore, finding formulas for the approximate calculation of the probabilities of a hypergeometric distribution is very important.
Let us give one result in this direction.
Theorem 1 (Approximation of the hypergeometric distribution by the binomial distribution). Let $n_1 + n_2 = n$; $n \to \infty$, $n_1 \to \infty$, but so that $\dfrac{n_1}{n} \to p \in [0, 1]$, i.e. $\dfrac{n_2}{n} \to 1 - p \in [0, 1]$. Then
$$P_{n,n_1}(k, k_1) = \frac{C_{n_1}^{k_1} C_{n_2}^{k_2}}{C_n^k} \to P(k, k_1) = C_k^{k_1}\, p^{k_1} (1 - p)^{k - k_1}. \qquad (9)$$
Proof. In formula (6) obtained for the probabilities $P_{n,n_1}(k, k_1)$, let us divide the numerator and the denominator by $n^k$. Then we pass to the limit as $n \to \infty$, $n_1 \to \infty$. Using the conditions of the theorem, we have
$$P_{n,n_1}(k, k_1) = \frac{C_{n_1}^{k_1} C_{n_2}^{k_2}}{C_n^k} = \frac{k!}{k_1!\, k_2!} \cdot \frac{n_1 (n_1 - 1) \cdots (n_1 - k_1 + 1) \cdot n_2 (n_2 - 1) \cdots (n_2 - k_2 + 1)}{n (n - 1)(n - 2) \cdots (n - k + 1)}$$
$$= \frac{k!}{k_1!\, k_2!} \cdot \frac{\dfrac{n_1}{n}\left(\dfrac{n_1}{n} - \dfrac{1}{n}\right) \cdots \left(\dfrac{n_1}{n} - \dfrac{k_1 - 1}{n}\right) \cdot \dfrac{n_2}{n}\left(\dfrac{n_2}{n} - \dfrac{1}{n}\right) \cdots \left(\dfrac{n_2}{n} - \dfrac{k_2 - 1}{n}\right)}{\left(1 - \dfrac{1}{n}\right)\left(1 - \dfrac{2}{n}\right) \cdots \left(1 - \dfrac{k - 1}{n}\right)}$$
$$\to C_k^{k_1}\, p^{k_1} (1 - p)^{k - k_1} = P(k, k_1). \qquad \blacksquare$$
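The convergence stated in Theorem 1 can be observed numerically. The following sketch is an illustration under stated assumptions: the helper names are hypothetical, $n_1$ is taken as `round(n * p)` so that $n_1/n \to p$, and the sample size $k$ is kept fixed while $n$ grows.

```python
from math import comb


def hypergeom_pmf(n, n1, k, k1):
    """Formula (6), evaluated in floating point."""
    return comb(n1, k1) * comb(n - n1, k - k1) / comb(n, k)


def binomial_pmf(k, k1, p):
    """Limit distribution (9): C(k, k1) p^k1 (1-p)^(k-k1)."""
    return comb(k, k1) * p**k1 * (1 - p)**(k - k1)


def max_abs_error(n, p, k):
    """Largest gap between (6) and (9) over k1 = 0..k, with n1 = round(n * p)."""
    n1 = round(n * p)
    return max(
        abs(hypergeom_pmf(n, n1, k, k1) - binomial_pmf(k, k1, p))
        for k1 in range(k + 1)
    )


# The gap should shrink as the population grows (p = 0.3, k = 5 chosen for illustration).
errors = [max_abs_error(n, 0.3, 5) for n in (100, 1000, 10000)]
```

The entries of `errors` decrease as $n$ increases, in line with the intuition that, for a large population, sampling without replacement behaves almost like sampling with replacement.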
The assertion of the theorem shows that under the stated assumptions the hypergeometric distribution is approximated by the binomial distribution, which is intuitively clear: if $n$ and $n_1$ are large, then selection without replacement should give almost the same result as selection with replacement.
As we see, when calculating probabilities using binomial or hypergeometric distributions, we must calculate the factorials of sufficiently large numbers. The numbers $n!$ grow very rapidly with increasing $n$ (e.g., $15! = 1\,307\,674\,368\,000$, and $100!$ contains 158 digits). Therefore, both from the theoretical and from the computational point of view, the well-known Stirling formula is important: for $n \gg 1$ ($n$ is a sufficiently large number)
$$n! = \sqrt{2\pi n}\; n^n e^{-n} e^{\theta_n / (12 n)}, \qquad 0 < \theta_n < 1. \qquad (10)$$
Example (Playing in Sports Lottery). An urn contains 49 identical balls with numbers 1, 2, ..., 49, and among them there are 6 balls with winning (lucky) numbers. Six balls are selected from the urn at random without replacement.
Winning is determined by the number of winning balls among the selected balls. Find the probability pr of retrieving r (r = 0,1,2,…,6) balls with winning numbers from the urn.
Solution. As you can easily guess, the required probabilities can be found with the aid of a hypergeometric distribution: the total number of balls in the urn is $n = 49$, of which $n_1 = 6$ are winning; we select $k = 6$ balls, of which $k_1 = r$ are winning.
Then, by formula (6),
$$p_r = P_{49,6}(6, r) = \frac{C_6^r\, C_{49-6}^{6-r}}{C_{49}^6}, \qquad r = 0, 1, \dots, 6.$$
Calculations show that
$p_0 \approx 0.435965$; $p_1 \approx 0.413019$; $p_2 \approx 0.132378$; $p_3 \approx 0.017650$;
$p_4 \approx 0.000969$; $p_5 \approx 0.000018$; $p_6 \approx 0.00000007151$.
So, the probability of a maximal gain is $p_6 \approx 7.2 \cdot 10^{-8}$, and the probability of a gain (i.e. the probability of drawing from the urn at least three balls with winning numbers) is equal to
$$p_3 + p_4 + p_5 + p_6 = 1 - (p_0 + p_1 + p_2) \approx 1 - 0.981362 = 0.018638.$$
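The values above can be reproduced with a few lines of code. This is a minimal sketch; the function name `lottery_probability` and its default arguments are assumptions made for the illustration, and exact fractions are used before converting to decimals.

```python
from fractions import Fraction
from math import comb


def lottery_probability(r, total=49, winning=6, drawn=6):
    """p_r by formula (6): C(6, r) * C(43, 6 - r) / C(49, 6) for the defaults."""
    return Fraction(
        comb(winning, r) * comb(total - winning, drawn - r),
        comb(total, drawn),
    )


probabilities = [lottery_probability(r) for r in range(7)]

# Probability of a gain: at least three winning numbers among the six drawn.
gain = sum(probabilities[3:])

for r, p_r in enumerate(probabilities):
    print(f"p_{r} = {float(p_r):.8f}")
print(f"P(gain) = {float(gain):.6f}")
```

Since $C_{49}^{6} = 13\,983\,816$, the jackpot probability is exactly $1/13\,983\,816$.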
In connection with the hypergeometric distribution, we make one remark about the nature of problems in probability theory and mathematical statistics. Knowing the composition of the general population, we can determine the likely composition of the sample by using the hypergeometric distribution. This is a typical direct probabilistic problem.
But often it is necessary to solve inverse problems, i.e. to determine the nature of the population from the composition of the samples. Such (figuratively speaking) inverse problems form the content of mathematical statistics.
2.4. Tasks for independent work
1. Show that for a polynomial distribution $\{P(A_{n_1, n_2, \dots, n_r})\}$ (see formula (5)), the greatest probability value is attained at a point $(k_1, k_2, \dots, k_r)$ that satisfies the inequalities
$$n p_i - 1 < k_i \le (n + r - 1)\, p_i, \qquad i = 1, 2, \dots, r.$$
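2. Using probabilistic considerations, prove the following relations:
a) $\displaystyle\sum_{k=0}^{n} C_n^k = 2^n$;
b) $\displaystyle\sum_{k=0}^{n} (C_n^k)^2 = C_{2n}^n$;
c) $\displaystyle\sum_{k=0}^{n} k\, C_n^k = n\, 2^{n-1}$;
d) $\displaystyle\sum_{k=0}^{n} k(k-1)\, C_n^k = n(n-1)\, 2^{n-2}$ $(n \ge 2)$.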
3. There are $n$ white and $m$ black balls in the urn ($n \ge 2$, $m \ge 2$). Two balls are taken at random (without replacement) out of the urn. Find the probabilities of the events:
a) balls have the same color;
b) balls have different colors.
4. There are n tickets with the numbers 1,2,...,n, there are r winning tickets among them.
Someone bought r tickets. Find the probability that at least one of his tickets is winning.
5. The numbers 2, 4, 6, 7, 8, 11, 12 and 13 are written on eight cards. Two cards are chosen at random. Find the probability that you can reduce the fraction which is composed of two numbers written on these cards.
6. To reduce the number of participating teams in a sports competition, the $2n$ teams taking part are divided into two equal groups. Find the probability that the two strongest teams will fall into:
a) different groups;
b) one group.
7. Ten numbers are randomly chosen from the set of numbers 1, 2, ..., 20. Find the probabilities of the following events:
A = {all numbers are even}; B = {exactly three numbers are multiples of four};
C = {there are five even and five odd numbers, and exactly one number is a multiple of ten}.
8. One of the non-empty subsets of an n-element set is randomly chosen. Find the probability that the selected subset contains an even number of elements.
9. An urn contains five balls of different colors. A sample of size 25 is drawn from the urn with replacement. Find the probability that the sample contains exactly five balls of each of the five colors.
10. Two balls are randomly selected (with replacement) from an urn containing white and black balls. Prove that the probability of choosing balls of the same color is not less than 1/2.
§3. Geometric probabilities
Let $\Omega = \{\omega\}$ be a bounded subset of the $n$-dimensional Euclidean space $R^n$. We will assume that the concept of «volume» makes sense for $\Omega$ (for $n = 1$, length; for $n = 2$, area; for $n = 3$, the usual volume, etc.). We denote by $\mathcal{E} = \mathcal{E}(\Omega)$ the system of subsets of $\Omega$ (events) that have «volumes», and for any event $A \in \mathcal{E}(\Omega)$ we define its probability by the relation
$$P(A) = \frac{\operatorname{mes}(A)}{\operatorname{mes}(\Omega)}, \qquad (1)$$
where $\operatorname{mes}(A)$ is the «volume» of the event (the set) $A$.
The definition of probability by the formula (1) is called a geometric definition of probability.
The constructed model can be considered as a model of an experiment consisting of randomly throwing a point into the domain $\Omega$. (Here and in the following we will understand expressions of the type «The point is randomly thrown into the domain $\Omega$» or «The random point is uniformly distributed in the domain $\Omega$» as «The point dropped at random into the domain $\Omega$ can reach any point of $\Omega$, and the probability of this point falling into some part $A$ of $\Omega$ is proportional to the «volume» of this part and does not depend on the form and location of this part in $\Omega$».)
Examples
1. A random point is placed on a segment of length $l$ (say, the segment $[0, l]$); as a result, the segment is divided into two parts.
Find the probability that the length of the larger part does not exceed $\tfrac{4}{5} l$ (event $A$).
Solution. Denote by $x$ the length of one of the parts; then the length of the second part is equal to $l - x$ (Fig. 1).
Fig. 1. The segment $[0, l]$ divided into parts of lengths $x$ and $l - x$.
Then the sample space is
$$\Omega = \{x:\ 0 \le x \le l\} = [0, l],$$
and the desired event is
$$A = \{x \in \Omega:\ \max(x,\ l - x) \le \tfrac{4}{5} l\} = \left[\tfrac{1}{5} l,\ \tfrac{4}{5} l\right].$$
Therefore, by formula (1), we have
$$P(A) = \frac{\operatorname{mes}(A)}{\operatorname{mes}(\Omega)} = \frac{\tfrac{3}{5} l}{l} = \frac{3}{5}.$$
2. At a random moment of time $x$, a signal of length $\Delta$ appears on the time segment $[0, T]$. The receiver is switched on at a random moment $y \in [0, T]$ for a time $t$.
Find the probability of detecting the signal by the receiver.
Solution. The sample space is the domain
$$\Omega = \{(x, y):\ 0 \le x, y \le T\} = [0, T] \times [0, T].$$
If the signal appears first and the receiver is switched on later, i.e. if $x \le y$, then the signal is detected only when $y - x \le \Delta$.
Similarly, if $y \le x$, then the signal can be detected only if $x - y \le t$. Thus, the event we need,
$$A = \{(x, y) \in \Omega:\ y - x \le \Delta,\ y \ge x, \ \text{or}\ \ x - y \le t,\ x \ge y\},$$
is the region shaded in Fig. 2.
Fig. 2
We can find the required probability using formula (1):
$$P(A) = \frac{T^2 - \frac{1}{2}(T - \Delta)^2 - \frac{1}{2}(T - t)^2}{T^2} = 1 - \frac{1}{2}\left(1 - \frac{\Delta}{T}\right)^2 - \frac{1}{2}\left(1 - \frac{t}{T}\right)^2. \qquad (2)$$
3. The meeting problem. Two people, A and B, agreed to meet in the time interval $[0, T]$. If A (or B) arrives first at the agreed location, then he will wait for B (or A) for a period of time $\Delta$ (or $t$) and, if the other does not appear, leaves.
Find the probability that the meeting will take place.
Solution. Note that this is a rephrasing of problem 2: the signal is A, the receiver is B. Consequently, the probability of the meeting is found from formula (2).
For example, if $T = 1$ hour, $t = 15$ minutes, $\Delta = 20$ minutes, then $p = 143/288$. If $T = 1$ hour, $t = \Delta = 15$ minutes, then $p = 7/16$.
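Formula (2) and the two numerical answers can be checked both exactly and by simulation. The sketch below is an illustration only: the function names are hypothetical, the formula is applied under the assumption $0 \le \Delta \le T$, $0 \le t \le T$ with integer arguments (minutes), and the simulation uses a fixed seed for reproducibility.

```python
import random
from fractions import Fraction


def detection_probability(T, delta, t):
    """Formula (2); assumes integers with 0 <= delta <= T and 0 <= t <= T."""
    half = Fraction(1, 2)
    return 1 - half * (1 - Fraction(delta, T))**2 - half * (1 - Fraction(t, T))**2


def simulate_detection(T, delta, t, trials=200_000, seed=0):
    """Monte Carlo estimate: fraction of points (x, y) with -t <= y - x <= delta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.uniform(0, T)  # moment at which the signal appears
        y = rng.uniform(0, T)  # moment at which the receiver is switched on
        if -t <= y - x <= delta:
            hits += 1
    return hits / trials
```

With times measured in minutes, `detection_probability(60, 20, 15)` gives $143/288$ and `detection_probability(60, 15, 15)` gives $7/16$; the simulated frequency should be close to these values.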
4. The Buffon Problem. Parallel straight lines, separated by a distance of $2a$, are drawn in the plane. A needle of length $2l$ ($l < a$) is randomly thrown onto this plane.
What is the probability that the needle will cross one of the parallel straight lines?
Solution. First, we describe the space of elementary events corresponding to this experiment. Let $x$ be the distance from the center of the needle to the nearest straight line, and $\varphi$ the angle between the needle and this line. Then the pair $(\varphi, x)$ fully determines the position of the needle relative to the nearest straight line (Fig. 3). Since it is sufficient for us to know the position of the needle relative to the nearest straight line, the space of elementary events $\Omega$ is the rectangle
$$\Omega = \{(\varphi, x):\ 0 \le \varphi \le \pi,\ 0 \le x \le a\} = [0, \pi] \times [0, a].$$
The needle can intersect the straight line only if the condition $x \le l \sin\varphi$ holds. Therefore, the event we need, $A = \{(\varphi, x) \in \Omega:\ x \le l \sin\varphi\}$, is the region shaded in Fig. 4. Then
$$P(A) = \frac{\operatorname{mes}(A)}{\operatorname{mes}(\Omega)} = \frac{\int_0^{\pi} l \sin\varphi \, d\varphi}{a\pi} = \frac{2l}{a\pi}.$$
Fig. 3 Fig. 4
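The answer $2l/(a\pi)$ can be compared with a direct simulation of the model described in the solution. The code below is a sketch under that model's assumptions ($\varphi$ uniform on $[0, \pi]$, $x$ uniform on $[0, a]$, independent); the function names and the fixed seed are illustrative choices.

```python
import math
import random


def buffon_probability(l, a):
    """Exact answer from the solution: 2l / (a * pi), for half-length l < half-spacing a."""
    return 2 * l / (a * math.pi)


def simulate_buffon(l, a, trials=200_000, seed=0):
    """Monte Carlo estimate of the crossing probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        phi = rng.uniform(0, math.pi)  # angle between the needle and the lines
        x = rng.uniform(0, a)          # distance from the needle center to the nearest line
        if x <= l * math.sin(phi):     # crossing condition from the solution
            hits += 1
    return hits / trials
```

For $l = 1$, $a = 2$ the exact value is $1/\pi \approx 0.318$, and the simulated frequency should agree with it to within the usual Monte Carlo error.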
3.1. Tasks for independent work
1. Three points are placed at random on the half-line $[0, \infty)$.
Find the probability that we can make a triangle from the segments formed from the point zero («0») to the given three points.
2. Two points are placed at random into the segment of length of l.
Find the probability that a triangle can be made from the three formed segments.
3. Three points, one after another, are put at random on the segment of a line. Find the probability of hitting a third point between the first two points.
4. A random point X is placed on a segment AB of length $a$, then a random point Y is placed on a segment BC of length $b$.
Assuming that the points A, B, C are on the line in this order, find the probability of forming a triangle from the segments AX, BY, XY.
5. A random point is thrown into the sphere of radius R.
Find the probability that the distance from this point to the center of the sphere does not exceed r.
6. A random point is placed in the square.
Find the probability that the distance from this point to the vertices of the square exceeds half of the length of the side of the square.
7. A random point A is placed in the square with the side a.
Find the probability that the distance from A to the nearest side of the square does not exceed the distance from A to the nearest diagonal of the square.
8. The point X is randomly placed on the semicircle $C = \{(x, y):\ x^2 + y^2 = R^2,\ y \ge 0\}$. Find the probabilities of the following events:
a) the abscissa of the point X lies in the segment $[-r, r]$;
b) the ordinate of the point X lies in the segment $[r, R]$.
9. The plane is marked with parallel straight lines at the same distance $a$ from each other. A coin (circle) of radius $r$ ($r < a/2$) is randomly thrown onto the plane.
Find the probability that the coin does not intersect any straight line.
10. The Bertrand Paradox. Two points are randomly chosen on a circle of radius $r$.
They are connected by a chord.
Find the probability that the length of the chord will exceed $\sqrt{3}\, r$ (that is, the length of the side of an equilateral triangle inscribed in the circle).
11. Continuation. A point is randomly chosen on a circle of radius $r$; a diameter is drawn through it. A random point (the midpoint of the chord that is perpendicular to the diameter) is taken on the diameter.
Find the probability that the length of the obtained chord will exceed $\sqrt{3}\, r$.
12. Continuation. A point is placed at random inside a circle of radius $r$. This point is the midpoint of the chord that is perpendicular to the diameter passing through it.
Find the probability that the length of the obtained chord will exceed $\sqrt{3}\, r$.
13. Two points are placed at random on the segments $[-a, a]$ and $[-b, b]$, $a > 0$, $b > 0$; $p$ and $q$ are their coordinates (respectively).
Find the probability that the roots of the quadratic equation $x^2 + px + q = 0$ are real numbers.
14. A segment of length $a_1 + a_2$ is divided into two parts of lengths $a_1$ and $a_2$, respectively. Then $n$ points are randomly placed on this segment.
Find the probability that exactly $m$ out of the $n$ points will be placed on the part of length $a_1$.
15. Continuation. A segment of length $a_1 + a_2 + \dots + a_s$ is divided into $s$ parts of lengths $a_1, a_2, \dots, a_s$. Then $n$ points are randomly placed on this segment.
Find the probability that $m_1, m_2, \dots, m_s$ ($m_1 + m_2 + \dots + m_s = n$) points will be placed on the parts of lengths $a_1, a_2, \dots, a_s$ (respectively).
Chapter