Duplication of this publication or parts thereof is permitted only under the provisions of the copyright law of the publisher's location, in its current version, and permission for use must always be obtained from Springer. Acceleration was possible without sacrificing essential content through speedy use of the Internet.
Typical Problems of Data Analysis
This stochastic tendency is either rooted in the nature of the experiment (experimental animals are necessarily different, radioactivity is a stochastic phenomenon), or it is a consequence of the unavoidable uncertainties of the experimental equipment, i.e. measurement errors. It is often useful to simulate with a computer the variable or stochastic characteristics of the experiment to get an idea of the expected uncertainties of the results before the experiment itself is carried out.
On the Structure of this Book
Linear and polynomial regression, the subject of Chapter 12, is a special case of the method of least squares and is therefore already covered in Chapter 9. The most important concepts of computer graphics are introduced and all necessary explanations are given for the use of this class.
About the Computer Programs
In the analysis, you generally have to make an assumption about the distribution of the errors. If this assumption is not correct, then the results of the analysis are not optimal.
Experiments, Events, Sample Space
Often one or more special subspaces of the sample space are of particular interest. The entire sample space corresponds to an event that will occur in each experiment, which we call E.
The Concept of Probability
In the rest of this chapter, we will define what we mean by the probability that an event will occur and present rules for calculations with probabilities. In a large number N of experiments, the event A is observed n times. We define P(A) = lim (N→∞) n/N as the probability of the occurrence of the event A.
Rules of Probability Calculus: Conditional Probability
“The probability that the party will win the next election is 1/3.” As another example, consider the case of a certain trace in nuclear emulsion that could be left by a proton or pion. Two events A and B are said to be independent if knowing that A has occurred does not change the probability of B and vice versa, i.e. if P(B|A) = P(B), or equivalently P(AB) = P(A)P(B).
Examples
Probability for n Dots in the Throwing of Two Dice
Lottery 6 Out of 49
That is exactly the reciprocal of the number of combinations C(49, 6) of 6 elements out of 49 (see Appendix B), since all these combinations are equally probable, but only one of them contains only the drawn numbers. The result of the drawing is an example of a set of 49 elements, 6 of which are of one type and 43 of the other.
Three-Door Game
Now we can say that there are two types of balls in the bowl, namely 6 balls which are of interest to the player because they bear the numbers he has chosen and 43 balls whose numbers the player has not chosen. The pattern itself contains 6 elements that are drawn without returning the elements to the container.
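The counting argument for the lottery can be checked numerically. The following is a minimal sketch, not taken from the book: the function name `match_probability` is hypothetical, and the formula used is the standard hypergeometric probability for drawing without replacement from 6 chosen and 43 unchosen balls.
```python
from math import comb

def match_probability(k, chosen=6, total=49):
    """Probability that exactly k of the player's `chosen` numbers are drawn.

    Hypergeometric probability: `chosen` balls are drawn without replacement
    from `total` balls, of which `chosen` carry the player's numbers.
    """
    unchosen = total - chosen
    return comb(chosen, k) * comb(unchosen, chosen - k) / comb(total, chosen)

# Number of equally probable combinations of 6 out of 49.
n_combinations = comb(49, 6)

# Probability of matching all six numbers is the reciprocal of that count.
p_all_six = match_probability(6)

if __name__ == "__main__":
    print(n_combinations, p_all_six)
    for k in range(7):
        print(k, match_probability(k))
```
The probabilities for k = 0, ..., 6 sum to one, which is a convenient sanity check on the counting.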
Random Variables: Distributions
Random Variables
Distributions of a Single Random Variable
If x can assume only a finite number of discrete values, e.g., the number of dots on the faces of a die, then the distribution function is a step function. A trivial example of a continuous distribution is given by the angular position of a clock hand read at random intervals.
Functions of a Single Random Variable, Expectation Value, Variance, Moments
Like the variance itself, it is a measure of the average deviation of the measurements x from the expected value. It is positive (negative) if the distribution is skewed to the right (left) of the mean.
Distribution Function and Probability Density of Two Variables: Conditional Probability
It is called the probability density of the Lorentz or Breit-Wigner distribution and plays an important role in the physics of resonance phenomena. For some studies, the dependence on the season may be of no interest. Integrating the joint density f(x, y) over y then gives g(x) = ∫ f(x, y) dy, which is the probability density for x.
Expectation Values, Variance, Covariance, and Correlation
As with a single variable, the moments below are of particular importance. Therefore, in the case of linear dependence (b positive) between x and y, the correlation coefficient has the value ρ(x,y) = +1.
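The statement about linear dependence can be illustrated with a few numbers. The sketch below is an assumption-laden illustration: it uses a small set of points lying exactly on y = a + b·x and computes the correlation coefficient from averages over those points, standing in for the expectation values of the text. The function name `correlation` is hypothetical.
```python
from math import sqrt
from statistics import fmean

def correlation(xs, ys):
    """Correlation coefficient cov(x, y) / (sigma_x * sigma_y) from paired values."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    cov = fmean([(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)])
    var_x = fmean([(x - mean_x) ** 2 for x in xs])
    var_y = fmean([(y - mean_y) ** 2 for y in ys])
    return cov / sqrt(var_x * var_y)

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
rising = [1.0 + 2.0 * x for x in xs]    # y = a + b*x with b > 0
falling = [1.0 - 2.0 * x for x in xs]   # y = a + b*x with b < 0

rho_rising = correlation(xs, rising)
rho_falling = correlation(xs, falling)
```
For b > 0 the result is +1 up to rounding, and for b < 0 it is −1.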
More than Two Variables: Vector and Matrix Notation
The probability density of one of the variables x_r, the marginal probability density, is obtained by integrating the joint probability density over all the other variables (3.6.3). In the same way we can define the joint marginal probability density of a subset of the n variables by integrating only over the remaining variables.
Transformation of Variables
They bound the surface element dA of the transformed variables u, v corresponding to the element dx dy of the original variables. The area dA of the parallelogram is equal to the absolute value of the determinant formed from the components of its two edge vectors.
Linear and Orthogonal Transformations
Error Propagation
Computer Generated Random Numbers
The Monte Carlo Method
Random Numbers
Representation of Numbers in a Computer
If n_m bits are available for the representation of the mantissa (including sign), it can be expressed by an integer. A fixed number of binary digits, corresponding to a fixed number of decimal places, is available for representing the mantissa M.
Linear Congruential Generators
When calculating with floating-point numbers, the concept of the relative precision of the representation is very important. If n binary digits are available for the representation of the mantissa, then one has M ≈ 2^n, since the exponent is chosen so that all n places for the mantissa are fully used.
Multiplicative Linear Congruential Generators
Before giving the theorem about the maximum period length for this case, we introduce the concept of the primitive element modulo m. Theorem about the maximum period of an MLCG: The maximum period of an MLCG defined by the quantities m, a, c = 0, x0 is equal to the order λ(m).
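The role of the primitive element can be seen in a toy example. The sketch below is an illustration only: the modulus m = 31 and the multipliers 3 and 2 are chosen here for readability and are not taken from the text, and the function names are hypothetical. For a prime modulus the maximum period is m − 1; it is reached with a = 3 (a primitive element modulo 31) but not with a = 2.
```python
from math import gcd

def mlcg(m, a, x0):
    """Yield the sequence x_{j+1} = (a * x_j) mod m of a multiplicative LCG (c = 0)."""
    x = x0
    while True:
        x = (a * x) % m
        yield x

def period(m, a, x0):
    """Number of steps until the MLCG sequence returns to its seed x0."""
    if gcd(a, m) != 1 or not 0 < x0 < m:
        # With gcd(a, m) = 1 the map x -> a*x mod m is a bijection,
        # so the sequence is guaranteed to return to x0.
        raise ValueError("need gcd(a, m) = 1 and 0 < x0 < m")
    for count, x in enumerate(mlcg(m, a, x0), start=1):
        if x == x0:
            return count

full_period = period(31, 3, 1)    # a = 3 is a primitive element modulo 31
short_period = period(31, 2, 1)   # 2**5 = 32 = 1 (mod 31), so the cycle is short
```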
Quality of an MLCG: Spectral Test
Because the xj are integers, the spacing between the horizontal or vertical grid lines on which the points (4.5.4) must lie is 1/m. If the distances between neighboring grid lines for all families are approximately equal, then we can be sure that we have a maximally uniform distribution of occupied grid points in the unit square.
Implementation and Portability of an MLCG
We now compute the right side of (4.4.1), omitting the index j and noting that [(x div q) m] mod m = 0, since x div q is an integer. In this way, both terms enclosed in square brackets in the last line of (4.6.4) are less than m, so that the bracketed expression remains in the interval between −m and m. It should be noted that dividing two integer variables directly results in the integer part of the quotient.
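The decomposition described here is commonly known as Schrage's method. The following is a minimal sketch under stated assumptions: it writes m = a·q + r with q = m div a and r = m mod a, requires r < q, and uses the widely quoted values m = 2^31 − 1 and a = 16807 purely as an example (these numbers are not taken from the text). Python integers do not overflow, so the direct product is available as a reference for comparison.
```python
def schrage_mul_mod(a, x, m):
    """Compute (a * x) mod m without forming the full product a * x.

    Uses m = a*q + r with q = m div a, r = m mod a. If r < q and 0 <= x < m,
    both a*(x mod q) and r*(x div q) are smaller than m, so their difference
    lies strictly between -m and m.
    """
    q, r = divmod(m, a)
    if r >= q:
        raise ValueError("the decomposition requires m mod a < m div a")
    t = a * (x % q) - r * (x // q)
    return t if t >= 0 else t + m

# Example values (an assumption for illustration, not from the text).
M = 2**31 - 1
A = 16807

if __name__ == "__main__":
    for x in (1, 2, 12345, M - 1):
        print(x, schrage_mul_mod(A, x, M), (A * x) % M)
```
In a language with fixed-width integers, every intermediate quantity in `schrage_mul_mod` stays below m in magnitude, which is the point of the construction.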
Combination of Several MLCGs
However, the output values are floating point values due to the division by m, and therefore correspond to a uniform distribution between 0 and 1. The plots correspond only to a narrow strip at the left edge of the unit square.
Generation of Arbitrarily Distributed Random Numbers
Generation with the von Neumann Acceptance–Rejection Technique
If random numbers are uniformly distributed in the interval 0 …
Many other procedures have been described in the literature for generating normally distributed random numbers. They are somewhat more efficient, but generally more difficult to program than BOX–MULLER.
Here a is the vector of expectation values and B = C^−1 is the inverse of the positive-definite symmetric covariance matrix C. One obtains vectors x of random numbers by first forming a vector u from elements u_i that follow the standard normal distribution and then performing the transformation.
We expect that when N points are distributed uniformly in the square 0 ≤ y ≤ 1, 0 ≤ u ≤ 1, and n of them lie inside the unit circle, the ratio n/N approaches the value I = π/4 in the limit N → ∞. In fact, we find fluctuations in the three columns at the first, second and third places after the decimal point.
The Monte Carlo Method for Simulation
Java Classes and Example Programs
We define the probability density of the joint normal distribution of the variables x_i. For other values g(x) = const one obtains comparable ellipsoids that lie inside (g < 1) or outside (g > 1) of the covariance ellipsoid.
If the elements of the population can be numbered, it is often convenient to use random numbers to select the elements for the sample. It is clearly an approximation for F(x), the distribution function of the population, which it approaches in the limit n → ∞. Of course, you don't usually calculate the sample mean and variance by hand, but rather with the Sample class and its methods.
The population mean is therefore the average of the individual means of the subpopulations, each weighted by the probability of its subpopulation. It is clear that the arithmetic mean x̄_p cannot in general be an estimator for the population mean, since it depends on the arbitrary choice of the sizes n_i of the samples from the subpopulations.
The characteristic function of the sample mean is given in (6.2.2) in terms of the population characteristic function. To prove the general case, we first find the characteristic function of the χ²-distribution (6.6.13). The factors in (6.7.3) ensure that the y_i have mean zero and standard deviation σ. It can be shown that each additional connection between the squared terms reduces the number of degrees of freedom by one. Here, φ0 and ψ0 are the probability density and distribution function of the standard normal distribution presented in Section 5.8.
This means that we have to construct the inverse function of the Poisson distribution for a fixed k and a given probability P (in our case α/2 and 1−α/2). Example 6.7: Determination of a lower limit for the lifetime of the proton from the observation of no decay. Of particular interest is the average lifetime of the proton, one of the primary building blocks of matter.
In an experiment in which k events are recorded, the number of background events cannot simply be given by (6.9.3), since it is known from the result of the experiment that n_B ≤ k. The distribution (6.9.4) is normalized to unity in the range 0 ≤ n_B ≤ k. If only an upper bound at the confidence level 1−α is desired, it is determined in an analogous way. Table 6.3 contains some numerical values computed by methods of the SmallSample class.
The probability of n_S signal events regardless of the number of background events is given by (6.11.8). As with (6.9.6) and (6.9.7), the limits r_S+ and r_S− of the confidence range for r_S with confidence level β = 1−α can be determined from a corresponding requirement. Table 6.4 contains some numerical values calculated by methods of … r_S− have no meaning. The program calculates the limits r_S−, r_S+ and r_S(up) for the ratio of the numbers of signal and reference events in the limit of a large number of events.
Sample Program 6.4: The E4Sample class demonstrates using the methods of the SmallSample class to compute confidence limits.
Figure 7.1 illustrates the situation for different forms of the probability function for the case with a single parameter λ. We thus see that estimators achieve the minimum variance limit when the probability function is of the special form (7.3.14). For the estimator of the mean, the maximum likelihood method leads to the arithmetic mean of the individual measurements. If T is in region U, the critical region of the test, the hypothesis is rejected.
Generation of Normally Distributed Random Numbers
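The text refers to BOX–MULLER as the reference procedure. A minimal sketch of the basic trigonometric Box–Muller transformation follows; it is an illustration and may differ in detail from the book's own routine. The function names and the use of Python's `random.Random` as the uniform generator are assumptions made here.
```python
import math
import random

def box_muller(rng):
    """Turn two uniform random numbers into two independent standard normal ones."""
    u1 = 1.0 - rng.random()   # lies in (0, 1], so log(u1) is always defined
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)

def normal_sample(n, seed=1):
    """Return n standard normal random numbers generated pairwise by box_muller."""
    rng = random.Random(seed)
    values = []
    while len(values) < n:
        values.extend(box_muller(rng))
    return values[:n]

if __name__ == "__main__":
    xs = normal_sample(10000, seed=42)
    mean = sum(xs) / len(xs)
    variance = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
    print(mean, variance)
```
For a large sample the mean should be close to 0 and the variance close to 1.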
Generation of Random Numbers According to a Multivariate Normal Distribution
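One common way to realize a transformation from a standard normal vector u to a vector x with expectation a and covariance matrix C is sketched below. It is an assumption of this sketch, not necessarily the book's formulation, that the transformation is written as x = a + L·u with the Cholesky factor L of C (C = L·Lᵀ). All function names are hypothetical.
```python
import math
import random

def cholesky(c):
    """Lower-triangular L with L * L^T = c for a symmetric positive-definite matrix c."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(c[i][i] - s)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low

def multivariate_normal(a, c, rng):
    """Draw one vector x = a + L u, where u has independent standard normal elements."""
    low = cholesky(c)
    u = [rng.gauss(0.0, 1.0) for _ in a]
    return [a[i] + sum(low[i][k] * u[k] for k in range(i + 1))
            for i in range(len(a))]

if __name__ == "__main__":
    rng = random.Random(3)
    print(multivariate_normal([1.0, -1.0], [[4.0, 2.0], [2.0, 3.0]], rng))
```
Because cov(x) = L·cov(u)·Lᵀ = L·Lᵀ = C, the generated vectors have the requested covariance matrix.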
The Monte Carlo Method for Integration
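The estimate of I = π/4 from points scattered uniformly over the unit square can be sketched as follows. The function name, the seed handling and the use of Python's `random.Random` are assumptions of this illustration.
```python
import math
import random

def estimate_quarter_circle(n_points, seed=1):
    """Fraction n/N of uniformly distributed points (y, u) that fall inside the unit circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        y = rng.random()
        u = rng.random()
        if y * y + u * u < 1.0:
            inside += 1
    return inside / n_points

if __name__ == "__main__":
    # The statistical error shrinks roughly like 1/sqrt(N).
    for n in (100, 10000, 1000000):
        estimate = estimate_quarter_circle(n, seed=n)
        print(n, estimate, estimate - math.pi / 4)
```
The printed differences shrink slowly with N, in line with the fluctuations described in the text.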
Some Important Distributions and Theorems
Samples
Mean and Variance of a Sample
Graphical Representation of Samples
Histograms and Scatter Plots
Samples from Partitioned Populations
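The difference between the weighted mean over subpopulations and the plain arithmetic mean of all measurements can be seen in a small made-up example. The data and function names below are purely illustrative assumptions.
```python
from statistics import fmean

def weighted_subpopulation_mean(samples, probabilities):
    """Sum of the sample means of the subpopulations, each weighted by its probability."""
    return sum(p * fmean(s) for s, p in zip(samples, probabilities))

def pooled_mean(samples):
    """Arithmetic mean of all measurements, ignoring the partition."""
    return fmean([x for s in samples for x in s])

# Two equally probable subpopulations, sampled with very different sizes n_i.
samples = [[1.0, 1.0, 1.0, 1.0], [3.0]]
probabilities = [0.5, 0.5]

weighted = weighted_subpopulation_mean(samples, probabilities)
pooled = pooled_mean(samples)
```
Here the weighted mean is 2.0, while the pooled mean is 1.4 because the first subpopulation is over-represented in the sample. Changing the sample sizes changes the pooled mean but not the weighted one.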
Samples from Gaussian Distributions: χ²-Distribution
Sampling by Counting: Small Samples
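Inverting the Poisson distribution for fixed k and given probability P can be done numerically. The sketch below is not the book's SmallSample class: it assumes the convention that the cumulative probability is P(n ≤ k; λ), solves P(n ≤ k; λ) = P for λ by bisection, and builds the limits λ− and λ+ from the probabilities 1−α/2 and α/2. All function names are hypothetical.
```python
import math

def poisson_cdf(k, lam):
    """P(n <= k) for a Poisson distribution with parameter lam (0 for k < 0)."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)
    total = term
    for n in range(1, k + 1):
        term *= lam / n
        total += term
    return total

def solve_lambda(k, p, tol=1e-10):
    """Find lam with P(n <= k; lam) = p for 0 < p < 1 by bisection.

    The cumulative probability decreases monotonically from 1 to 0 as lam grows.
    """
    lo, hi = 0.0, 1.0
    while poisson_cdf(k, hi) > p:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if poisson_cdf(k, mid) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def confidence_limits(k, alpha):
    """Central limits (lam_minus, lam_plus) at confidence level 1 - alpha."""
    lam_minus = solve_lambda(k - 1, 1.0 - alpha / 2.0) if k > 0 else 0.0
    lam_plus = solve_lambda(k, alpha / 2.0)
    return lam_minus, lam_plus

def upper_limit(k, alpha):
    """One-sided upper limit for lam at confidence level 1 - alpha."""
    return solve_lambda(k, alpha)

if __name__ == "__main__":
    print(confidence_limits(3, 0.10))
    print(upper_limit(0, 0.05))
```
For k = 0 the one-sided upper limit reduces to λ = −ln α, which is the case relevant when no decay is observed.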
Small Samples with Background
Determining a Ratio of Small Numbers of Events
Ratio of Small Numbers of Events with Background
Java Classes and Example Programs
The Method of Maximum Likelihood
Confidence Intervals