Mathematical Foundations of Computer Networking

These chapters provide the basis for the four theories covered in the last half of the book: queuing theory, game theory, control theory and information theory.

Probability

Introduction

Outcomes
Events
Disjunctions and conjunctions of events
Axioms of probability
Subjective and objective probability

Continuing with example 1, we define the mutually exclusive events {1,2} and {3,4} that both have a probability of 1/3. The axiomatic approach does not care how the probability of an event is determined.

Joint and conditional probability

Joint probability
Conditional probability
Bayes’ rule

Let P(UDP) denote the probability that the packet is of type UDP and let P(52) denote the probability that the packet is of length 52 bytes. This allows us to calculate the probability of any one of the priors Ei, conditional on the occurrence of the posterior F.
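As a minimal sketch of how Bayes' rule turns priors into posteriors, the Python function below computes P(Ei | F) from P(Ei) and P(F | Ei). The numbers, and the choice of events (E1 = "packet is UDP", E2 = "packet is not UDP", F = "length is 52 bytes"), are hypothetical and are not taken from the text.

```python
def posterior(priors, likelihoods):
    """Return P(E_i | F) for each i, given P(E_i) and P(F | E_i).

    Assumes the events E_i are mutually exclusive and exhaustive.
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(F and E_i)
    total = sum(joint)  # P(F), by the law of total probability
    return [j / total for j in joint]


# Hypothetical values, for illustration only.
priors = [0.2, 0.8]        # P(UDP), P(not UDP)
likelihoods = [0.1, 0.5]   # P(52 | UDP), P(52 | not UDP)
print(posterior(priors, likelihoods))
```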

Random variables

Distribution
Cumulative density function
Generating values from an arbitrary distribution
Expectation of a random variable
Variance of a random variable

We can use a similar approach to generate values from a continuous random variable with the associated density function f(Xc). Intuitively, the expected value of a random variable is the value we expect it to take, knowing nothing else about it.

Figure 2. Generating values from an arbitrary (a) discrete or (b) continuous distribution.
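One common way to generate values from an arbitrary distribution is to draw a uniform number u in [0, 1) and look up the value whose cumulative probability first reaches u. The sketch below assumes this inverse-transform approach. The function names are hypothetical, and the exponential distribution stands in for the continuous case only because its cumulative function can be inverted in closed form.

```python
import bisect
import math
import random


def sample_discrete(values, probs, u=None):
    """Draw one value from a discrete distribution by inverting its CDF."""
    u = random.random() if u is None else u
    cdf = []
    running = 0.0
    for p in probs:
        running += p
        cdf.append(running)
    # First index whose cumulative probability is at least u.
    idx = bisect.bisect_left(cdf, u)
    return values[min(idx, len(values) - 1)]  # guard against rounding


def sample_exponential(rate, u=None):
    """Draw one value from an exponential distribution.

    Solves F(x) = 1 - exp(-rate * x) = u for x.
    """
    u = random.random() if u is None else u
    return -math.log(1.0 - u) / rate


print(sample_discrete(["a", "b", "c"], [0.2, 0.5, 0.3]))
print(sample_exponential(rate=2.0))
```

Passing u explicitly makes the functions deterministic, which is convenient for checking them by hand.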

Moments and moment generating functions

Moments
Moment generating functions
Properties of moment generating functions

Thus, the MGF represents all moments of the random variable X in a single compact expression. If the random variable X has an MGF M(t), then the MGF of the random variable Y = a + bX is e^(at) M(bt).
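The scaling property follows directly from the definition of the MGF as an expectation. The short derivation below writes M_Y for the MGF of Y, a label introduced here for clarity.

```latex
\begin{aligned}
M_Y(t) &= E\left[e^{tY}\right] = E\left[e^{t(a + bX)}\right] \\
       &= e^{at}\, E\left[e^{(bt)X}\right] \\
       &= e^{at}\, M(bt)
\end{aligned}
```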

Standard discrete distributions

Bernoulli distribution
Binomial distribution
Geometric distribution
Poisson distribution

What is the probability that we will see at least one packet on the link during any one-second interval? The probability that we see at least one packet on the link during a one-second interval is therefore:

Figure 3. Example Binomial distribution.

Standard continuous distributions

Uniform distribution
Gaussian or Normal distribution
Exponential distribution
Power law distribution

The expected value of a Gaussian random variable with parameters μ and σ² is μ and its variance is σ². Suppose the time taken by a teller at a bank is an exponentially distributed random variable with an expected value of one minute.

Figure 4. Gaussian distributions for different values of the mean and variance.

Useful theorems

Markov’s inequality
Chebyshev’s inequality
Chernoff bound
Strong law of large numbers
Central limit theorem

Assume that the probability of success of each iteration is independent of the others (this is critical!). First, we calculate the MGF of the sum of n random variables in terms of the MGFs of each of the random variables.

Jointly distributed random variables

Bayesian networks

This greatly reduces the amount of information required to describe the joint probability distribution of the random variables. The joint distribution of the random variables (L, A, D, T, R) will assign a probability to each possible combination of the variables, such as p(packet loss AND no ack loss AND no duplicate ack AND timeout AND no retransmission).

Figure 10. A Bayesian network to represent TCP retransmissions

Exercises

Assuming that outgoing calls are independent and that a guest room has 10 minutes of outgoing calls during the busiest hour of the day, what is the probability that five calls are active at the same time during the busiest hour? You are told that the time between meteors is exponentially distributed, with an average of 200 seconds.

Statistics

Sampling a population

Types of sampling
Scales
Outliers

Targeted: Here, the idea is to sample only elements that meet a specific definition of the population.
Convenience: A convenience sample involves studying those elements of the population that happen to be conveniently available.

Describing a sample parsimoniously

Tables
Bar graphs, histograms, and cumulative histograms
The sample mean
The sample median
Measures of variability

The variance of the sample mean (i.e., the variance of the sampling distribution of the mean) can be calculated as follows. Therefore, the variance of the sample mean is 1/n of the population variance.
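A sketch of the calculation, assuming the n sample elements X_1, ..., X_n are independent draws from a population with variance σ² (the symbols are introduced here for illustration):

```latex
\begin{aligned}
\operatorname{Var}\left(\bar{X}\right)
  &= \operatorname{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i) \\
  &= \frac{1}{n^2}\, n\,\sigma^2
   = \frac{\sigma^2}{n}
\end{aligned}
```

The second equality uses independence: the variance of a sum of independent random variables is the sum of their variances.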

Inferring population parameters from sample parameters

The variance of the sampling distribution (i.e., of the sample mean X̄) is σ²/n, so it has a narrower spread than the population (with the spread decreasing as we increase the number of elements in the sample). We can obtain the appropriate confidence intervals for the population variance by studying the sampling distribution of the variance.

FIGURE 4. Population and sample mean distributions

Testing hypotheses about outcomes of experiments

Hypothesis testing
Errors in hypothesis testing
Formulating a hypothesis
Comparing an outcome with a fixed quantity
Comparing outcomes from two experiments
Testing hypotheses regarding quantities measured on ordinal scales
Fitting a distribution
Power

The power of a statistical test is the probability that the test rejects a null hypothesis when it is in fact false. Then the probability that we reject the null hypothesis (which is false) is essentially the same as the significance level (why?).

Independence and dependence: regression, and correlation

Independence
Regression
Correlation

We calculate the chi-square statistic as the sum, over all categories, of the quantity (observed value - expected value)^2/(expected value). It is informative to look at the scatterplot of the data, shown in Figure 5(a).

FIGURE 5. Scatter plot of loss rate vs. buffer size
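The chi-square statistic described above can be computed in a few lines. The following is a minimal sketch; the function name and the observed and expected counts are hypothetical and are not data from the text.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts, for illustration only.
observed = [18, 22, 30, 30]
expected = [25, 25, 25, 25]
print(chi_square_statistic(observed, expected))
```

The resulting value would then be compared with the critical value of the chi-square distribution for the appropriate number of degrees of freedom and significance level.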

Comparing multiple outcomes simultaneously: analysis of variance

One-way layout
Multi-way layouts

Under the null hypothesis, SSB/(I-1) is also an unbiased estimator of the population variance. For example, to quantify the degree of effect, we can calculate the regression of the observed effect as a function of the treatment.

Design of experiments

For example, we might want to study the joint effect of buffer size and traffic load on the loss rate. The details of this so-called two-sided representation are beyond the scope of this text.

Dealing with large data sets

Note that to perform repeated joins, we need to define the distance between a point and a cluster and between two clusters. The distance between a point and a cluster can be defined either as the distance from that point to the nearest point in the cluster or as the average of all the distances from that point to all the points in the cluster.
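The two point-to-cluster distances just described can be sketched as follows. This assumes points are tuples of coordinates and that Euclidean distance is the underlying metric, which the text does not specify; the function names are hypothetical.

```python
import math


def nearest_point_distance(point, cluster):
    """Distance from the point to the closest point in the cluster."""
    return min(math.dist(point, q) for q in cluster)


def average_distance(point, cluster):
    """Average of the distances from the point to every point in the cluster."""
    return sum(math.dist(point, q) for q in cluster) / len(cluster)


cluster = [(0.0, 0.0), (3.0, 0.0)]
point = (3.0, 4.0)
print(nearest_point_distance(point, cluster))
print(average_distance(point, cluster))
```

The nearest-point definition tends to merge elongated clusters, whereas the average definition is less sensitive to a single close neighbor.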

Common mistakes in statistical analysis

What is the population?
Lack of confidence intervals in comparing results
Not stating the null hypothesis
Too small a sample
Too large a sample
Not controlling all variables when collecting observations
Converting ordinal to interval scales
Ignoring outliers

Remember that we can only reject or not reject the null hypothesis from observational data. If the sample size is too large, a sample that deviates even slightly from the null hypothesis will cause the null hypothesis to be rejected.

Exercises

If the number of peers were independent of uplink capacity, what is the expected value of the number of peers for a given uplink capacity? Using the chi-square test, can we conclude that the number of peers is independent of uplink performance at the 95% and 99.9% confidence levels?

Linear Algebra

Vectors and matrices

Unlike a vector, whose elements may not be related, elements in the same column of a matrix are usually related to each other. Existence of distinct additive and multiplicative identity elements in the set: There are distinct elements denoted by "0" and "1".

Vector and matrix algebra

Addition
Transpose
Multiplication
Square matrices
Exponentiation
Matrix exponential

The product of two vectors can be defined as either a dot product or a cross product. Therefore, the product of an n-dimensional row vector (a matrix of size 1 × n) with an n × n matrix is a row vector of dimension n.

Linear combinations, independence, basis, and dimension

Linear combinations
Linear independence
Vector spaces, basis, and dimension

Note that if a set of vectors is not linearly independent, each of them can be rewritten in terms of the others (why?). What is the set of vectors that can be created as linear combinations of this set?

Solving linear equations using matrix algebra

Representation
Elementary row operations and Gaussian elimination
Rank
Determinants
Cramer’s theorem
The inverse of a matrix

This vector can be expressed as a linear combination of the basis set as x = a x_1 + b x_2 + c x_3. Note that the rank of a matrix is equal to the cardinality of the basis set of the corresponding set of row vectors.

Linear transformations, eigenvalues and eigenvectors

A matrix as a linear transformation
The eigenvalue of a matrix
Computing the eigenvalues of a matrix
Why are eigenvalues important?
The role of the principal eigenvalue
Finding eigenvalues and eigenvectors
Similarity and diagonalization

As we have seen, the eigenvalues of the matrix are the roots of the characteristic polynomial. Then it can be easily shown, by expanding the underlying terms, that the matrix.

FIGURE 1. The effect of a matrix on an eigenvector is to scale it, preserving its direction.

Stochastic matrices

Computing state transitions using a stochastic matrix

We have combined these results to argue that the power method can be used to calculate the stationary probability distribution of a stochastic matrix. The power technique for finding the dominant eigenvector of a stochastic matrix can be used to rank a series of web pages.

FIGURE 2. Largest possible eigenvalue of a stochastic matrix
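As a minimal sketch of the power method applied to a stochastic matrix, the code below repeatedly multiplies a probability row vector by the transition matrix until the vector stops changing. It assumes a row-stochastic matrix (each row sums to 1) for which the iteration converges; the function name and the example matrix are hypothetical.

```python
def power_method(P, pi0, tol=1e-12, max_iter=10_000):
    """Approximate the stationary distribution pi satisfying pi = pi P.

    P is a row-stochastic matrix given as a list of rows; pi0 is the
    initial probability distribution over states.
    """
    pi = list(pi0)
    n = len(pi)
    for _ in range(max_iter):
        # One step of the chain: nxt[j] = sum_i pi[i] * P[i][j].
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical two-state chain, for illustration only.
P = [[0.9, 0.1],
     [0.5, 0.5]]
print(power_method(P, [0.5, 0.5]))
```

For this example the balance condition 0.1 * pi[0] = 0.5 * pi[1] gives the stationary distribution (5/6, 1/6), which the iteration approaches from any initial distribution.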

Exercises

Use the power method to calculate the dominant eigenvalue and the corresponding eigenvector of the matrix. If it is known that the initial state is state 1 with probability 0.5 and state 2 with probability 0.5, calculate the probability of being in these two states after two time steps.

Optimization

System modelling and optimization
An introduction to optimization
Optimizing linear systems

Network flow

Integer linear programming

Total unimodularity
Weighted bipartite matching

Dynamic programming
Nonlinear constrained optimization

Lagrangian techniques
Karush-Kuhn-Tucker conditions for nonlinear optimization

Heuristic non-linear optimization

Hill climbing
Genetic algorithms

Exercises

The optimal value of O is reached for the values of the x_i corresponding to the optimal peak. We want to find the set of tuples of the form (x,y) that maximize f(x,y) subject to the constraint g(x,y) = c.

FIGURE 1. The objective function for Example 1. The maximum occurs when the control variable is set to 100 packets/s.
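The constrained problem of maximizing f(x,y) subject to g(x,y) = c is typically attacked with a Lagrangian. The sketch below uses one common sign convention, which is an assumption here and may differ from the one used in the book; the symbols F and λ are introduced for illustration.

```latex
\begin{aligned}
F(x, y, \lambda) &= f(x, y) - \lambda \left( g(x, y) - c \right) \\
\frac{\partial F}{\partial x} = 0 &\;\Rightarrow\; \frac{\partial f}{\partial x} = \lambda \frac{\partial g}{\partial x} \\
\frac{\partial F}{\partial y} = 0 &\;\Rightarrow\; \frac{\partial f}{\partial y} = \lambda \frac{\partial g}{\partial y} \\
\frac{\partial F}{\partial \lambda} = 0 &\;\Rightarrow\; g(x, y) = c
\end{aligned}
```

Solving these three equations yields the candidate points (x,y); each candidate must still be checked to confirm that it is a maximum rather than a minimum or a saddle point.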

Signals, Systems, and Transforms

Introduction

Background

Sinusoids
Complex numbers
Euler’s formula
Discrete-time convolution and the impulse function
Continuous-time convolution and the Dirac delta function

Given the importance of rotating vectors (also called phasors), it is desirable to compactly display the current vector position on a unit disk. Note that each convolution value of x(t) and y(t) (i.e., at time t) is the result of summing all product pairs and.

FIGURE 1. A typical carrier wave described by a sinusoidal curve
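Discrete-time convolution of two finite sequences can be sketched as a double loop that accumulates every product pair into the output index given by the sum of the input indices. The function name is hypothetical, and both sequences are assumed to start at index 0.

```python
def convolve(x, y):
    """Discrete-time convolution of two finite sequences.

    out[n] is the sum of x[i] * y[n - i] over all valid i.
    """
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


print(convolve([1, 2, 3], [1, 1]))
print(convolve([1, 2, 3], [1]))  # convolving with a unit impulse returns the signal
```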

Signals

The complex exponential signal

A signal that often appears in the study of transformations is the complex exponential signal, denoted ke^(st), where k is a real number and s is a complex quantity that can be written as s = σ + jω. It is worth studying these numbers carefully because they provide deep insight into the nature of a complex exponential that will greatly help in understanding the nature of transformations.

Systems

Types of systems

As discussed in Section 8.3.2 on page 226, in the context of control theory, the transfer function is more accurately described as the Laplace transform of the system's impulse response. At this stage, however, this loose but intuitively appealing description of the transfer function will suffice.

Analysis of a linear time-invariant system

The effect of an LTI system on a complex exponential input
The output of an LTI system with a zero input
The output of an LTI system for an arbitrary input
Stability of an LTI system

Therefore, the response of the system is its natural response (except at time 0 itself). Since the system is linear, the response of the system to a scaled impulse will be a scaled output, so that.

Transforms

Consider the complex exponential which represents a solution to the characteristic equation of an LTI system (Equation 23). Finally, if all the values of σ are 0, then the behavior of the system depends on whether there are repeated roots.

The Fourier series

Solution: The kth coefficient of the Fourier series corresponding to this function is given by. Note that the coefficients ck are real functions of τ (not t), which is a parameter of the input signal.

FIGURE 9. An infinite series of rectangular pulses

The Fourier Transform

Properties of the Fourier transform

Calculate the Fourier transform of a single rectangular pulse of height 1 and width centered on the origin (Figure 11). Calculate the Fourier transform of a rectangular pulse as in Figure 11 but with a pulse width of.

FIGURE 11. A single pulse and its Fourier transform
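The transform of a rectangular pulse can be checked numerically. The sketch below assumes the convention X(ω) = ∫ x(t) e^(-jωt) dt and calls the pulse width tau, a name chosen here because the symbol is missing from the text above. For a pulse of height 1 the integral evaluates to 2 sin(ω tau / 2) / ω, which the code compares against a midpoint-rule approximation.

```python
import cmath
import math


def pulse_transform_numeric(omega, tau, steps=2000):
    """Midpoint-rule approximation of the integral of e^(-j*omega*t)
    over the interval [-tau/2, tau/2]."""
    dt = tau / steps
    total = 0j
    for k in range(steps):
        t = -tau / 2 + (k + 0.5) * dt
        total += cmath.exp(-1j * omega * t) * dt
    return total


def pulse_transform_closed_form(omega, tau):
    """Closed form 2 sin(omega * tau / 2) / omega, with value tau at omega = 0."""
    if omega == 0:
        return tau
    return 2.0 * math.sin(omega * tau / 2.0) / omega


print(pulse_transform_numeric(3.0, 2.0))
print(pulse_transform_closed_form(3.0, 2.0))
```

Because the pulse is real and even, the imaginary part of the numerical result is approximately zero.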

The Laplace Transform

Poles, Zeroes, and the Region of convergence
Properties of the Laplace transform

This is called the pole of the system (the pole in the previous example was at 0). If the region of convergence of the Laplace transform of a signal includes the imaginary axis, then the Fourier transform of the signal is defined and can be obtained by setting s = jω.

TABLE 3. Properties of the Laplace transform

The Discrete Fourier Transform and Fast Fourier Transform

The impulse train
The discrete-time Fourier transform
Aliasing
The Fast Fourier Transform

Due to the linearity of the Fourier transform, we can simply calculate the transform of a single value. This defines the discrete-time Fourier transform of the discrete function x[nT] and is denoted.

FIGURE 12. The effect of sampling on the Fourier transform of a signal

The Z Transform

Relationship between Z and Laplace transform
Properties of the Z transform

We now discuss how the z-auxiliary variable of the Z-transform relates to the s-auxiliary variable used in the Laplace transform. Note that the vertical line marked with a diamond lies in the left half-plane of the s-plane.

FIGURE 14. Mapping from the s plane to the z plane
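The mapping between the two planes can be explored numerically. The sketch below assumes the standard relationship z = e^(sT), where T is the sampling period; the function name and the sample points are hypothetical. Since |z| = e^(σT), points in the left half of the s plane land inside the unit circle, points on the imaginary axis land on it, and points in the right half land outside it.

```python
import cmath


def s_to_z(s, T=1.0):
    """Map a point in the s plane to the z plane, assuming z = e^(sT)."""
    return cmath.exp(s * T)


for s in (complex(-1, 2), complex(0, 2), complex(1, 2)):
    z = s_to_z(s)
    print(s, "->", z, "magnitude", abs(z))
```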

Stochastic Processes and Queueing Theory

Overview

A general queueing system
Little’s theorem

Stochastic processes

Discrete and continuous stochastic processes
Markov processes
Homogeneity, state transition diagrams, and the Chapman-Kolmogorov equations
Irreducibility
Recurrence
Periodicity
Ergodicity
A fundamental theorem
Stationary (equilibrium) probability of a Markov chain
A second fundamental theorem

In this case, the family of random variables corresponding to the stochastic process consists of the variables X(t1), X(t2), and so on. Therefore, the random variable corresponding to the state of the process can take real values, and the corresponding stochastic process would be a continuous space process.

FIGURE 15. State transition diagram for Example 11

In an irreducible and aperiodic homogeneous Markov chain, the limiting probability distribution

Mean residence time in a state
Continuous-time Markov Chains

Markov property for continuous-time stochastic processes
Residence time in a continuous-time Markov chain
Stationary probability distribution for a continuous-time Markov chain

Birth-Death processes

Time-evolution of a birth-death process
Stationary probability distribution of a birth-death process
Finding the transition-rate matrix
A pure-birth (Poisson) process
Stationary probability distribution for a birth-death process

The M/M/1 queue
Two variations on the M/M/1 queue

The M/M/∞ queue: a responsive server
M/M/1/K: bounded buffers

We thus obtain the long-run probabilities of being in any state j as a function of the probability of being in state 0 and the system parameters. Note that the long-run probability that the population size is j depends only on the utilization of the system.
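For the M/M/1 queue, the standard result is that the long-run probability of finding j customers in the system is (1 - ρ) ρ^j, where ρ is the utilization (arrival rate divided by service rate). The sketch below assumes this formula and ρ < 1; the function and parameter names are hypothetical.

```python
def mm1_stationary_probability(j, arrival_rate, service_rate):
    """Long-run probability of j customers in an M/M/1 queue."""
    rho = arrival_rate / service_rate  # utilization
    if not 0 <= rho < 1:
        raise ValueError("the queue is stable only when utilization is below 1")
    return (1 - rho) * rho ** j


# Hypothetical rates: one arrival per second, two services per second.
for j in range(4):
    print(j, mm1_stationary_probability(j, arrival_rate=1.0, service_rate=2.0))
```

Only the ratio of the two rates matters: doubling both leaves every probability unchanged.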
