3.4 Properties of Linear Functions of Random Vectors
Note that $\hat{\beta}$, $\hat{Y}$, and $e$ are random vectors because they are functions of the random vector $Y$. In the previous sections, these vectors were expressed as linear functions $AY$ of $Y$. The matrix $A$ is

• $(X'X)^{-1}X'$ for $\hat{\beta}$,
• $P$ for $\hat{Y}$, and
• $(I - P)$ for $e$.

Before studying the properties of $\hat{\beta}$, $\hat{Y}$, and $e$, it is useful to study the general properties of linear functions of random vectors.
Let $Z = (z_1\ \cdots\ z_n)'$ be a random vector consisting of random variables $z_1, z_2, \ldots, z_n$. The mean $\mu_Z$ of the random vector $Z$ is defined as an $n \times 1$ vector with the $i$th coordinate given by $E(z_i)$. The variance–covariance matrix $V_Z$ for $Z$ is defined as an $n \times n$ symmetric matrix with the diagonal elements equal to the variances of the random variables (in order) and the $(i,j)$th off-diagonal element equal to the covariance between $z_i$ and $z_j$.
For example, if $Z$ is a $3 \times 1$ vector of random variables $z_1$, $z_2$, and $z_3$, then the mean vector of $Z$ is the $3 \times 1$ vector
$$E(Z) = \begin{pmatrix} E(z_1) \\ E(z_2) \\ E(z_3) \end{pmatrix} = \mu_Z = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{pmatrix} \tag{3.15}$$
and the variance–covariance matrix is the $3 \times 3$ matrix
$$\operatorname{Var}(Z) = \begin{pmatrix} \operatorname{Var}(z_1) & \operatorname{Cov}(z_1, z_2) & \operatorname{Cov}(z_1, z_3) \\ \operatorname{Cov}(z_2, z_1) & \operatorname{Var}(z_2) & \operatorname{Cov}(z_2, z_3) \\ \operatorname{Cov}(z_3, z_1) & \operatorname{Cov}(z_3, z_2) & \operatorname{Var}(z_3) \end{pmatrix} = V_Z \tag{3.16}$$
$$= \begin{pmatrix} E[(z_1-\mu_1)^2] & E[(z_1-\mu_1)(z_2-\mu_2)] & E[(z_1-\mu_1)(z_3-\mu_3)] \\ E[(z_2-\mu_2)(z_1-\mu_1)] & E[(z_2-\mu_2)^2] & E[(z_2-\mu_2)(z_3-\mu_3)] \\ E[(z_3-\mu_3)(z_1-\mu_1)] & E[(z_3-\mu_3)(z_2-\mu_2)] & E[(z_3-\mu_3)^2] \end{pmatrix}$$
$$= E\{[Z - E(Z)][Z - E(Z)]'\}. \tag{3.17}$$
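As an illustration (not from the text), the following NumPy sketch estimates the mean vector $E(Z)$ and the variance–covariance matrix of equation (3.17) from simulated draws; the particular mean vector and covariance matrix are arbitrary choices for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary population mean vector and covariance matrix for a 3x1 Z
mu_z = np.array([1.0, 2.0, 3.0])
V_z = np.array([[2.0, 0.5, 0.3],
                [0.5, 1.0, 0.2],
                [0.3, 0.2, 1.5]])

# Simulate many independent realizations of Z (each row is one draw)
Z = rng.multivariate_normal(mu_z, V_z, size=200_000)

# Sample analogues of E(Z) and Var(Z) = E{[Z - E(Z)][Z - E(Z)]'}
print(Z.mean(axis=0))           # approximates mu_z
print(np.cov(Z, rowvar=False))  # approximates V_z
```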
Let $Z$ be an $n \times 1$ random vector with mean $\mu_Z$ and variance–covariance matrix $V_Z$. Let
$$A = \begin{pmatrix} a_1' \\ a_2' \\ \vdots \\ a_k' \end{pmatrix}$$
be a $k \times n$ matrix of constants. Consider the linear transformation $U = AZ$. That is, $U$ is a $k \times 1$ vector given by
$$U = \begin{pmatrix} a_1'Z \\ a_2'Z \\ \vdots \\ a_k'Z \end{pmatrix} = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_k \end{pmatrix}. \tag{3.18}$$
Note that
$$\begin{aligned} E(u_i) &= E(a_i'Z) \\ &= E[a_{i1}z_1 + a_{i2}z_2 + \cdots + a_{in}z_n] \\ &= a_{i1}E(z_1) + a_{i2}E(z_2) + \cdots + a_{in}E(z_n) \\ &= a_i'\mu_Z, \end{aligned}$$
and hence
$$E[U] = \begin{pmatrix} E(u_1) \\ E(u_2) \\ \vdots \\ E(u_k) \end{pmatrix} = \begin{pmatrix} a_1'\mu_Z \\ a_2'\mu_Z \\ \vdots \\ a_k'\mu_Z \end{pmatrix} = A\mu_Z. \tag{3.19}$$
The $k \times k$ variance–covariance matrix for $U$ is given by
$$\operatorname{Var}(U) = V_U = E\{[U - E(U)][U - E(U)]'\}.$$
Substitution of $AZ$ for $U$ and factoring gives
$$\begin{aligned} V_U &= E\{[AZ - A\mu_Z][AZ - A\mu_Z]'\} \\ &= E\{A[Z - \mu_Z][Z - \mu_Z]'A'\} \\ &= A\,E\{[Z - \mu_Z][Z - \mu_Z]'\}A' \\ &= A[\operatorname{Var}(Z)]A' \\ &= AV_ZA'. \end{aligned} \tag{3.20}$$
The factoring of matrix products must be done carefully; remember that matrix multiplication is not commutative. Therefore, $A$ is factored to the left (from the first quantity in square brackets) and $A'$ to the right (from the transpose of the second quantity in square brackets). Remember that transposing a product reverses the order of multiplication: $(CD)' = D'C'$. Since $A$ is a matrix of constants, it can be factored outside the expectation operator. This leaves an inner matrix that, by definition, is $\operatorname{Var}(Z)$.
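A short simulation can make the result in equation (3.20) concrete. This sketch (with an arbitrarily chosen $\mu_Z$, $V_Z$, and $A$) checks numerically that $E(U) = A\mu_Z$ and $\operatorname{Var}(U) = AV_ZA'$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative (arbitrary) mean and covariance for a 3x1 Z
mu_z = np.array([1.0, 2.0, 3.0])
V_z = np.array([[2.0, 0.5, 0.3],
                [0.5, 1.0, 0.2],
                [0.3, 0.2, 1.5]])

# A 2x3 matrix of constants defining the linear transformation U = AZ
A = np.array([[1.0, -1.0, 0.0],
              [0.5,  0.5, 1.0]])

Z = rng.multivariate_normal(mu_z, V_z, size=200_000)
U = Z @ A.T  # each row is one realization of U = AZ

# Theoretical moments versus simulated moments
print(A @ mu_z, U.mean(axis=0))   # E(U) = A mu_z
print(A @ V_z @ A.T)              # Var(U) = A V_z A'
print(np.cov(U, rowvar=False))    # simulated Var(U)
```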
Note that, if $\operatorname{Var}(Z) = \sigma^2 I$, then
$$\begin{aligned} \operatorname{Var}(U) &= A[\sigma^2 I]A' \\ &= AA'\sigma^2. \end{aligned} \tag{3.21}$$
The $i$th diagonal element of $AA'$ is the sum of squares of the coefficients ($a_i'a_i$) of the $i$th linear function $u_i = a_i'Z$. This coefficient multiplied by $\sigma^2$ gives the variance of the $i$th linear function. The $(i,j)$th off-diagonal element is the sum of products of the coefficients ($a_i'a_j$) of the $i$th and $j$th linear functions and, when multiplied by $\sigma^2$, gives the covariance between the two linear functions $u_i = a_i'Z$ and $u_j = a_j'Z$.
Note that if $A$ is just a row vector $a'$, then $u = a'Z$ is a scalar linear function of $Z$. The variance of $u$ is expressed in terms of $\operatorname{Var}(Z)$ as
$$\sigma^2(u) = a'\operatorname{Var}(Z)a. \tag{3.22}$$
If $\operatorname{Var}(Z) = I\sigma^2$, then
$$\sigma^2(u) = a'(I\sigma^2)a = a'a\,\sigma^2. \tag{3.23}$$
Notice that $a'a = \sum a_i^2$ is the sum of squares of the coefficients of the linear function, which is the result given in Section 1.5.
Two examples illustrate the derivation of variances of linear functions using the preceding important results.
Example 3.5. Matrix notation is used to derive the familiar expectation and variance of a sample mean. Suppose $Y_1, Y_2, \ldots, Y_n$ are independent random variables with mean $\mu$ and variance $\sigma^2$. Then, for $Y = (Y_1\ Y_2\ \cdots\ Y_n)'$,
$$E(Y) = \begin{pmatrix} \mu \\ \mu \\ \vdots \\ \mu \end{pmatrix} = \mu\mathbf{1}$$
and $\operatorname{Var}(Y) = I\sigma^2$.

The mean of a sample of $n$ observations, $\overline{Y} = \sum Y_i/n$, is written in matrix notation as
$$\overline{Y} = \begin{pmatrix} \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix} Y. \tag{3.24}$$
Thus, $\overline{Y}$ is a linear function of $Y$ with the vector of coefficients $a' = (\frac{1}{n}\ \frac{1}{n}\ \cdots\ \frac{1}{n})$. Then,
$$E(\overline{Y}) = a'E(Y) = a'\mathbf{1}\mu = \mu \tag{3.25}$$
and
$$\operatorname{Var}(\overline{Y}) = a'[\operatorname{Var}(Y)]a = a'(I\sigma^2)a = n\left(\frac{1}{n}\right)^2\sigma^2 = \frac{\sigma^2}{n}. \tag{3.26}$$
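A minimal sketch, with $n$, $\mu$, and $\sigma^2$ chosen arbitrarily, confirms equation (3.26) both by evaluating the quadratic form $a'(I\sigma^2)a$ and by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(2)
n, mu, sigma2 = 10, 5.0, 4.0

# Coefficient vector a' = (1/n ... 1/n), so Ybar = a'Y
a = np.full(n, 1.0 / n)

# Var(Ybar) = a'(I sigma^2) a = sigma^2 / n, per equation (3.26)
print(a @ (np.eye(n) * sigma2) @ a)  # 0.4 = sigma^2 / n

# Monte Carlo check: variance of the sample mean across many samples
Y = rng.normal(mu, np.sqrt(sigma2), size=(100_000, n))
print(Y.mean(axis=1).var())
```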
Example 3.6. For the second example, consider two linear contrasts on a set of four treatment means with $n$ observations in each mean. The random vector in this case is the vector of the four treatment means. If the means have been computed from random samples from four populations with means $\mu_1$, $\mu_2$, $\mu_3$, and $\mu_4$ and equal variance $\sigma^2$, then the variance of each sample mean will be $\sigma^2/n$ (equation 3.26), and all covariances between the means will be zero. The mean of the vector of sample means $\overline{Y} = (\overline{Y}_1\ \overline{Y}_2\ \overline{Y}_3\ \overline{Y}_4)'$ is $\mu = (\mu_1\ \mu_2\ \mu_3\ \mu_4)'$. The variance–covariance matrix for the vector of means $\overline{Y}$ is $\operatorname{Var}(\overline{Y}) = I(\sigma^2/n)$. Assume that the two linear contrasts of interest are
$$c_1 = \overline{Y}_1 - \overline{Y}_2 \quad \text{and} \quad c_2 = \overline{Y}_1 - 2\overline{Y}_2 + \overline{Y}_3.$$
Notice that $\overline{Y}_4$ is not involved in these contrasts. The contrasts can be written as
$$C = A\overline{Y}, \tag{3.27}$$
where
$$C = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \quad \text{and} \quad A = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 1 & -2 & 1 & 0 \end{pmatrix}.$$
Then,
$$E(C) = A\,E(\overline{Y}) = A\mu = \begin{pmatrix} \mu_1 - \mu_2 \\ \mu_1 - 2\mu_2 + \mu_3 \end{pmatrix} \tag{3.28}$$
and
$$\begin{aligned} \operatorname{Var}(C) &= A[\operatorname{Var}(\overline{Y})]A' = A\left(I\,\frac{\sigma^2}{n}\right)A' \\ &= AA'\,\frac{\sigma^2}{n} \\ &= \begin{pmatrix} 2 & 3 \\ 3 & 6 \end{pmatrix}\frac{\sigma^2}{n}. \end{aligned} \tag{3.29}$$
Thus, the variance of $c_1$ is $2\sigma^2/n$, the variance of $c_2$ is $6\sigma^2/n$, and the covariance between the two contrasts is $3\sigma^2/n$.
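The multiplier matrix $AA'$ in equation (3.29) is easy to reproduce; this sketch uses the contrast matrix from Example 3.6:

```python
import numpy as np

# Contrast matrix from Example 3.6: rows give c1 = Ybar1 - Ybar2
# and c2 = Ybar1 - 2*Ybar2 + Ybar3 (Ybar4 has zero coefficient)
A = np.array([[1.0, -1.0, 0.0, 0.0],
              [1.0, -2.0, 1.0, 0.0]])

# Var(C) = AA' (sigma^2/n); AA' gives the multipliers of sigma^2/n
print(A @ A.T)
# [[2. 3.]
#  [3. 6.]]
```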
We now develop the multivariate normal distribution and present some properties of multivariate normal random vectors. We first define a multivariate normal random vector when the elements of the vector are mutually independent. We then extend the results to normal random vectors with a nonzero mean and a variance–covariance matrix that is not necessarily diagonal. Finally, we present a result for linear functions of normal random vectors.
Suppose $z_1, z_2, \ldots, z_n$ are independent normal random variables with mean zero and variance $\sigma^2$. Then, the random vector $Z = (z_1\ \cdots\ z_n)'$ is said to have a multivariate normal distribution with mean $\mathbf{0} = (0\ \cdots\ 0)'$ and variance–covariance matrix $V_Z = I\sigma^2$. This is denoted as
$$Z \sim N(\mathbf{0}, I\sigma^2).$$
The probability density function of $Z$ is given in equation (3.3) and can also be expressed as
$$(2\pi)^{-n/2}\,|I\sigma^2|^{-1/2}\,e^{-Z'(I\sigma^2)^{-1}Z/2}. \tag{3.30}$$
It is a general result that if $U$ is any linear function $U = AZ + b$, where $A$ is a $k \times n$ matrix of constants and $b$ is a $k \times 1$ vector of constants, then $U$ is itself normally distributed with mean $\mu_U = b$ (since $E(Z) = \mathbf{0}$) and variance–covariance matrix $\operatorname{Var}(U) = V_U = AA'\sigma^2$ (Searle, 1971). That is, the random vector $U$ has a multivariate normal distribution, denoted by
$$U \sim N(\mu_U, V_U). \tag{3.31}$$
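A simulation sketch (with an arbitrary $A$, $b$, and $\sigma^2$) illustrates this result: drawing $Z \sim N(\mathbf{0}, I\sigma^2)$ and forming $U = AZ + b$ yields sample moments that match $\mu_U = b$ and $V_U = AA'\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, sigma2 = 4, 1.0

# Z ~ N(0, I sigma^2); then U = AZ + b should be N(b, AA' sigma^2)
A = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 2.0, 0.0, 1.0]])
b = np.array([1.0, -1.0])

Z = rng.normal(0.0, np.sqrt(sigma2), size=(200_000, n))
U = Z @ A.T + b  # each row is one realization of U

print(U.mean(axis=0))            # approximates mu_U = b
print(A @ A.T * sigma2)          # theoretical Var(U) = AA' sigma^2
print(np.cov(U, rowvar=False))   # simulated Var(U)
```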