Generalised vec operators and the seemingly unrelated regression equations model with vector correlated disturbances
Darrell Turkington*
Department of Economics, The University of Western Australia, Nedlands, WA 6907, Australia
Received 3 April 1998; received in revised form 25 January 2000; accepted 13 March 2000
Abstract
This paper introduces operators which are generalisations of the vec operator. Properties of these operators are discussed, some theorems involving these operators are presented and their relevance to matrix calculus is demonstrated. These operators are used to facilitate the complicated differentiation required in applying classical statistical procedures to the SURE model with vector correlated disturbances. It is then shown that well-known statistical results concerning the linear regression model with autoregressive and moving average disturbances generalise to the SURE model with vector autoregressive and moving average disturbances. © 2000 Elsevier Science S.A. All rights reserved.
JEL classification: C10; C30
Keywords: Generalised vec; SURE model; Vector correlated disturbances
1. Introduction
This paper investigates the extent to which known statistical results concerning the linear regression model with autoregressive and moving average disturbances¹ generalise to the seemingly unrelated regression equations (SURE)
* Corresponding author. Tel.: +08-9371-5856 / +08-9380-2880; fax: +08-9380-1016. E-mail address: [email protected] (D. Turkington).
¹ For a detailed analysis of these models, see Turkington (1998b).
model with vector autoregressive and moving average disturbances. An essential part of the analysis involves obtaining the asymptotic Cramér–Rao lower bounds for the SURE model with the two different types of disturbances. Recent advances in zero–one matrices and matrix calculus have greatly facilitated the complicated differentiation required in applying classical statistical techniques to econometric models, and use is certainly made of these results in our analysis. However, vector autoregressive disturbances and vector moving average disturbances of necessity involve working extensively with partitioned matrices, and the author found that the mathematical analysis is simplified if use is made of certain operators which are generalisations of the well-known vec operator. The first part of this paper is taken up with these generalised vec operators. In Section 2 we define the operators and present some of their mathematical properties. We then look at generalised vecs of commutation matrices and illustrate how these matrices arise in matrix calculus.
The second part of the paper involves a statistical analysis of the SURE model with vector autoregressive disturbances and the same model with vector moving average disturbances. Section 3 is reserved for the autoregressive case whereas Section 4 covers the moving average case. For each model we use the results developed in the first part of the paper concerning the generalised vec operators to obtain the score vector, the information matrix and the asymptotic Cramér–Rao lower bound. With these devices in hand we can readily see how the results pertaining to the linear regression model carry over to the SURE model. This is the subject matter of Section 5. The last section is reserved for a brief conclusion.
2. Generalised vec operators
2.1. Definitions
Consider an $m \times p$ matrix partitioned into its columns, $A = (a_1, \ldots, a_p)$. The ordinary vec operator stacks these columns under each other, one at a time.
² An anonymous referee has pointed out that the $\mathrm{vec}_n$ operator has an interesting interpretation in terms of balanced three-way classifications, or triple tensor products. Let $A$ be an $m \times np$ matrix and write $A = [a_{ijk}]$ where $i \in [1,m]$, $j \in [1,n]$ and $k \in [1,p]$, and $i$ is a row index whereas $k$ and $j$ are lexicographically ordered column indices. In tensor product notation $A = (a_{ijk}e^{kji})$ where $e^{kji} = e^{i} \otimes e_k \otimes e_j$, with $e_k$ and $e_j$ the $k$th column of $I_p$ and the $j$th column of $I_n$ respectively, $e^{i}$ the $i$th row of $I_m$, and the bracket $(\ ,\ )$ signifying summation over the three indices. Then under this notation $\mathrm{vec}_n A = (a_{ijk}e^{jki})$. That is, the effect of the $\mathrm{vec}_n$ operator is to convert the index $k$ from a column index to a row index. One can then call on the literature of multilinear algebra; see for example Bourbaki (1958), Cartan (1952), Greub (1967) and Marcus (1973).
That is, to form $\mathrm{vec}_2 A$ we stack the columns of $A$ under each other taking two at a time. More generally, if $A$ is the $m \times np$ matrix $A = (A_1 \cdots A_p)$, where each submatrix $A_j$ is $m \times n$, then $\mathrm{vec}_n A$ stacks these submatrices under each other to give an $mp \times n$ matrix. For a given $m \times K$ matrix $A$ the number of generalised vec operations that can be performed on $A$ clearly depends on the number of columns $K$ of $A$. If $K$ is a prime number then only two generalised vec operations can be performed on $A$, namely $\mathrm{vec}_1 A = \mathrm{vec}\,A$ and $\mathrm{vec}_K A = A$. For $K$ any other number, the number of generalised vec operations that can be performed on $A$ is the number of divisors of $K$.
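A minimal computational sketch may make the definition concrete (numpy; the function name gvec and the example matrix are illustrative only, not notation from the paper):

```python
import numpy as np

def gvec(A, n):
    """vec_n A: stack the n-column blocks of an m x (n*p) matrix A under
    each other, giving an (m*p) x n matrix.  vec_1 A is the usual vec A
    (as a column) and vec_K A = A when A has K columns."""
    m, cols = A.shape
    assert cols % n == 0, "number of columns must be divisible by n"
    return np.vstack([A[:, k:k + n] for k in range(0, cols, n)])

A = np.arange(12).reshape(2, 6)        # a 2 x 6 example, so K = 6 has divisors 1, 2, 3, 6
print(gvec(A, 2))                      # columns stacked two at a time: a 6 x 2 matrix
print(gvec(A, 1))                      # ordinary vec A as a 12 x 1 column
assert np.array_equal(gvec(A, 6), A)   # vec_K A = A
```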
2.2. Theorems about generalised vec operators
In this section we derive results concerning the generalised vec operators which are summarised in the following theorems.
(ii) Writinga"(a
Often $A$ is a square $np \times np$ matrix. For such a matrix we have the following theorem.
Theorem 2. Let $A$ be an $np \times np$ matrix, so each submatrix is $np \times n$, and let $D$ and $a$ be as prescribed in Theorem 1. Then
³ For a full discussion of commutation matrices and other zero–one matrices, see Magnus (1988) and Magnus and Neudecker (1988).
2.3. Generalised vecs of commutation matrices
Our future work will involve generalised vecs of commutation matrices.³ Consider the commutation matrix $K_{Gn}$, written in partitioned form. In fact we can express $\mathrm{vec}_G K_{Gn}$ in terms of $K_{Gn}$ itself, as Theorem 4 shows.
Using Theorem 4 we can obtain results for $\mathrm{vec}_G K_{Gn}$ from known properties of the commutation matrix. In this way the following properties of $\mathrm{vec}_G K_{Gn}$, which we need in our future work, can be established.
Theorem 4 can also be used to establish useful matrix calculus results. Using this theorem and the result given by Magnus (1988, p. 44, Exercise 3.15) we have that, for $A$ a $G \times p$ matrix,
$\mathrm{vec}(A \otimes I_n) = (I_p \otimes \mathrm{vec}_G K_{Gn})\,\mathrm{vec}\,A \qquad (1)$

and

$\mathrm{vec}(I_n \otimes A) = (\mathrm{vec}_p K_{pn} \otimes I_G)\,\mathrm{vec}\,A.$

It follows that

$\frac{\partial\,\mathrm{vec}(A \otimes I_n)}{\partial\,\mathrm{vec}\,A} = I_p \otimes (\mathrm{vec}_G K_{Gn})' \qquad (2)$

and

$\frac{\partial\,\mathrm{vec}(I_n \otimes A)}{\partial\,\mathrm{vec}\,A} = (\mathrm{vec}_p K_{pn})' \otimes I_G.$
3. Seemingly unrelated regression equations model with vector autoregressive disturbances
3.1. The model
We consider a system of $G$ linear regression equations which we write as

$y_1 = X_1\delta_1 + u_1,$
$\ \vdots$
$y_G = X_G\delta_G + u_G, \qquad (3)$
or more succinctly as

$y = X\delta + u$

where $y = (y_1', \ldots, y_G')'$, $X$ is the block diagonal matrix with $X_i$ in the $i$th block diagonal position, $\delta = (\delta_1', \ldots, \delta_G')'$ and $u = (u_1', \ldots, u_G')'$. We assume that the disturbances are subject to a vector autoregressive system of order $p$. Let $\tilde u_t$ be the $G \times 1$ vector containing the $t$th values of the $G$ disturbances. Then we have

$\tilde u_t + R_1\tilde u_{t-1} + \cdots + R_p\tilde u_{t-p} = \tilde\varepsilon_t, \quad t = 1, \ldots, n, \qquad (4)$

where each matrix $R_j$ is a $G \times G$ matrix of unknown parameters and the $\tilde\varepsilon_t$ are assumed to be independently identically normally distributed random vectors with mean $0$ and a positive-definite covariance matrix $\Sigma$. We assume that
⁴ The shifting matrix $S_j$ is defined as the $n \times n$ matrix whose $(i,k)$th element is one if $i - k = j$ and zero otherwise. When a given $n \times m$ matrix $A$ is premultiplied by $S_j$ the rows of $A$ are shifted down $j$ places and the first $j$ rows are replaced with null row vectors. For a discussion of the properties of shifting matrices see Turkington (1998b).
;"(u
1,2,uG),E"(e1,2,eG) and let ;~l denote the matrix ; but with
values laggedlperiods. Then we can write the disturbance system (4) as
;#;
~1R@1#2#;~pR@p"E
or
;#;
pR@"E (5)
where R is the G]Gp matrix R"(R
1,2,Rp) and ;p is the n]Gp matrix
;
p"(;~12;~p). In the application of asymptotic theory presample values
are replaced by zeros without a!ecting our results. Suppose we do this at the
start of our analysis. Then we can write
;
~j"Sj;, j"1,2,p
whereS
j is the appropriaten]nshifting matrix,4and
;
p"S(Ip?;),
whereS is then]np matrix given byS"(S
1,2,Sp). Taking the vec of both
sides of Eq. (5) we have
u#(R?I
where
So after this mathematical manoeuvring we can write our model as
y"Xd#u,
M(r)u"e,
e&N(0,R?I n).
3.2. Properties of the matrices N(r) and M(r)
The matrices $N(r)$ and $M(r)$ play a crucial role in the statistical analysis of this model. Consider the $n \times n$ submatrix of $N(r)$ in the $(1,1)$ position as typical. Letting $N_{11}$ denote this matrix, we see that it is clearly strictly lower triangular, a band matrix and a Toeplitz matrix. So if we write $N(r)$ in partitioned form, each submatrix $N_{ij}$ is $n \times n$ with these characteristics. Now if we write $M(r)$ in partitioned form in the same way, it follows that each $M_{ii}$, $i = 1, \ldots, G$, is an $n \times n$ lower triangular, band, Toeplitz matrix with ones along its main diagonal, whereas each $M_{ij}$, $i \neq j$, is an $n \times n$ matrix which is strictly lower triangular, band and Toeplitz.
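Although the explicit partition of $N(r)$ is not reproduced above, the stated structure is easy to visualise: any linear combination of the shifting matrices $S_1, \ldots, S_p$ of footnote 4 has exactly these properties. A minimal numpy sketch with hypothetical coefficients:

```python
import numpy as np

def shift_matrix(n, j):
    """S_j: ones on the j-th subdiagonal, zeros elsewhere."""
    return np.eye(n, n, -j)

n, p = 6, 2
coeffs = [0.4, -0.1]        # hypothetical scalar coefficients for one block
block = sum(c * shift_matrix(n, l + 1) for l, c in enumerate(coeffs))
print(block)
# Every nonzero entry sits on one of the first p subdiagonals: the block is
# strictly lower triangular, banded (bandwidth p) and Toeplitz (constant
# along each diagonal), exactly the structure described for the N_ij.
```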
3.3. The derivatives $\partial\,\mathrm{vec}\,N(r)/\partial r$ and $\partial e/\partial r$
Important derivatives are $\partial\,\mathrm{vec}\,N(r)/\partial r$ and $\partial e/\partial r$, where $r = \mathrm{vec}\,R$. These derivatives bring generalised vecs into the analysis and are now derived. As $N(r) = (R \otimes I_n)B$, applying Property (ii) of Section 2.3 we have

$\frac{\partial e}{\partial r} = K_{pG,G}(I_G \otimes U_p'). \qquad (6)$
3.4. The parameters of the model, the log likelihood function and the score vector
The parameters of the model are given by $\theta = (\delta'\ r'\ \nu')'$ where $\nu = \mathrm{vech}\,\Sigma$, and the log likelihood function, apart from a constant, is

$l(\theta) = -\frac{n}{2}\log\det\Sigma - \frac{1}{2}e'(\Sigma^{-1} \otimes I_n)e,$

where in this function we set $e$ equal to $M(r)(y - X\delta)$. The first and third components of the score vector are
3.5. The Hessian matrix and information matrix
3.5.1. The Hessian matrix $\partial^2 l/\partial\theta\,\partial\theta'$
Several components of the Hessian matrix can be obtained by adapting the derivatives obtained by Turkington (1998a). They are listed here for convenience:
$\frac{\partial^2 l}{\partial\delta\,\partial\delta'} = -X^{d\prime}(\Sigma^{-1} \otimes I_n)X^d, \qquad (10)$

$\frac{\partial^2 l}{\partial\delta\,\partial\nu'} = -X^{d\prime}(\Sigma^{-1} \otimes E\Sigma^{-1})D, \qquad (11)$

$\frac{\partial^2 l}{\partial\nu\,\partial\nu'} = D'(\Sigma^{-1} \otimes \Sigma^{-1})\Bigl[\frac{n}{2}I_{G^2} - (I_G \otimes E'E\Sigma^{-1})\Bigr]D. \qquad (12)$

The derivatives involving $r$ are obtained each in turn.

$\partial^2 l/\partial\delta\,\partial r'$: We derive this derivative from $\partial l/\partial r$, which from Eq. (9) we write as $\partial l/\partial r = -K_{pG,G}(\Sigma^{-1} \otimes I_{pG})\,\mathrm{vec}\,U_p'E$. Using a product rule of matrix calculus
we have

$\frac{\partial\,\mathrm{vec}\,U_p'E}{\partial\delta} = \frac{\partial\,\mathrm{vec}\,U_p'}{\partial\delta}(E \otimes I_{pG}) + \frac{\partial e}{\partial\delta}(I_G \otimes U_p).$

But $\mathrm{vec}\,U_p' = K_{n,pG}B(y - X\delta)$, so $\partial\,\mathrm{vec}\,U_p'/\partial\delta = -X'B'K_{pG,n}$ and

$\frac{\partial\,\mathrm{vec}\,U_p'E}{\partial\delta} = -X'B'K_{pG,n}(E \otimes I_{pG}) - X^{d\prime}(I_G \otimes U_p).$

Our derivative follows directly and is given by

$\frac{\partial^2 l}{\partial\delta\,\partial r'} = X'B'K_{pG,n}(E\Sigma^{-1} \otimes I_{pG})K_{G,pG} + X^{d\prime}(\Sigma^{-1} \otimes U_p)K_{G,pG} = X'B'(I_{pG} \otimes E\Sigma^{-1}) + X^{d\prime}(\Sigma^{-1} \otimes U_p)K_{G,pG}.$
$\partial^2 l/\partial r\,\partial\nu'$: Again we derive this derivative from $\partial l/\partial r$, which we now write as $\partial l/\partial r = -K_{pG,G}(I_G \otimes U_p'E)\,\mathrm{vec}\,\Sigma^{-1}$. As $\partial\,\mathrm{vec}\,\Sigma^{-1}/\partial\nu = -D'(\Sigma^{-1} \otimes \Sigma^{-1})$, it follows that our derivative is given by $\partial^2 l/\partial r\,\partial\nu' = K_{pG,G}(\Sigma^{-1} \otimes U_p'E\Sigma^{-1})D$.

$\partial^2 l/\partial r\,\partial r'$: From Eqs. (9) and (6) we have $\partial^2 l/\partial r\,\partial r' = -K_{pG,G}(\Sigma^{-1} \otimes U_p'U_p)K_{G,pG}$.
3.5.2. The information matrix $I(\theta) = -\mathrm{plim}\,(1/n)\,\partial^2 l/\partial\theta\,\partial\theta'$
Clearly, under appropriate assumptions, $\mathrm{plim}\,E'E/n = \Sigma$, $\mathrm{plim}\,U_p'E/n = 0$ and $\mathrm{plim}\,X'B'(I_{pG} \otimes E)/n = 0$, so our information matrix can be written down. If the matrix $X$ does not contain lagged values of dependent variables then $\mathrm{plim}\,X^{d\prime}(I \otimes U_p)/n = 0$ and for this special case the information matrix simplifies further.
Inverting the information matrix is straightforward. For the general case the blocks of $I^{-1}(\theta)$ are given by Eqs. (13)–(18), where $M_p = I_n - U_p(U_p'U_p)^{-1}U_p'$, $N = \frac{1}{2}(I_{G^2} + K_{GG})$ and $L$ is the $\frac{1}{2}G(G+1) \times G^2$ elimination matrix.
For the special case where $X$ is exogenous and contains no lagged dependent variables,

$I(\theta)^{-1} = \mathrm{plim}\,n\begin{pmatrix} [X^{d\prime}(\Sigma^{-1} \otimes I_n)X^d]^{-1} & 0 & 0 \\ 0 & (U_p'U_p)^{-1} \otimes \Sigma & 0 \\ 0 & 0 & 2LN(\Sigma \otimes \Sigma)NL'/n \end{pmatrix}. \qquad (19)$
4. Seemingly unrelated regression equations model with vector moving average disturbances
4.1. The model
In this section we assume that the disturbances of the model given by Eq. (3) are now subject to the moving average process
$\tilde u_t = \tilde\varepsilon_t + R_1\tilde\varepsilon_{t-1} + \cdots + R_p\tilde\varepsilon_{t-p}.$

Following a similar analysis to that of Section 3.1 we write the model as

$y = X\delta + u,$
$u = M(r)e, \quad e \sim N(0, \Sigma \otimes I_n).$

Assuming invertibility we write $e = M(r)^{-1}u$.
It is the presence of the inverse matrix $M(r)^{-1}$ that makes the differentiation of the log likelihood far more complicated for the case of moving average disturbances, but again the mathematics is greatly facilitated using generalised vecs. Before we commence this differentiation it pays us to look at some of the properties of $M(r)^{-1}$, properties which we shall need in the application of our asymptotic theory.
4.2. The matrix $M(r)^{-1}$
Recall from Section 3.2 that if we write

$M(r) = \begin{pmatrix} M_{11} & \cdots & M_{1G} \\ \vdots & & \vdots \\ M_{G1} & \cdots & M_{GG} \end{pmatrix}$

then each $M_{ii}$, $i = 1, \ldots, G$, is an $n \times n$ lower triangular band Toeplitz matrix with ones along its main diagonal whereas each $M_{ij}$, $i \neq j$, is an $n \times n$ matrix which is strictly lower triangular, band and Toeplitz. Suppose we write

$M(r)^{-1} = \begin{pmatrix} M^{11} & \cdots & M^{1G} \\ \vdots & & \vdots \\ M^{G1} & \cdots & M^{GG} \end{pmatrix}$

where each submatrix is $n \times n$. The following theorem allows us to conclude that each $M^{ij}$ has similar characteristics to $M_{ij}$. That is, $M^{ii}$, $i = 1, \ldots, G$, is a lower triangular band matrix with ones down its main diagonal whereas $M^{ij}$, $i \neq j$, is strictly lower triangular and band.
Theorem 5. Suppose $A$ is an $nG \times nG$ matrix partitioned into $n \times n$ submatrices $A_{ij}$, where each $A_{ii}$ is lower (upper) triangular with ones along its main diagonal and each $A_{ij}$, $i \neq j$, is strictly lower (upper) triangular. Suppose $A$ is nonsingular and let $A^{-1}$ be partitioned in the same way into $n \times n$ submatrices $A^{ij}$. Then each $A^{ii}$ is lower (upper) triangular with ones as its main diagonal elements and each $A^{ij}$, $i \neq j$, is also strictly lower (upper) triangular.
Proof. We shall use mathematical induction to establish the result for the lower triangular case. The upper triangular proof is then obtained by taking transposes. Consider first the $2n \times 2n$ case and write

$A_2 = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$

where the submatrices are $n \times n$ with the characteristics prescribed by the theorem. Then as $|A_{11}| = 1$, $A_{11}^{-1}$ exists and $D = A_{22} - A_{21}A_{11}^{-1}A_{12}$ exists. Now $A_{21}A_{11}^{-1}A_{12}$ is the product of lower triangular matrices and, as $A_{21}$ is strictly lower triangular, this product is also strictly lower triangular. It follows then that $D$ is lower triangular with ones as its main diagonal elements, so $|D| = 1$ and $D$ is nonsingular. Let

$A_2^{-1} = \begin{pmatrix} A^{11} & A^{12} \\ A^{21} & A^{22} \end{pmatrix};$

from the properties of triangular matrices it is clear that the submatrices $A^{ij}$ have the required characteristics. Suppose now it is true for $A_p$, an $np \times np$ matrix of this form, and consider an $n(p+1) \times n(p+1)$ matrix $A_{p+1} = \begin{pmatrix} A_p & B_{12} \\ B_{21} & A_{p+1\,p+1} \end{pmatrix}$ with the same block structure, so that $A_{p+1\,p+1}$ is lower triangular with ones as main diagonal elements. Let

$A_p^{-1} = \begin{pmatrix} A^{11} & \cdots & A^{1p} \\ \vdots & & \vdots \\ A^{p1} & \cdots & A^{pp} \end{pmatrix}$

where by assumption $A_p^{-1}$ exists and each of the $n \times n$ submatrices $A^{ij}$ has the desired characteristics. Consider

$F = A_{p+1\,p+1} - B_{21}A_p^{-1}B_{12}$

where $B_{21}A_p^{-1}B_{12} = \sum_i\sum_j A_{p+1\,i}A^{ij}A_{j\,p+1}$ is a sum of products of lower triangular matrices and each of these products is in fact strictly lower triangular. It follows that $F$ is lower triangular with ones as its main diagonal elements, so $|F| = 1$ and $F$ is nonsingular. Expanding $B_{21}A_p^{-1}$ and $A_p^{-1}B_{12}$ as we did above it is clearly seen that the $B^{ij}$'s have the required characteristics. □
Now consider the transpose $M(r)^{-1\prime}$, whose $n \times n$ submatrices we again denote $M^{ij}$. It follows from Theorem 5 that each $M^{ii}$, $i = 1, \ldots, G$, is an $n \times n$ upper triangular matrix with ones as its main diagonal elements, whereas each $M^{ij}$, $i \neq j$, is strictly upper triangular.
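A quick numerical illustration of Theorem 5 (a sketch only, with hypothetical dimensions and arbitrary random entries) is to build such a matrix and inspect the blocks of its inverse:

```python
import numpy as np

rng = np.random.default_rng(1)
n, G = 4, 3

# Build an nG x nG matrix whose diagonal n x n blocks are unit lower
# triangular and whose off-diagonal blocks are strictly lower triangular,
# as in Theorem 5 (random entries, purely for illustration).
blocks = [[np.tril(rng.standard_normal((n, n)), -1) + (np.eye(n) if i == j else 0)
           for j in range(G)] for i in range(G)]
A = np.block(blocks)

Ainv = np.linalg.inv(A)
for i in range(G):
    for j in range(G):
        block = Ainv[i*n:(i+1)*n, j*n:(j+1)*n]
        # Diagonal blocks of the inverse: unit lower triangular;
        # off-diagonal blocks: strictly lower triangular.
        assert np.allclose(np.triu(block, 1), 0)
        if i == j:
            assert np.allclose(np.diag(block), 1.0)
        else:
            assert np.allclose(np.diag(block), 0.0)
```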
4.3. The derivative $\partial e/\partial r$
Just as in the analysis of the previous model we shall need the derivative $\partial e/\partial r$. Write
4.4. The parameters of the model, the log likelihood function and the score vector
The parameters of the model are given by $\theta = (\delta'\ r'\ \nu')'$ and the log likelihood function, apart from a constant, is

$l(\theta) = -\frac{n}{2}\log\det\Sigma - \frac{1}{2}e'(\Sigma^{-1} \otimes I_n)e,$

where now $e = M(r)^{-1}(y - X\delta)$; the score vector is given by
4.5. The Hessian matrix $\partial^2 l/\partial\theta\,\partial\theta'$
The components of the Hessian matrix $\partial^2 l/\partial\delta\,\partial\delta'$, $\partial^2 l/\partial\delta\,\partial\nu'$ and $\partial^2 l/\partial\nu\,\partial\nu'$ are given by Eqs. (10)–(12) respectively, but with $X^*$ in place of $X^d$. The derivative $\partial^2 l/\partial r\,\partial\nu'$ is obtained in much the same way as for the previous model. We get

$\frac{\partial^2 l}{\partial r\,\partial\nu'} = -K_{pG,G}(I_G \otimes E_p')M(r)^{-1\prime}(\Sigma^{-1} \otimes E\Sigma^{-1})D.$

The last two components of the Hessian matrix, namely $\partial^2 l/\partial\delta\,\partial r'$ and $\partial^2 l/\partial r\,\partial r'$, require more effort to obtain and draw heavily on the properties of the generalised vec of commutation matrices.
$\partial^2 l/\partial\delta\,\partial r'$: We start from $\partial l/\partial r$, which we write as $\partial l/\partial r = J'(e \otimes I_{nG})Ce$, where $C = M(r)^{-1\prime}(\Sigma^{-1} \otimes I_n)$. Using the backward chain rule of matrix calculus, a little work shows how $\partial(e \otimes I_{nG})$ may be evaluated. We now want to write this derivative in terms of commutation matrices: using Property (ii) of Section 2.3 we can rewrite $(e' \otimes I_{nG})$, and using the product rule of matrix calculus the derivative follows. Again using the product rule, together with Eq. (21) and the properties of the commutation matrix, allows us to write

$\frac{\partial a(r)}{\partial r} = -J'[M(r)^{-1} \otimes a(r)] - J'(e \otimes I_{nG})M(r)^{-1\prime}(\Sigma^{-1} \otimes I_n)M(r)^{-1}. \qquad (26)$
Substituting Eqs. (26) and (25) into Eq. (24) gives Eq. (27). Consider the first matrix on the right-hand side of Eq. (27). By Property (iii) of Section 2.3 and Theorem 3, $(a(r)' \otimes I_{pG^2})Q = A' \otimes I_{pG}$, and using the properties of the commutation matrix we can write this first matrix as

$-K_{pG,G}(I_G \otimes E_p')M(r)^{-1\prime}B'(I_{pG} \otimes A).$

We have already seen that $J'(I_{nG} \otimes a(r)) = (I_{pG} \otimes A')B$, so the second matrix on the right-hand side of Eq. (27) is just the transpose of the first. Thus our final expression for this derivative is
4.6. The information matrix $I(\theta) = -\mathrm{plim}\,(1/n)\,\partial^2 l/\partial\theta\,\partial\theta'$
The work required to evaluate some of the probability limits associated with this matrix is contained in the appendix. Using the results of the appendix we can write the information matrix as
I(h)"
A
Idd Idr Idl
Ird I
rr Irl
Ild Ilr Ill
B
where
I
dd"plim
1
nXH@(R~1?In)XH,
Idr"plim1
nXH@(R~1?In)M(r)~1(IG?Ep)KG,pG"(Ird)@,
I
dl"0"(Ild)@Irl"0"(Ilr)@,
I
rr"plim
1
nKpG,G(IG?E@p)M(r)~1@(R~1?In)M(r)~1(IG?Ep)KG,pG,
I
ll"12D@(R~1?R~1)D.
For the special case where $X$ contains no lagged dependent variables, $I_{\delta r} = 0 = (I_{r\delta})'$.
4.7. The Cramér–Rao lower bound $I^{-1}(\theta)$
As $I(\theta)$ is block diagonal, inverting it presents little difficulty. Using the property of commutation matrices that $K_{pG,G}^{-1} = K_{pG,G}' = K_{G,pG}$, if we write

$I^{-1}(\theta) = \mathrm{plim}\,n\begin{pmatrix} I^{\delta\delta} & I^{\delta r} & I^{\delta\nu} \\ I^{r\delta} & I^{rr} & I^{r\nu} \\ I^{\nu\delta} & I^{\nu r} & I^{\nu\nu} \end{pmatrix}$

then

$I^{\delta\delta} = \{X^{*\prime}(\Sigma^{-1} \otimes I_n)X^* - X^{*\prime}Z[Z'(\Sigma \otimes I_n)Z]^{-1}Z'X^*\}^{-1}, \qquad (28)$

$I^{\delta r} = (I^{r\delta})' = -I^{\delta\delta}X^{*\prime}Z[Z'(\Sigma \otimes I_n)Z]^{-1}K_{G,pG}, \qquad (29)$

$I^{\delta\nu} = (I^{\nu\delta})' = 0, \qquad (30)$

$I^{rr} = K_{pG,G}\{Z'(\Sigma \otimes I_n)Z - Z'X^*[X^{*\prime}(\Sigma^{-1} \otimes I_n)X^*]^{-1}X^{*\prime}Z\}^{-1}K_{G,pG}, \qquad (32)$

$I^{\nu\nu} = 2LN(\Sigma \otimes \Sigma)NL'/n, \qquad (33)$

where $Z = (\Sigma^{-1} \otimes I_n)M(r)^{-1}(I_G \otimes E_p)$.
The special case where $X$ is exogenous and contains no lagged dependent variables is simpler. Here

$I^{-1}(\theta) = \mathrm{plim}\,n\begin{pmatrix} [X^{*\prime}(\Sigma^{-1} \otimes I_n)X^*]^{-1} & 0 & 0 \\ 0 & K_{pG,G}[Z'(\Sigma \otimes I_n)Z]^{-1}K_{G,pG} & 0 \\ 0 & 0 & 2LN(\Sigma \otimes \Sigma)NL'/n \end{pmatrix}. \qquad (34)$
5. Statistical inference from the score vectors and the information matrices
Having used our work on generalised vec operators to assist in the complicated matrix calculus needed to obtain the score vectors and information matrices, we can avail ourselves of these latter concepts to derive statistical results for our models. We do this first for the model with autoregressive disturbances.
5.1. Model with autoregressive disturbances
5.1.1. Efficient estimation of $\delta$
(i) Case where R is known: Consider the equation

$y^d = X^d\delta + e \qquad (35)$

where $y^d = M(r)y$ and $X^d = M(r)X$. Clearly this equation satisfies the assumptions of the SURE model without vector autoregressive disturbances. With $R$ known we can form $y^d$ and $X^d$, and an asymptotically efficient estimator of $\delta$ would be the joint generalised least squares estimator (JGLSE) applied to Eq. (35). That is,

$\hat\delta = [X^{d\prime}(\hat\Sigma^{-1} \otimes I_n)X^d]^{-1}X^{d\prime}(\hat\Sigma^{-1} \otimes I_n)y^d, \qquad (36)$

where $\hat\Sigma = \hat E'\hat E/n$, $\mathrm{vec}\,\hat E = \hat e$ and $\hat e$ is the OLS residual vector. As $\hat\delta$ is a BAN estimator we have $\sqrt{n}(\hat\delta - \delta) \overset{d}{\to} N(0, V_1)$, where $V_1$ is the Cramér–Rao lower bound referring to $\delta$. With $r$ known our unknown parameters would be $(\delta'\ \nu')'$ and the information matrix, both for the case where $X$ is exogenous and for the case where $X$ contains lagged dependent variables, would be $I^*$. The asymptotic covariance matrix of $\hat\delta$ would then be $V_1 = (\mathrm{plim}\,X^{d\prime}(\Sigma^{-1} \otimes I_n)X^d/n)^{-1}$.
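A purely illustrative sketch of the JGLS step in Eq. (36) follows (numpy/scipy, hypothetical dimensions and randomly generated placeholder data; the construction of $y^d$ and $X^d$ from $M(r)$ is taken as given):

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(2)
G, n, k = 3, 50, 2                       # hypothetical: G equations, n obs, k regressors each

X_blocks = [rng.standard_normal((n, k)) for _ in range(G)]
Xd = block_diag(*X_blocks)               # block-diagonal regressor matrix X^d
yd = rng.standard_normal(n * G)          # stacked transformed dependent variable y^d

# Step 1: OLS on the transformed system; residuals arranged as the n x G matrix E_hat.
e_hat = yd - Xd @ np.linalg.lstsq(Xd, yd, rcond=None)[0]
E_hat = e_hat.reshape(G, n).T            # vec(E_hat) = e_hat (columns are equations)
Sigma_hat = E_hat.T @ E_hat / n

# Step 2: the JGLS estimator of Eq. (36).
W = np.kron(np.linalg.inv(Sigma_hat), np.eye(n))     # Sigma_hat^{-1} ⊗ I_n
delta_hat = np.linalg.solve(Xd.T @ W @ Xd, Xd.T @ W @ yd)
```

The same two algebraic steps implement Eqs. (39) and (40), with $y^*$ and $X^*$ in place of $y^d$ and $X^d$.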
(ii) Case where R is unknown: The estimator $\hat\delta$ is not available to us in the more realistic case where $R$ is unknown. However, an asymptotically efficient estimator for $\delta$ may be obtained from the following procedure.⁵

1. Apply JGLS to $y = X\delta + u$, ignoring the vector autoregression, to obtain an estimator $\bar\delta$ say and the residual vector $\hat u = y - X\bar\delta$. From $\hat u$ form $\hat U$ where

⁵ The formal proof that this procedure does indeed lead to an asymptotically efficient estimator may be obtained along the lines of a similar proof presented in Turkington (1998a).
The estimator $\hat{\hat\delta}$ is asymptotically efficient both for the case where $X$ is exogenous and for the case where $X$ contains lagged values of the dependent variables. But as in the case of generalised least squares estimators in dynamic linear regression models, the efficiency of $\hat{\hat\delta}$ differs in the two cases.
First consider the case where $X$ is exogenous. As $\hat{\hat\delta}$ is a BAN estimator, $\sqrt{n}(\hat{\hat\delta} - \delta) \overset{d}{\to} N(0, V_2)$, where $V_2$, the appropriate Cramér–Rao lower bound, is obtained from $I(\theta)^{-1}$ given by Eq. (19). So we see that $V_2 \equiv V_1 = [\mathrm{plim}\,X^{d\prime}(\Sigma^{-1} \otimes I_n)X^d/n]^{-1}$. This means that the JGLSE $\hat{\hat\delta}$ with unknown $R$ is as asymptotically efficient as the JGLSE $\hat\delta$ with known $R$. As in the linear regression model, not knowing $R$ costs us nothing in terms of asymptotic efficiency.
Next consider the case where $X$ contains lagged dependent variables. For this case $I^{-1}(\theta)$ is given by Eqs. (13)–(18), so the asymptotic covariance matrix of $\hat{\hat\delta}$ is $V_2 \equiv I^{\delta\delta} = [\mathrm{plim}\,X^{d\prime}(\Sigma^{-1} \otimes M_p)X^d/n]^{-1}$. It is easily seen that now $V_1^{-1} - V_2^{-1}$ is positive semi-definite, so $V_2 - V_1$ is also positive semi-definite. The JGLSE $\hat\delta$ that can be formed with known $R$ is asymptotically more efficient than the JGLSE $\hat{\hat\delta}$ with unknown $R$. Not knowing $R$ now costs us in terms of asymptotic efficiency.
5.1.2. Maximum likelihood estimators as iterative joint generalised least squares estimation
Using the score vector given by Eqs. (7)–(9) it is possible to obtain an interpretation of the maximum likelihood estimator (MLE) of $\delta$ as an iterative JGLSE. Returning to the score vector we see that $\partial l/\partial r = 0$ gives an expression for $\hat R$. This interpretation of the MLE is clearly iterative, as $\hat R$ still contains $\hat\delta$ through $U_p$ whereas $\hat\delta$ contains $\hat R$ through $X^d$. But this interpretation clearly points to the estimation procedure outlined above.
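The display that followed "gives" above has not survived; working from the form of the score in Eq. (9), and under that assumption only, the first-order condition reduces to

$\frac{\partial l}{\partial r} = -K_{pG,G}(\hat\Sigma^{-1} \otimes I_{pG})\,\mathrm{vec}\,\hat U_p'\hat E = 0 \;\Longrightarrow\; \hat U_p'\hat E = 0 \;\Longrightarrow\; \hat R' = -(\hat U_p'\hat U_p)^{-1}\hat U_p'\hat U,$

so that $\hat R$ is obtained by a multivariate least squares regression of $-\hat U$ on $\hat U_p$, while $\hat\delta$ is the JGLSE computed from $X^d = M(\hat r)X$; iterating between the two gives the MLE.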
5.2. Model with moving average disturbances
We proceed as we did for the previous model.
5.2.1. Efficient estimation of $\delta$
(i) Case where R is known: Consider the equation

$y^* = X^*\delta + e \qquad (38)$

where $y^* = M(r)^{-1}y$ and $X^* = M(r)^{-1}X$. Clearly this equation satisfies the assumptions of the SURE model without vector moving average disturbances. With $R$ known we can form $y^*$ and $X^*$, and an asymptotically efficient estimator of $\delta$ would be the JGLSE obtained from Eq. (38). That is,

$\tilde\delta = [X^{*\prime}(\hat\Sigma^{-1} \otimes I_n)X^*]^{-1}X^{*\prime}(\hat\Sigma^{-1} \otimes I_n)y^*, \qquad (39)$

where $\hat\Sigma = \hat E'\hat E/n$, $\mathrm{vec}\,\hat E = \hat e$ and $\hat e$ is the OLS residual vector. As $\tilde\delta$ is a BAN estimator we have $\sqrt{n}(\tilde\delta - \delta) \overset{d}{\to} N(0, V_1)$, where $V_1$ is the Cramér–Rao lower bound referring to $\delta$. With $r$ known our unknown parameters would be $(\delta'\ \nu')'$ and the information matrix, both for the case where $X$ is exogenous and for the case where $X$ contains lagged dependent variables, would be $I^*$. The asymptotic covariance matrix of $\tilde\delta$ would then be $V_1 = (\mathrm{plim}\,X^{*\prime}(\Sigma^{-1} \otimes I_n)X^*/n)^{-1}$.

(ii) Case where R is unknown: The estimator $\tilde\delta$ is not available to us in the more realistic case where $R$ is unknown. However, once a consistent estimator $\hat r$ is obtained we can form $\hat X^* = M(\hat r)^{-1}X$, $\hat y^* = M(\hat r)^{-1}y$, and

$\tilde{\tilde\delta} = [\hat X^{*\prime}(\hat\Sigma^{-1} \otimes I_n)\hat X^*]^{-1}\hat X^{*\prime}(\hat\Sigma^{-1} \otimes I_n)\hat y^*. \qquad (40)$
As with the autoregressive case, the estimator $\tilde{\tilde\delta}$ is asymptotically efficient both for the case where $X$ is exogenous and for the case where $X$ contains lagged dependent variables, but the efficiency of the estimator differs for the two cases. Consider the case where $X$ is exogenous. As $\tilde{\tilde\delta}$ is a BAN estimator, $\sqrt{n}(\tilde{\tilde\delta} - \delta) \overset{d}{\to} N(0, V_2)$, where $V_2$ is the appropriate Cramér–Rao lower bound obtained from $I^{-1}(\theta)$ given by Eq. (34). That is, $V_2 \equiv V_1 = [\mathrm{plim}\,X^{*\prime}(\Sigma^{-1} \otimes I_n)X^*/n]^{-1}$. This means that the JGLSE $\tilde{\tilde\delta}$ with unknown $R$ is as asymptotically efficient as the JGLSE $\tilde\delta$ with known $R$. Not knowing $R$ costs us nothing in terms of asymptotic efficiency.
Next consider the case where $X$ contains lagged dependent variables. For this case $I^{-1}(\theta)$ is given by Eqs. (28)–(33), so the asymptotic covariance matrix of $\tilde{\tilde\delta}$ is now

$V_2 = I^{\delta\delta} = \mathrm{plim}\,n\{X^{*\prime}(\Sigma^{-1} \otimes I_n)X^* - X^{*\prime}Z[Z'(\Sigma \otimes I_n)Z]^{-1}Z'X^*\}^{-1}.$

As with the autoregressive case it is easily seen that $V_2 - V_1$ is positive semi-definite, so now $\tilde{\tilde\delta}$ is less efficient asymptotically than $\tilde\delta$. Not knowing $R$ now costs us in terms of asymptotic efficiency.
5.2.2. Maximum likelihood estimators as iterative joint generalised least-squares estimators?
Interestingly enough, a similar interpretation of the MLE of $\delta$ as obtained in the autoregressive case does not seem to be available to us for this case. Consider the score vector for this model. Solving $\partial l/\partial\delta = 0$ and $\partial l/\partial\nu = 0$ gives

$\tilde\delta = (X^{*\prime}(\Sigma^{-1} \otimes I_n)X^*)^{-1}X^{*\prime}(\Sigma^{-1} \otimes I_n)y^* \qquad (41)$

and

$\hat\Sigma = E'E/n,$

as expected, but problems arise when we attempt to extract $r$ from the equation $\partial l/\partial r = 0$. Unlike the autoregressive case this equation is highly nonlinear in $r$, involving as it does $M(r)^{-1}$. Notwithstanding this, Eq. (41) clearly points to the estimator $\tilde{\tilde\delta}$ given by Eq. (40).
5.3. The Lagrangian multiplier test statistic for the hypothesis $H_0$: $r = 0$
If the disturbances of the SURE model are not subject to vector autoregressive or moving average processes then, rather than using the estimators $\hat{\hat\delta}$ and $\tilde{\tilde\delta}$ given by Eqs. (37) and (40), we would use the JGLSE obtained from $y = X\delta + u$, namely $\bar\delta = [X'(\hat\Sigma^{-1} \otimes I_n)X]^{-1}X'(\hat\Sigma^{-1} \otimes I_n)y$. It is of interest to us then to develop a test statistic for the null hypothesis $H_0$: $r = 0$ against the alternative $H_A$: $r \neq 0$. The most amenable classical test statistic for our models is the Lagrangian multiplier test (LMT) statistic, which is a quadratic form in the score vector evaluated at $\hat\theta$; in forming $\hat\theta$ we put $r$ equal to the null vector and evaluate all other parameters at the constrained MLEs, the MLEs we get for $\delta$ and $\nu$ after we set $r$ equal to the null vector. Asymptotically the constrained MLE for $\delta$ is equivalent to $\bar\delta$.
Before we form this test statistic it should be noted that the LMT statistic is incapable of distinguishing between vector autoregressive disturbances and vector moving average disturbances.⁶ With $r = 0$ we have $M(r) = I_{nG}$, $X^* = X^d = X$, $U = E$, $u = e$, $U_p = E_p$, $\mathrm{plim}\,E_p'E_p/n = \mathrm{plim}\,U_p'U_p/n = I_p \otimes \Sigma$ and $Z = \Sigma^{-1} \otimes U_p$, so for both of the models before us $I_{rr}|_{r=0}$ takes the same form. It follows then that the LMT statistic for $H_0$: $r = 0$ is the same for both models. The actual test statistic itself will depend on the case before us. We have seen that for both models $I(\theta)$, and therefore $I^{rr}(\theta)$, differs depending on whether $X$ is exogenous or $X$ contains lagged dependent variables. We consider each case in turn.

⁶ The result is a generalisation of that obtained by Godfrey (1978) for the linear regression model.
First, when $X$ is exogenous, for both models $I^{rr}(\theta)|_{r=0} = I_p \otimes \Sigma^{-1} \otimes \Sigma$. It follows that for this case the LMT statistic is $T'$ say, where $\hat u$ is the constrained MLE residual vector, $\mathrm{vec}\,\hat U = \hat u$ and $\hat U_p = S(I_p \otimes \hat U)$. (An asymptotically equivalent test statistic would use the JGLSE residuals formed from $\bar\delta$.) Under $H_0$, $T'$ has a limiting $\chi^2$ distribution with $pG^2$ degrees of freedom, so the upper tail of this distribution is used to obtain the appropriate critical region.
Second, consider the more complicated case where $X$ contains lagged dependent variables. Here for both models we can write $I^{rr}(\theta)|_{r=0}$ in a common form and hence write the LMT statistic in terms of these quantities, where $\hat u$ is the constrained MLE residual vector, $\hat U = (\mathrm{vec}_n\,\hat u')'$ and $\hat U_p = S(I_p \otimes \hat U)$.
6. Conclusion
This paper introduces generalised vec operators and presents some of their mathematical properties. Such operators should be of interest to econometricians working in time series models with vector autoregressive disturbances or vector moving average disturbances. In our analysis use was made of the properties of the operators to facilitate the complicated matrix calculus differentiation and asymptotic theory needed to derive the score vectors and the Cramér–Rao lower bounds for the SURE model with such disturbances. With these matrices in hand it was a relatively simple matter to demonstrate that the known statistical results concerning the linear regression model with autoregressive disturbances or moving average disturbances generalise to the more complicated models.
With no lagged dependent variables on the right-hand side of our equations, the JGLSEs given by Eqs. (37) and (40), where the unknown parameters of the disturbance process are consistently estimated, are as efficient asymptotically as the JGLSEs given by Eqs. (36) and (39), where these parameters are known. If lagged dependent variables are contained on the right-hand side of our equations this is no longer so: the JGLSEs with known disturbance parameters are asymptotically more efficient than the feasible estimations. Not knowing these parameters costs us in terms of asymptotic efficiency, just as it does in the linear regression case.
Also, as in the linear regression case, the LMT for $H_0$: $r = 0$ cannot distinguish between vector autoregressive and vector moving average disturbances, as the LMT statistic is the same in both cases. This test statistic was derived from the asymptotic score vector and Cramér–Rao lower bound, both for the case where our right-hand variables are exogenous and for the case where lagged dependent variables appear in our equations.
Acknowledgements
The author would like to thank his colleague Michael McAleer for pointing to the general nature of the operators discussed in this paper. Generous help is also acknowledged from two anonymous referees.
Appendix A. Probability limits associated with the information matrix of the model with moving average disturbances
A.1. $\mathrm{plim}\,\frac{1}{n}X^{*\prime}B'(I_{pG} \otimes A)$

If $X_i$ contains lagged dependent variables it might appear that this probability limit will not be the null matrix. Consider now
$X^{*\prime}B'(I_{pG} \otimes A) = X^{*\prime}[I_G \otimes S_1'A, \ldots, I_G \otimes S_p'A]. \qquad (A.1)$

We consider the first matrix on the right-hand side of Eq. (A.1) as typical and we use the notation of Section 4.2, where we write the submatrices of $M(r)^{-1\prime}$ as $M^{ij}$. Then

$X^{*\prime}(I_G \otimes S_1'A) = \begin{pmatrix} X_1'M^{11}S_1'A & \cdots & X_1'M^{1G}S_1'A \\ \vdots & & \vdots \\ X_G'M^{G1}S_1'A & \cdots & X_G'M^{GG}S_1'A \end{pmatrix}. \qquad (A.2)$
Again take the matrix in the $(1,1)$ position of the right-hand side of Eq. (A.2) as typical. Now under our notation

$A = (M^{1}\,\mathrm{vec}\,E\Sigma^{-1}, \ldots, M^{G}\,\mathrm{vec}\,E\Sigma^{-1}) \qquad (A.3)$

so

$X_1'M^{11}S_1'A = X_1'M^{11}S_1'(M^{1}\,\mathrm{vec}\,E\Sigma^{-1}, \ldots, M^{G}\,\mathrm{vec}\,E\Sigma^{-1}).$

Now

$X_1'M^{11}S_1'M^{1}\,\mathrm{vec}\,E\Sigma^{-1} = X_1'M^{11}S_1'M^{1}(\Sigma^{-1} \otimes I)e = X_1'M^{11}S_1'\Bigl(\sum_{i=1}^{G}\sum_{j=1}^{G}\rho^{ij}M^{1j}e_j\Bigr)$

where $\Sigma^{-1} = \{\rho^{ij}\}$. So in evaluating $\mathrm{plim}\,X^{*\prime}B'(I_{pG} \otimes A)/n$ we are typically looking at $\mathrm{plim}\,X_1'M^{11}S_1'M^{1j}e_j/n$.
Now in Section 4.2 we saw that $M^{ii}$ is upper triangular and $M^{ij}$, $i \neq j$, is strictly upper triangular, so $S_1'M^{ii}$ is strictly upper triangular. It follows that $M^{11}S_1'M^{1j}$ is strictly upper triangular and so $\mathrm{plim}\,X_1'M^{11}S_1'M^{1j}e_j/n$ is the null vector even if $X_1$ contains lagged dependent variables. We conclude that regardless of whether $X$ contains lagged dependent variables or not,

$\mathrm{plim}\,X^{*\prime}B'(I_{pG} \otimes A)/n = 0.$
A.2. $\mathrm{plim}\,[(1/n)\,\partial^2 l/\partial r\,\partial r']$

A.2.1. $\mathrm{plim}\,[(1/n)K_{pG,G}(I_G \otimes E_p')M(r)^{-1\prime}B'(I_{pG} \otimes A)]$
⁷ Suppose $A$ is $m \times n$ and $B$ is $p \times q$, and let $B = (b_1, \ldots, b_p)'$, where $b_i'$ is the $i$th row of $B$. The proof is left to the reader.

We want to show that this probability limit is the null matrix. To this end consider the matrix concerned: by a property of the commutation matrix⁷ it can be rearranged so that the submatrix in the $(1,1)$ position of the matrix we are considering is, typically, of the following form.
The typical element of the matrix we have in hand is then $\rho^{ij}\,\mathrm{plim}\,e_i'S_i'M^{kl}S_r'M^{sj}e_j/n$. In Section 4.2 we saw that each $M^{ii}$ is upper triangular whereas each $M^{ij}$, $i \neq j$, is strictly upper triangular. It follows from the properties of shifting matrices that $S_i'M^{kl}$ and $S_r'M^{sj}$ are strictly upper triangular, so the matrix in the quadratic form of our plim, being the product of strictly upper triangular matrices, is also strictly upper triangular. We conclude then that the probability limit of a typical element of our matrix is zero.
References

Bourbaki, N., 1958. Algèbre Multilinéaire. In: Éléments de Mathématique, Book II (Algèbre). Hermann, Paris.
Cartan, E., 1952. Géométrie des Espaces de Riemann. Gauthier-Villars, Paris.
Godfrey, L.G., 1978. Testing against general autoregressive and moving average error models when the regressors include lagged dependent variables. Econometrica 46, 1293–1302.
Greub, W.H., 1967. Multilinear Algebra. Springer, Berlin.
Magnus, J.R., 1988. Linear Structures. Oxford University Press, New York.
Magnus, J.R., Neudecker, H., 1988. Matrix Differential Calculus. Wiley, New York.
Marcus, M., 1973. Finite Dimensional Multilinear Algebra: Part I. Marcel Dekker, New York.
Turkington, D.A., 1998a. Efficient estimation in the linear simultaneous equations model with vector autoregressive disturbances. Journal of Econometrics 85, 51–74.