THE NON-STATIONARY BIVARIATE INAR UNDER AN UNCONSTRAINED INTER CORRELATION STRUCTURE
YUVRAJ SUNECHER*
Department of Accounting, Finance and Economics, University of Technology Mauritius, Mauritius Email: [email protected]
NAUSHAD MAMODE KHAN
Department of Economics and Statistics, University of Mauritius, Mauritius Email: [email protected]
VANDNA JOWAHEER
Department of Mathematics, University of Mauritius, Mauritius Email: [email protected]
SUMMARY
It is commonly observed in medical and financial studies that large volumes of time series of count data are collected for several variates. The modelling of such time series and the estimation of parameters under such processes are rather challenging, since these high-dimensional time series are influenced by time-varying covariates that eventually render the data non-stationary. This paper considers the modelling of a bivariate integer-valued autoregressive (BINAR(1)) process where the innovation terms are distributed under non-stationary Poisson moments. Since the full and conditional likelihood approaches are cumbersome in this situation, a Generalized Quasi-Likelihood (GQL) approach is proposed to estimate the regression effects, while the serial and time-dependent cross-correlation effects are handled by the method of moments. This new technique is assessed over several simulation experiments, and the results demonstrate that GQL yields consistent estimates and is computationally stable, since few non-convergent simulations are reported.
Keywords and phrases: Bivariate, Moving Average, Poisson, GQL, Non-Stationarity
AMS Classification: 65C60, 62J12, 62H12, 62J20, 62J10
*Corresponding author
© Institute of Statistical Research and Training (ISRT), University of Dhaka, Dhaka 1000, Bangladesh.
1 Introduction
Most of the univariate time series models that have been developed so far are based on the observation-driven (OD) integer-valued autoregressive (INAR(1)) process (see McKenzie, 1985; Al Osh and Alzaid, 1987; Kim and Park, 2008). In these models, it is relatively parsimonious to formulate the conditional likelihood estimating equations and to use the iterative Newton-Raphson or Fisher-scoring algorithm to obtain the parameter estimates of the regression and serial correlation parameters. However, in some important practical settings, time series of counts are collected for two variables where, apart from the serial correlation within each series, there also exists a cross-correlation between the two series, and this leads to a bivariate INAR(1) (BINAR(1)) time series set-up. In addition, these time series of counts are subject to time-independent and time-varying covariates that induce non-stationary correlations. Thus, the modelling of the BINAR(1) time series is challenging.
In the literature, Pedeli and Karlis (2009, 2011) developed the first BINAR(1) time series of counts (CBINAR1M1), which was a simple extension of two INAR(1) processes where the cross-correlation between the two series was induced by the innovation terms of the two series. The innovation series were assumed to follow the bivariate Poisson or some bivariate Generalized Poisson distributions. Later, Pedeli and Karlis (2013a) extended the CBINAR1M1 to a full BINAR(1) process (FBINAR1M1) by considering a more complex cross-correlation structure where, apart from the bivariately distributed innovation terms, the cross-correlation was also induced by relating the observation from the first variate with the observation from the second variate via the binomial thinning operator. However, both CBINAR1M1 and FBINAR1M1 were developed under stationary distributional or time-independent covariate assumptions. Recently, Mamode Khan et al. (2016b) introduced the first constrained BINAR(1) (CBINAR1M2) Poisson time series model under a non-stationary set-up.
In their model, the mean and variance parameters of the bivariate innovation series were functions of time-dependent covariates, but so far there is no full BINAR(1) process for non-stationary bivariate time series of counts such as the number of stock transactions of two companies operating in the same line of business, the number of daytime and nighttime accidents in an area, the number of day and night larceny acts, and many others.
As for the inference procedures, Pedeli and Karlis (2009, 2011, 2013a) developed CMLE and method of moments approaches to estimate the mean and correlation parameters respectively in the CBINAR1M1 and FBINAR1M1 models. Through simulation studies, CMLE was shown to provide estimates with lower bias than the moment-based approach. Based on these results, Mamode Khan et al. (2016b) developed the CMLE under time-dependent covariates for the non-stationary CBINAR1M2 and remarked that its log-information Hessian matrix was computationally intensive. As an alternative, Mamode Khan et al. (2016b) proposed a generalized quasi-likelihood (GQL) approach that depends only on the notion of the marginal means, variances and joint covariances. This GQL equation was then solved iteratively using Newton-Raphson to obtain estimates of the mean parameters, while a robust method of moments was implemented to estimate the non-stationary serial and cross correlations based on the findings of Sutradhar and Das (1999) and Jowaheer and Sutradhar (2005).
Besides, through simulation studies, GQL yielded equally efficient estimates as CMLE but far more efficient estimates than GMM (Qu and Lindsay, 2003) and GEE (Liang and Zeger, 1986). Thus, in this paper, we propose to develop a GQL approach for the full non-stationary Poisson BINAR(1) process (FBINAR1M2), as well as CMLE and GMM adaptive equations. The performance of these methodologies is tested through a simulation study.
The outline of the paper is as follows: In Section 2, we introduce the FBINAR1M2 Poisson time series model, and the marginal moments and joint covariances are derived under non-stationary distributional assumptions. In Section 3, we develop the CMLE, GMM adaptive and GQL estimating functions to estimate the regression, serial and cross-correlation parameters under the non-stationary FBINAR1M2 model, which have not yet been explored in the time series literature. This paper also compares the performance of CMLE, GMM adaptive and GQL on the basis of their statistical and computational efficiencies through a simulation study. The paper is concluded in Section 5.
2 The Non-Stationary Poisson BINAR(1) Model
Following Mamode Khan et al. (2016b), we extend the CBINAR1M2 to account for the cross-correlation between the innovation series and the cross-correlation between the current observation at time $t$ and the previous lagged observation of the other variate, such that the FBINAR1M2 model is written as:
$$Y_t^{[1]} = \rho_1 \ast Y_{t-1}^{[1]} + \alpha_{1,t-1} \ast Y_{t-1}^{[2]} + d_t^{[1]}, \tag{2.1}$$
$$Y_t^{[2]} = \rho_2 \ast Y_{t-1}^{[2]} + \alpha_{2,t-1} \ast Y_{t-1}^{[1]} + d_t^{[2]}, \tag{2.2}$$

where $0 < \alpha_{1,t-1}, \alpha_{2,t-1}, \rho_1, \rho_2 < 1$, $\rho_k$ represents the serial correlation and the $\alpha_{k,t-1}$ are the cross-correlations that need to be estimated. $Y_{t-1}^{[k]}$ and $d_t^{[k]}$ are independent. As for the thinning operation (Steutel and Van Harn, 1979), $\theta \ast Y_{t-1}^{[k]} = \sum_{j=1}^{Y_{t-1}^{[k]}} b_j(\theta)$, where $b_j(\theta)$ is a binary variable with $\operatorname{prob}[b_j(\theta)=1] = \theta$ and $\operatorname{prob}[b_j(\theta)=0] = 1-\theta$, such that

$$\theta \ast Y_{t-1}^{[k]} \mid Y_{t-1}^{[k]}, \theta \sim \text{Binomial}\left(Y_{t-1}^{[k]}, \theta\right).$$
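The binomial thinning operation can be sketched in a few lines of Python; the function name `thin` and the parameter values below are purely illustrative, not part of the model specification.

```python
import random

def thin(theta, y, rng):
    """Binomial thinning: theta * y is the sum of y independent Bernoulli(theta)
    draws, so conditionally on y it is Binomial(y, theta)."""
    return sum(1 for _ in range(y) if rng.random() < theta)

rng = random.Random(42)
# With Y_{t-1} = 50 and theta = 0.4, repeated thinnings should average
# near theta * Y_{t-1} = 20, the Binomial(50, 0.4) mean.
draws = [thin(0.4, 50, rng) for _ in range(20000)]
mean = sum(draws) / len(draws)
print(round(mean, 1))
```

One thinning per retained term is all that Equations (2.1) and (2.2) require; the innovations $d_t^{[k]}$ are then added on top of the thinned counts.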
In the above model, the $Y_t^{[k]}$ are subject to a $p \times 1$ vector of covariates $x_t = [x_{t1}, x_{t2}, \ldots, x_{tp}]$ and $\mu_t^{[k]} = \exp(x_t^{T}\beta^{[k]})$, where $\beta^{[k]}$ is the $p \times 1$ vector of regression coefficients for the $k$th series.
From Equations (2.1) and (2.2), $Y_{t-1}^{[k]} \sim \text{Poisson}(\mu_{t-1}^{[k]})$. $d_t^{[1]}$ and $d_t^{[2]}$ follow the bivariate Poisson distribution (BivPoisson) (Kocherlakota and Kocherlakota, 2001) such that

$$P\left(d_t^{[1]}=u,\, d_t^{[2]}=v\right) = e^{-\left(\lambda_t^{[1]}+\lambda_t^{[2]}-\phi_t\right)}\, \frac{\left(\lambda_t^{[1]}-\phi_t\right)^u \left(\lambda_t^{[2]}-\phi_t\right)^v}{u!\,v!} \times \sum_{i=0}^{s} \binom{u}{i}\binom{v}{i}\, \phi_t^i\, i!\, \left[\left(\lambda_t^{[1]}-\phi_t\right)\left(\lambda_t^{[2]}-\phi_t\right)\right]^{-i} \tag{2.3}$$
and following Equation (2.3), $d_t^{[k]} \sim \text{Poisson}(\lambda_t^{[k]})$, where $\lambda_t^{[1]} = \mu_t^{[1]} - \rho_1\mu_{t-1}^{[1]} - \alpha_{1,t-1}\mu_{t-1}^{[2]}$, $\lambda_t^{[2]} = \mu_t^{[2]} - \rho_2\mu_{t-1}^{[2]} - \alpha_{2,t-1}\mu_{t-1}^{[1]}$, $s = \min(u,v)$, $\lambda_t^{[1]}, \lambda_t^{[2]} > 0$, and $\phi_t \in \left[0, \min\left(\lambda_t^{[1]}, \lambda_t^{[2]}\right)\right]$.
$\phi_t$ denotes the covariance between the two error terms, such that

$$\operatorname{Cov}\left(d_t^{[1]}, d_{t'}^{[2]}\right) = \begin{cases} \phi_t & \text{if } t = t', \\ 0 & \text{if } t \neq t', \end{cases}$$

and alternatively

$$\phi_t = \rho_{12,t}\sqrt{\lambda_t^{[1]}}\sqrt{\lambda_t^{[2]}},$$

where $\rho_{12,t}$ denotes the bivariate cross-correlation between the innovation series $d_t^{[1]}$ and $d_t^{[2]}$ and $0 \leq \rho_{12,t} \leq 1$. From the above assumptions, it can be proved that
$$E\left(Y_t^{[k]}\right) = \operatorname{Var}\left(Y_t^{[k]}\right) = \mu_t^{[k]}.$$

Similarly, the lag-$h$ correlation for the $k$th series is

$$\operatorname{Corr}\left(Y_t^{[k]}, Y_{t+h}^{[k]}\right) = \rho_k^h\, \frac{\sqrt{\mu_t^{[k]}}}{\sqrt{\mu_{t+h}^{[k]}}}. \tag{2.4}$$
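The innovation pair $(d_t^{[1]}, d_t^{[2]})$ can be simulated by trivariate reduction, a standard construction consistent with the covariance $\phi_t$ above: a shared Poisson($\phi$) component is added to two independent Poisson components. The sampler below is a minimal stdlib sketch with illustrative rate values; it is not the paper's own code.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's multiplicative Poisson sampler (adequate for small rates)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p < limit:
            return k
        k += 1

def bivpoisson(lam1, lam2, phi, rng):
    """Trivariate reduction: the common Poisson(phi) term gives Cov(d1, d2) = phi."""
    z = poisson(phi, rng)
    return poisson(lam1 - phi, rng) + z, poisson(lam2 - phi, rng) + z

rng = random.Random(1)
lam1, lam2, phi = 2.0, 3.0, 0.8
pairs = [bivpoisson(lam1, lam2, phi, rng) for _ in range(40000)]
m1 = sum(u for u, _ in pairs) / len(pairs)
m2 = sum(v for _, v in pairs) / len(pairs)
cov = sum((u - m1) * (v - m2) for u, v in pairs) / len(pairs)
print(round(m1, 2), round(m2, 2), round(cov, 2))
```

The sample means land near $\lambda^{[1]} = 2$ and $\lambda^{[2]} = 3$, and the sample covariance near $\phi = 0.8$, matching the marginal Poisson property and the covariance identity stated above.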
As for the joint covariance $\sigma_{t,12} = \operatorname{Cov}\left(Y_t^{[1]}, Y_t^{[2]}\right)$, from Equations (2.1) and (2.2),

$$\begin{aligned} E\left(Y_t^{[1]}Y_t^{[2]}\right) &= \rho_1\rho_2 E\left[Y_{t-1}^{[1]}Y_{t-1}^{[2]}\right] + \alpha_{1,t-1}\alpha_{2,t-1}E\left[Y_{t-1}^{[1]}Y_{t-1}^{[2]}\right] + \alpha_{2,t-1}\rho_1\mu_{t-1}^{[1]} + \alpha_{2,t-1}\rho_1\left(\mu_{t-1}^{[1]}\right)^2 \\ &\quad + \rho_1\mu_{t-1}^{[1]}\mu_t^{[2]} - \rho_1\rho_2\mu_{t-1}^{[1]}\mu_{t-1}^{[2]} - \rho_1\alpha_{2,t-1}\left(\mu_{t-1}^{[1]}\right)^2 + \alpha_{1,t-1}\rho_2\mu_{t-1}^{[2]} \\ &\quad + \alpha_{1,t-1}\rho_2\left(\mu_{t-1}^{[2]}\right)^2 + \alpha_{1,t-1}\mu_{t-1}^{[2]}\mu_t^{[2]} - \alpha_{1,t-1}\rho_2\left(\mu_{t-1}^{[2]}\right)^2 - \alpha_{1,t-1}\alpha_{2,t-1}\mu_{t-1}^{[1]}\mu_{t-1}^{[2]} \\ &\quad + \rho_2\mu_t^{[1]}\mu_{t-1}^{[2]} - \rho_2\rho_1\mu_{t-1}^{[1]}\mu_{t-1}^{[2]} - \alpha_{1,t-1}\rho_2\left(\mu_{t-1}^{[2]}\right)^2 + \alpha_{2,t-1}\mu_{t-1}^{[1]}\mu_t^{[1]} \\ &\quad - \rho_1\alpha_{2,t-1}\left(\mu_{t-1}^{[1]}\right)^2 - \alpha_{1,t-1}\alpha_{2,t-1}\mu_{t-1}^{[1]}\mu_{t-1}^{[2]} + E\left(d_t^{[1]}d_t^{[2]}\right) \end{aligned}$$

and by simplifying $\sigma_{t,12}$, a first-order difference equation is obtained, where

$$\sigma_{t,12} = \left(\rho_1\rho_2 + \alpha_{1,t-1}\alpha_{2,t-1}\right)\operatorname{Cov}\left(Y_{t-1}^{[1]}, Y_{t-1}^{[2]}\right) + \rho_{12,t}\sqrt{\lambda_t^{[1]}}\sqrt{\lambda_t^{[2]}} + \rho_1\alpha_{2,t-1}\mu_{t-1}^{[1]} + \alpha_{1,t-1}\rho_2\mu_{t-1}^{[2]}.$$
For the other off-diagonal covariances,

$$E\left(Y_t^{[1]}Y_{t+1}^{[2]}\right) = \rho_2\operatorname{Cov}\left(Y_t^{[1]}, Y_t^{[2]}\right) + \alpha_{2,t}\operatorname{Cov}\left(Y_t^{[1]}, Y_t^{[1]}\right) + \mu_t^{[1]}\mu_{t+1}^{[2]}, \tag{2.5}$$
$$E\left(Y_t^{[1]}Y_{t+2}^{[2]}\right) = \rho_2\operatorname{Cov}\left(Y_t^{[1]}, Y_{t+1}^{[2]}\right) + \alpha_{2,t+1}\operatorname{Cov}\left(Y_t^{[1]}, Y_{t+1}^{[1]}\right) + \mu_t^{[1]}\mu_{t+2}^{[2]} \tag{2.6}$$

(see the detailed derivation of Equations (2.5) and (2.6) in Appendix A), and in general

$$E\left(Y_t^{[1]}Y_{t+h}^{[2]}\right) = \rho_2\operatorname{Cov}\left(Y_t^{[1]}, Y_{t+h-1}^{[2]}\right) + \alpha_{2,t+h-1}\operatorname{Cov}\left(Y_t^{[1]}, Y_{t+h-1}^{[1]}\right) + \mu_t^{[1]}\mu_{t+h}^{[2]}. \tag{2.7}$$
Using Equations (2.5)–(2.7), we have

$$\begin{aligned} \operatorname{Cov}\left(Y_t^{[1]}, Y_{t+h}^{[2]}\right) &= \rho_2\operatorname{Cov}\left(Y_t^{[1]}, Y_{t+h-1}^{[2]}\right) + \alpha_{2,t+h-1}\operatorname{Cov}\left(Y_t^{[1]}, Y_{t+h-1}^{[1]}\right) \\ &= \rho_2^h\Big[\left(\rho_1\rho_2 + \alpha_{1,t-1}\alpha_{2,t-1}\right)\operatorname{Cov}\left(Y_{t-1}^{[1]}, Y_{t-1}^{[2]}\right) + \rho_{12,t}\sqrt{\lambda_t^{[1]}}\sqrt{\lambda_t^{[2]}} \\ &\qquad + \rho_1\alpha_{2,t-1}\mu_{t-1}^{[1]} + \alpha_{1,t-1}\rho_2\mu_{t-1}^{[2]}\Big] + \mu_t^{[1]}\sum_{j=0}^{h-1}\rho_2^j\rho_1^{h-j-1}\alpha_{2,t+h-1-j} \end{aligned}$$

and similarly,

$$\begin{aligned} \operatorname{Cov}\left(Y_{t+h}^{[1]}, Y_t^{[2]}\right) &= \rho_1^h\Big[\left(\rho_1\rho_2 + \alpha_{1,t-1}\alpha_{2,t-1}\right)\operatorname{Cov}\left(Y_{t-1}^{[1]}, Y_{t-1}^{[2]}\right) + \rho_{12,t}\sqrt{\lambda_t^{[1]}}\sqrt{\lambda_t^{[2]}} \\ &\qquad + \rho_1\alpha_{2,t-1}\mu_{t-1}^{[1]} + \alpha_{1,t-1}\rho_2\mu_{t-1}^{[2]}\Big] + \mu_t^{[2]}\sum_{j=0}^{h-1}\rho_1^j\rho_2^{h-j-1}\alpha_{1,t+h-1-j}. \end{aligned}$$
Under the time-independent covariates assumption, $\sigma_{t,12}$ reduces to

$$\sigma_{12} = \frac{\rho_1\alpha_{2,1}\mu^{[1]} + \alpha_{1,1}\rho_2\mu^{[2]} + \rho_{12}\sqrt{\lambda^{[1]}}\sqrt{\lambda^{[2]}}}{1 - \rho_1\rho_2 - \alpha_{1,1}\alpha_{2,1}},$$

$$\operatorname{Cov}\left(Y_t^{[1]}, Y_{t+h}^{[2]}\right) = \rho_2\operatorname{Cov}\left(Y_t^{[1]}, Y_{t+h-1}^{[2]}\right) + \alpha_{2,1}\operatorname{Cov}\left(Y_t^{[1]}, Y_{t+h-1}^{[1]}\right) = \rho_2^h\sigma_{12} + \alpha_{2,1}\mu^{[1]}\sum_{j=0}^{h-1}\rho_2^j\rho_1^{h-j-1},$$

$$\operatorname{Cov}\left(Y_{t+h}^{[1]}, Y_t^{[2]}\right) = \rho_1\operatorname{Cov}\left(Y_{t+h-1}^{[1]}, Y_t^{[2]}\right) + \alpha_{1,1}\operatorname{Cov}\left(Y_{t+h-1}^{[1]}, Y_t^{[1]}\right) = \rho_1^h\sigma_{12} + \alpha_{1,1}\mu^{[2]}\sum_{j=0}^{h-1}\rho_1^j\rho_2^{h-j-1},$$

where $\alpha_{1,1}$ and $\alpha_{2,1}$ are the stationary cross-dependence terms in Equations (2.1) and (2.2), and $\mu^{[k]}$ and $\lambda^{[k]}$ denote the time-invariant means of $Y_t^{[k]}$ and $d_t^{[k]}$. Note that these stationary cross-covariance expressions have the same formulation as in Pedeli and Karlis (2013a).
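As a quick numerical illustration, the stationary cross-covariance $\sigma_{12}$ can be evaluated directly from its closed form. The parameter values below are hypothetical, chosen only so that $\lambda^{[1]}, \lambda^{[2]} > 0$ and the denominator is positive.

```python
import math

def stationary_sigma12(rho1, rho2, a1, a2, mu1, mu2, rho12):
    """Closed-form sigma_12 under time-invariant means; the stationary innovation
    means follow from mu_k = rho_k * mu_k + a_k * mu_other + lam_k."""
    lam1 = mu1 * (1 - rho1) - a1 * mu2
    lam2 = mu2 * (1 - rho2) - a2 * mu1
    phi = rho12 * math.sqrt(lam1) * math.sqrt(lam2)
    return (rho1 * a2 * mu1 + a1 * rho2 * mu2 + phi) / (1 - rho1 * rho2 - a1 * a2)

# Hypothetical parameters: rho1=0.3, rho2=0.4, alpha1=0.1, alpha2=0.2,
# mu1=2, mu2=3, rho12=0.5, which imply lam1=1.1 and lam2=1.4.
sigma12 = stationary_sigma12(0.3, 0.4, 0.1, 0.2, 2.0, 3.0, 0.5)
print(round(sigma12, 3))  # → 1.001
```

Even with modest serial and cross parameters, the shared innovation component keeps $\sigma_{12}$ well away from zero, which is the feature the non-stationary model generalizes.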
3 Estimation of Parameters
This section reviews the estimation methodologies employed in estimating the regression, serial- and cross-correlation parameters. For the CBINAR1M1 and FBINAR1M1, Pedeli and Karlis (2009, 2011, 2013a) developed the CMLE approach and the method of moments approach, but as mentioned, these two models were designed under stationary assumptions, and yet the CMLE was computationally intensive. Mamode Khan et al. (2016b) recently developed the GQL estimation technique for the non-stationary CBINAR1M2, where through simulation studies it was noted that CMLE and GQL yielded equally efficient estimates asymptotically, and far more efficient estimates than GMM. Besides, GQL is more parsimonious than CMLE since it depends only on the notion of the marginal and joint moments. However, CMLE, GMM and GQL have not yet been developed for the FBINAR1M2 model. In the subsections that follow, these estimation approaches are derived in detail and their merits and drawbacks are also highlighted.
3.1 CMLE
The conditional density for the FBINAR1M2 model is expressed as (Pedeli and Karlis, 2013a):

$$f_1(k) = \sum_{j_1=0}^{k} \binom{y_{t-1}^{[1]}}{j_1}\binom{y_{t-1}^{[2]}}{k-j_1}\, \rho_1^{j_1}(1-\rho_1)^{y_{t-1}^{[1]}-j_1}\, \alpha_{1,t-1}^{k-j_1}(1-\alpha_{1,t-1})^{y_{t-1}^{[2]}-k+j_1},$$

$$f_2(s) = \sum_{j_2=0}^{s} \binom{y_{t-1}^{[2]}}{j_2}\binom{y_{t-1}^{[1]}}{s-j_2}\, \rho_2^{j_2}(1-\rho_2)^{y_{t-1}^{[2]}-j_2}\, \alpha_{2,t-1}^{s-j_2}(1-\alpha_{2,t-1})^{y_{t-1}^{[1]}-s+j_2},$$
and a bivariate distribution of the innovation terms of the form $f_3(u,v) = P\left(d_t^{[1]}=u,\, d_t^{[2]}=v\right)$. Then, the conditional density is written as (Pedeli and Karlis, 2013a)

$$\begin{aligned} f\left(y_t^{[1]}, y_t^{[2]} \mid y_{t-1}^{[1]}, y_{t-1}^{[2]}, \theta\right) &= \sum_{k=0}^{g_1}\sum_{s=0}^{g_2} f_1(k)\, f_2(s)\, f_3\left(y_t^{[1]}-k,\, y_t^{[2]}-s\right) \\ &= \sum_{k=0}^{g_1}\sum_{s=0}^{g_2} \Big[\sum_{j_1=0}^{k}\binom{y_{t-1}^{[1]}}{j_1}\binom{y_{t-1}^{[2]}}{k-j_1}\rho_1^{j_1}(1-\rho_1)^{y_{t-1}^{[1]}-j_1}\alpha_{1,t-1}^{k-j_1}(1-\alpha_{1,t-1})^{y_{t-1}^{[2]}-k+j_1}\Big] \\ &\quad\times \Big[\sum_{j_2=0}^{s}\binom{y_{t-1}^{[2]}}{j_2}\binom{y_{t-1}^{[1]}}{s-j_2}\rho_2^{j_2}(1-\rho_2)^{y_{t-1}^{[2]}-j_2}\alpha_{2,t-1}^{s-j_2}(1-\alpha_{2,t-1})^{y_{t-1}^{[1]}-s+j_2}\Big] \\ &\quad\times e^{-\left(\mu_t^{[1]}-\rho_1\mu_{t-1}^{[1]}-\alpha_{1,t-1}\mu_{t-1}^{[2]}+\mu_t^{[2]}-\rho_2\mu_{t-1}^{[2]}-\alpha_{2,t-1}\mu_{t-1}^{[1]}-\phi_t\right)} \\ &\quad\times \sum_{m=0}^{\min(k,s)} \frac{\left(\mu_t^{[1]}-\rho_1\mu_{t-1}^{[1]}-\alpha_{1,t-1}\mu_{t-1}^{[2]}-\phi_t\right)^{y_t^{[1]}-k-m}\left(\mu_t^{[2]}-\rho_2\mu_{t-1}^{[2]}-\alpha_{2,t-1}\mu_{t-1}^{[1]}-\phi_t\right)^{y_t^{[2]}-s-m}\phi_t^m}{\left(y_t^{[1]}-k-m\right)!\left(y_t^{[2]}-s-m\right)!\,m!}, \end{aligned}$$

where $\theta = \left(\rho_k, \alpha_{k,t-1}, \phi_t, \beta^{[k]}\right)'$ is the vector of unknown parameters, $g_1 = \min\left(y_t^{[1]}, y_{t-1}^{[1]}\right)$ and $g_2 = \min\left(y_t^{[2]}, y_{t-1}^{[2]}\right)$.
The conditional likelihood function is then given by

$$L(\theta \mid y) = \prod_{t=1}^{T} f\left(y_t^{[k]} \mid y_{t-1}^{[k]}, \theta\right)$$

for some initial values of $\theta$, and by maximizing the conditional log-likelihood function

$$\log L(\theta \mid y) = \sum_{t=1}^{T} \log f\left(y_t^{[k]} \mid y_{t-1}^{[k]}, \theta\right),$$

the following expressions are obtained:
Differentiating with respect to $\rho_1$, we have

$$\frac{\partial \log L(\theta \mid y)}{\partial \rho_1} = \sum_{t=1}^{T}\left[\sum_{k=0}^{g_1}\sum_{s=0}^{g_2}\left(f_2(s)\, f_3\left(y_t^{[1]}-k, y_t^{[2]}-s\right)\frac{\partial f_1(k)}{\partial \rho_1} + f_1(k)\, f_2(s)\,\frac{\partial}{\partial \rho_1}f_3\left(y_t^{[1]}-k, y_t^{[2]}-s\right)\right)\right]\left[f\left(y_t^{[k]} \mid y_{t-1}^{[k]}, \theta\right)\right]^{-1}, \tag{3.1}$$

where

$$\frac{\partial f_1(k)}{\partial \rho_1} = \sum_{j_1=0}^{k}\binom{y_{t-1}^{[1]}}{j_1}\binom{y_{t-1}^{[2]}}{k-j_1}\alpha_{1,t-1}^{k-j_1}(1-\alpha_{1,t-1})^{y_{t-1}^{[2]}-k+j_1}\Big[j_1\rho_1^{j_1-1}(1-\rho_1)^{y_{t-1}^{[1]}-j_1} - \rho_1^{j_1}\left(y_{t-1}^{[1]}-j_1\right)(1-\rho_1)^{y_{t-1}^{[1]}-j_1-1}\Big]$$

and

$$\begin{aligned} \frac{\partial}{\partial \rho_1}f_3\left(y_t^{[1]}-k, y_t^{[2]}-s\right) &= \sum_{m=0}^{\min(k,s)} \frac{e^{-\left(\mu_t^{[1]}-\rho_1\mu_{t-1}^{[1]}-\alpha_{1,t-1}\mu_{t-1}^{[2]}+\mu_t^{[2]}-\rho_2\mu_{t-1}^{[2]}-\alpha_{2,t-1}\mu_{t-1}^{[1]}-\phi_t\right)}{\left(y_t^{[1]}-k-m\right)!\left(y_t^{[2]}-s-m\right)!\,m!} \\ &\quad\times\Big[\mu_{t-1}^{[1]}\left(\mu_t^{[1]}-\rho_1\mu_{t-1}^{[1]}-\alpha_{1,t-1}\mu_{t-1}^{[2]}-\phi_t\right)^{y_t^{[1]}-k-m}\left(\mu_t^{[2]}-\rho_2\mu_{t-1}^{[2]}-\alpha_{2,t-1}\mu_{t-1}^{[1]}-\phi_t\right)^{y_t^{[2]}-s-m}\phi_t^m \\ &\qquad - \mu_{t-1}^{[1]}\left(y_t^{[1]}-k-m\right)\left(\mu_t^{[1]}-\rho_1\mu_{t-1}^{[1]}-\alpha_{1,t-1}\mu_{t-1}^{[2]}-\phi_t\right)^{y_t^{[1]}-k-m-1}\left(\mu_t^{[2]}-\rho_2\mu_{t-1}^{[2]}-\alpha_{2,t-1}\mu_{t-1}^{[1]}-\phi_t\right)^{y_t^{[2]}-s-m}\phi_t^m\Big], \end{aligned}$$

and the double derivative with respect to $\rho_1$ is
$$\sum_{t=1}^{T}\frac{\partial^2 \log L(\theta \mid y)}{\partial \rho_1^2} = \sum_{t=1}^{T}\left[\frac{\sum_{k=0}^{g_1}\sum_{s=0}^{g_2}\left(f_2 f_3\dfrac{\partial^2 f_1}{\partial \rho_1^2} + 2 f_2\dfrac{\partial f_1}{\partial \rho_1}\dfrac{\partial f_3}{\partial \rho_1} + f_1 f_2\dfrac{\partial^2 f_3}{\partial \rho_1^2}\right)}{f\left(y_t^{[k]} \mid y_{t-1}^{[k]}, \theta\right)} - \frac{\left[\sum_{k=0}^{g_1}\sum_{s=0}^{g_2}\left(f_2 f_3\dfrac{\partial f_1}{\partial \rho_1} + f_1 f_2\dfrac{\partial f_3}{\partial \rho_1}\right)\right]^2}{\left[f\left(y_t^{[k]} \mid y_{t-1}^{[k]}, \theta\right)\right]^2}\right], \tag{3.2}$$

where

$$\begin{aligned} \frac{\partial^2 f_1(k)}{\partial \rho_1^2} &= \sum_{j_1=0}^{k}\binom{y_{t-1}^{[1]}}{j_1}\binom{y_{t-1}^{[2]}}{k-j_1}\alpha_{1,t-1}^{k-j_1}(1-\alpha_{1,t-1})^{y_{t-1}^{[2]}-k+j_1} \\ &\quad\times\Big[j_1(j_1-1)\rho_1^{j_1-2}(1-\rho_1)^{y_{t-1}^{[1]}-j_1} - 2 j_1\rho_1^{j_1-1}\left(y_{t-1}^{[1]}-j_1\right)(1-\rho_1)^{y_{t-1}^{[1]}-j_1-1} \\ &\qquad + \rho_1^{j_1}\left(y_{t-1}^{[1]}-j_1\right)\left(y_{t-1}^{[1]}-j_1-1\right)(1-\rho_1)^{y_{t-1}^{[1]}-j_1-2}\Big] \end{aligned}$$

and

$$\begin{aligned} \frac{\partial^2}{\partial \rho_1^2}f_3\left(y_t^{[1]}-k, y_t^{[2]}-s\right) &= \sum_{m=0}^{\min(k,s)} \frac{e^{-\left(\mu_t^{[1]}-\rho_1\mu_{t-1}^{[1]}-\alpha_{1,t-1}\mu_{t-1}^{[2]}+\mu_t^{[2]}-\rho_2\mu_{t-1}^{[2]}-\alpha_{2,t-1}\mu_{t-1}^{[1]}-\phi_t\right)}{\left(y_t^{[1]}-k-m\right)!\left(y_t^{[2]}-s-m\right)!\,m!}\left(\mu_{t-1}^{[1]}\right)^2 \\ &\quad\times\Big[\left(\mu_t^{[1]}-\rho_1\mu_{t-1}^{[1]}-\alpha_{1,t-1}\mu_{t-1}^{[2]}-\phi_t\right)^{y_t^{[1]}-k-m}\left(\mu_t^{[2]}-\rho_2\mu_{t-1}^{[2]}-\alpha_{2,t-1}\mu_{t-1}^{[1]}-\phi_t\right)^{y_t^{[2]}-s-m}\phi_t^m \\ &\qquad - 2\left(y_t^{[1]}-k-m\right)\left(\mu_t^{[1]}-\rho_1\mu_{t-1}^{[1]}-\alpha_{1,t-1}\mu_{t-1}^{[2]}-\phi_t\right)^{y_t^{[1]}-k-m-1}\left(\mu_t^{[2]}-\rho_2\mu_{t-1}^{[2]}-\alpha_{2,t-1}\mu_{t-1}^{[1]}-\phi_t\right)^{y_t^{[2]}-s-m}\phi_t^m \\ &\qquad + \left(y_t^{[1]}-k-m\right)\left(y_t^{[1]}-k-m-1\right)\left(\mu_t^{[1]}-\rho_1\mu_{t-1}^{[1]}-\alpha_{1,t-1}\mu_{t-1}^{[2]}-\phi_t\right)^{y_t^{[1]}-k-m-2}\left(\mu_t^{[2]}-\rho_2\mu_{t-1}^{[2]}-\alpha_{2,t-1}\mu_{t-1}^{[1]}-\phi_t\right)^{y_t^{[2]}-s-m}\phi_t^m\Big]. \end{aligned}$$
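Analytic derivatives of this length are easy to get wrong, so a numerical check is worthwhile. The sketch below verifies the $\partial f_1/\partial \rho_1$ expression against a central finite difference of $f_1$ at hypothetical argument values; the function names and the values are illustrative only.

```python
import math

def f1(k, rho1, a1, y1p, y2p):
    """f1(k): P(Bin(y1p, rho1) + Bin(y2p, a1) = k)."""
    return sum(math.comb(y1p, j) * rho1 ** j * (1 - rho1) ** (y1p - j)
               * math.comb(y2p, k - j) * a1 ** (k - j) * (1 - a1) ** (y2p - k + j)
               for j in range(max(0, k - y2p), min(k, y1p) + 1))

def df1_drho1(k, rho1, a1, y1p, y2p):
    """The analytic derivative of f1 with respect to rho1 from the score equations."""
    tot = 0.0
    for j in range(max(0, k - y2p), min(k, y1p) + 1):
        alpha_part = math.comb(y2p, k - j) * a1 ** (k - j) * (1 - a1) ** (y2p - k + j)
        inner = (j * rho1 ** (j - 1) * (1 - rho1) ** (y1p - j)
                 - rho1 ** j * (y1p - j) * (1 - rho1) ** (y1p - j - 1))
        tot += math.comb(y1p, j) * alpha_part * inner
    return tot

h, rho1, a1, y1p, y2p = 1e-6, 0.3, 0.1, 3, 2
for k in range(y1p + y2p + 1):
    fd = (f1(k, rho1 + h, a1, y1p, y2p) - f1(k, rho1 - h, a1, y1p, y2p)) / (2 * h)
    print(k, abs(fd - df1_drho1(k, rho1, a1, y1p, y2p)) < 1e-5)
```

The same finite-difference pattern carries over to each of the $f_2$ and $f_3$ derivatives below, which is a cheap safeguard before plugging them into the Newton-Raphson iterations.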
The above derivation confirms that the double derivative for CMLE is computationally intensive under non-stationary assumptions. Differentiating with respect to $\rho_2$, we have

$$\sum_{t=1}^{T}\frac{\partial \log L(\theta \mid y)}{\partial \rho_2} = \sum_{t=1}^{T}\left[f\left(y_t^{[k]} \mid y_{t-1}^{[k]}, \theta\right)\right]^{-1}\sum_{k=0}^{g_1}\sum_{s=0}^{g_2}\left[f_1(k)\, f_3\left(y_t^{[1]}-k, y_t^{[2]}-s\right)\frac{\partial f_2(s)}{\partial \rho_2} + f_1(k)\, f_2(s)\,\frac{\partial}{\partial \rho_2}f_3\left(y_t^{[1]}-k, y_t^{[2]}-s\right)\right],$$

where

$$\frac{\partial f_2(s)}{\partial \rho_2} = \sum_{j_2=0}^{s}\binom{y_{t-1}^{[2]}}{j_2}\binom{y_{t-1}^{[1]}}{s-j_2}\alpha_{2,t-1}^{s-j_2}(1-\alpha_{2,t-1})^{y_{t-1}^{[1]}-s+j_2}\Big[j_2\rho_2^{j_2-1}(1-\rho_2)^{y_{t-1}^{[2]}-j_2} - \rho_2^{j_2}\left(y_{t-1}^{[2]}-j_2\right)(1-\rho_2)^{y_{t-1}^{[2]}-j_2-1}\Big]$$

and

$$\begin{aligned} \frac{\partial}{\partial \rho_2}f_3\left(y_t^{[1]}-k, y_t^{[2]}-s\right) &= \sum_{m=0}^{\min(k,s)} \frac{e^{-\left(\mu_t^{[1]}-\rho_1\mu_{t-1}^{[1]}-\alpha_{1,t-1}\mu_{t-1}^{[2]}+\mu_t^{[2]}-\rho_2\mu_{t-1}^{[2]}-\alpha_{2,t-1}\mu_{t-1}^{[1]}-\phi_t\right)}{\left(y_t^{[1]}-k-m\right)!\left(y_t^{[2]}-s-m\right)!\,m!} \\ &\quad\times\Big[\mu_{t-1}^{[2]}\left(\mu_t^{[1]}-\rho_1\mu_{t-1}^{[1]}-\alpha_{1,t-1}\mu_{t-1}^{[2]}-\phi_t\right)^{y_t^{[1]}-k-m}\left(\mu_t^{[2]}-\rho_2\mu_{t-1}^{[2]}-\alpha_{2,t-1}\mu_{t-1}^{[1]}-\phi_t\right)^{y_t^{[2]}-s-m}\phi_t^m \\ &\qquad - \mu_{t-1}^{[2]}\left(y_t^{[2]}-s-m\right)\left(\mu_t^{[1]}-\rho_1\mu_{t-1}^{[1]}-\alpha_{1,t-1}\mu_{t-1}^{[2]}-\phi_t\right)^{y_t^{[1]}-k-m}\left(\mu_t^{[2]}-\rho_2\mu_{t-1}^{[2]}-\alpha_{2,t-1}\mu_{t-1}^{[1]}-\phi_t\right)^{y_t^{[2]}-s-m-1}\phi_t^m\Big]. \end{aligned}$$
Differentiating with respect toα1,t−1, we have
T
X
t=1
∂logL(θ|y)
∂α1,t−1 =
T
X
t=1
g1 X
k=0 g2
X
s=0
f(y[k]t |y[k]t−1, θ) −1 g1
X
k=0 g2
X
s=0
f2(s)f3(y[1]t −k, yt[2]−s)∂f1(k)
∂α1,t−1
+f1(k)f2(s) ∂
∂α1,t−1f3(yt[1]−k, y[2]t −s)
, where
∂f1(k)
∂α1,t−1 =
k
X
j1=0
y[1]t−1 j1
yt−1[2]
k−j1
ρj11(1−ρ1)y[1]t−1−j1h
(k−j1)αk−j1,t−11−1(1−α1,t−1)yt−1[2] −k+j1
−αk−j1,t−11(y[2]t−1−k+j1)(1−α1,t−1)y[2]t−1−k+j1−1i and
∂
∂α1,t−1f3(y[1]t −k, yt[2]−s)
=
min(k,s)
X
m=0
e−(µ[1]t −ρ1µ[1]t−1−α1,t−1µ[2]t−1+µ[2]t −ρ2µ[2]t−1−α2,t−1µ[1]t−1−φt) (y[1]t −k−m)!(y[2]t −s−m)!m!
×
µ[2]t−1(µ[1]t −ρ1µ[1]t−1−α1,t−1µ[2]t−1−φt)yt[1]−k−m
×(µ[2]t −ρ2µ[2]t−1−α2,t−1µ[1]t−1−φt)y[2]t −s−mφmt
−µ[2]t−1(yt[1]−k−m)(µ[1]t −ρ1µ[1]t−1−α1,t−1µ[2]t−1−φt)yt[1]−k−m−1
×(µ[2]t −ρ2µ[2]t−1−α2,t−1µ[1]t−1−φt)y[2]t −s−mφmt
. Differentiating with respect toα2,t−1, we have
T
X
t=1
∂logL(θ|y)
∂α2,t−1 =
T
X
t=1
g1
X
k=0 g2
X
s=0
f(yt[k]|yt−1[k] , θ) −1 g1
X
k=0 g2
X
s=0
f1(k)f3(yt[1]−k, y[2]t −s)∂f2(s)
∂α2,t−1
+f1(k)f2(s) ∂
∂α2,t−1f3(yt[1]−k, y[2]t −s)
,
where
∂f2(s)
∂α2,t−1
=
k
X
j2=0
y[2]t−1 j2
y[1]t−1 s−j2
ρj22(1−ρ2)y[2]t−1−j2h
(s−j2)αs−j2,t−12−1(1−α2,t−1)y[1]t−1−s+j2
−α2,t−1s−j2 (y[1]t−1−s+j2)(1−α2,t−1)y[1]t−1−s+j2−1i
and
∂
∂α2,t−1f3(yt[1]−k, y[2]t −s)
=
min(k,s)
X
m=0
e−(µ[1]t −ρ1µ[1]t−1−α1,t−1µ[2]t−1+µ[2]t −ρ2µ[2]t−1−α2,t−1µ[1]t−1−φt) (yt[1]−k−m)!(yt[2]−s−m)!m!
×h
µ[1]t−1(µ[1]t −ρ1µ[1]t−1−α1,t−1µ[2]t−1−φt)yt[1]−k−m
×(µ[2]t −ρ2µ[2]t−1−α2,t−1µ[1]t−1−φt)y[2]t −s−mφmt
−µ[1]t−1(yt[2]−s−m)(µ[1]t −ρ1µ[1]t−1−α1,t−1µ[2]t−1−φt)y[1]t −k−m
×(µ[2]t −ρ2µ[2]t−1−α2,t−1µ[1]t−1−φt)y[2]t −s−m−1φmt i .
Differentiating with respect toφt, we have
T
X
t=1
∂logL(θ|y)
∂φt =
T
X
t=1
Pg1
k=o
Pg2
s=0f1(k)f2(s)∂φ∂
tf3(y[1]t −k, yt[2]−s) Pg1
k=0
Pg2
s=0f(y[k]t |yt−1[k] , θ)
,
where
∂
∂φt
f3(y[1]t −k, yt[2]−s) =
min(k,s)
X
m=0
e−(µ[1]t −ρ1µ[1]t−1−α1,t−1µ[2]t−1+µ[2]t −ρ2µ[2]t−1−α2,t−1µ[1]t−1−φt) (y[1]t −k−m)!(y[2]t −s−m)!m!
×h
(µ[1]t −ρ1µ[1]t−1−α1,t−1µ[2]t−1−φt)y[1]t −k−m
×(µ[2]t −ρ2µ[2]t−1−α2,t−1µ[1]t−1−φt)y[2]t −s−mφmt +m(µ[1]t −ρ1µ[1]t−1−α1,t−1µ[2]t−1−φt)y[1]t −k−m
×(µ[2]t −ρ2µ[2]t−1−α2,t−1µ[1]t−1−φt)y[2]t −s−mφm−1t
−(y[1]t −k−m)(µ[1]t −ρ1µ[1]t−1−α1,t−1µ[2]t−1−φt)y[1]t −k−m−1
×(µ[2]t −ρ2µ[2]t−1−α2,t−1µ[1]t−1−φt)y[2]t −s−mφmt
−(y[2]t −s−m)(µ[1]t −ρ1µ[1]t−1−α1,t−1µ[2]t−1−φt)y[1]t −k−m
×(µ[2]t −ρ2µ[2]t−1−α2,t−1µ[1]t−1−φt)y[2]t −s−m−1φmt i .
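The four-term derivative of $f_3$ with respect to $\phi_t$ admits the same finite-difference sanity check. Below, $f_3$ is written in its equivalent three-Poisson convolution form with the rates $\lambda^{[1]}, \lambda^{[2]}$ substituted directly for the $\mu$-expressions, and the analytic bracket is checked numerically at hypothetical values of $u$, $v$, $\lambda$ and $\phi$.

```python
import math

def f3(u, v, lam1, lam2, phi):
    """Bivariate Poisson pmf as a three-Poisson convolution."""
    c, d = lam1 - phi, lam2 - phi
    return math.exp(-(lam1 + lam2 - phi)) * sum(
        c ** (u - m) * d ** (v - m) * phi ** m
        / (math.factorial(u - m) * math.factorial(v - m) * math.factorial(m))
        for m in range(min(u, v) + 1))

def df3_dphi(u, v, lam1, lam2, phi):
    """The four-term analytic derivative of f3 with respect to phi."""
    c, d = lam1 - phi, lam2 - phi
    tot = 0.0
    for m in range(min(u, v) + 1):
        w = math.exp(-(lam1 + lam2 - phi)) / (
            math.factorial(u - m) * math.factorial(v - m) * math.factorial(m))
        tot += w * (c ** (u - m) * d ** (v - m) * phi ** m
                    + m * c ** (u - m) * d ** (v - m) * phi ** (m - 1)
                    - (u - m) * c ** (u - m - 1) * d ** (v - m) * phi ** m
                    - (v - m) * c ** (u - m) * d ** (v - m - 1) * phi ** m)
    return tot

u, v, lam1, lam2, phi, h = 2, 3, 1.1, 1.4, 0.6, 1e-6
fd = (f3(u, v, lam1, lam2, phi + h) - f3(u, v, lam1, lam2, phi - h)) / (2 * h)
print(abs(fd - df3_dphi(u, v, lam1, lam2, phi)) < 1e-6)  # → True
```

Checks like this one, repeated over the score components, make it much easier to trust the Newton-Raphson updates that the CMLE, GMM adaptive and GQL routines rely on.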