5.4 Joint Distribution Modelling of Vehicle Dynamic Parameters Using Copula
5.4.1 Concept of Copula
A copula is a joint distribution function with standard uniform univariate margins U[0,1] that links a stochastic multivariate relationship with pre-specified univariate marginal distribution functions. The copula theory can be employed to construct 2-dimensional probability distributions for any marginal distributions. Given two continuous random variables (π1, π2) with cumulative distribution functions (πΉ1(π₯1), πΉ2(π₯2)), a joint bivariate cumulative distribution function (CDF) πΉ(π₯1, π₯2) can be generated using the following expression:
πΉ(π₯1, π₯2) = πΆπ(π’1 = πΉ1(π₯1), π’2 = πΉ2(π₯2))β¦β¦β¦5-3 The function πΆ (copula) is a CDF joining marginal distributions πΉ1(π₯1), πΉ2(π₯2). In case of marginal distributions of continuous variables, by differentiating πΉ(π₯1, π₯2), the 2-dimensional copula density function can be obtained as:
π(π₯1, π₯2) = ππ[πΉ1(π₯1), πΉ2(π₯2)]π1(π₯1)π2(π₯2)β¦β¦β¦..β¦β¦β¦5-4 where ππ[πΉ1(π₯1), πΉ2(π₯2)] is the density of the copula and is obtained by computing the second partial derivative of the copula on the two variables:
ππ[πΉ1(π₯1), πΉ2(π₯2)] = π2πΆ
ππ’1ππ’2 and π’1 = πΉ1(π₯1), π’2 = πΉ1(π₯2)β¦β¦β¦.β¦..5-5 Because the dependence structures modelled through the copulas are separated from the univariate marginal distributions, an appropriate copula function is the one which suitably elucidates the dependence features of the data. Several techniques have been used in the literature to evaluate the degree of correlation between the random variables. There are two commonly used non-parametric methods of detecting correlation that are based on the concept of concordant and discordant pairs of observations are, Kendallβs Ο (Kendall, 1938) and Spearmanβs Οs (Spearman, 1904). Kendallβs Ο measure of dependence between the random variables (X1, X2) is defined as the probability of concordance minus the probability of discordance and is expressed as:
π(π1, π2) = π ((π1β πΜ)(π1 1β πΜ) > 0) β π ((π2 1β πΜ)(π1 1β πΜ) < 0)β¦β¦β¦.β¦.5-6 2 Where, (πΜ, π1 Μ) is an independent copy of (π2 1, π2). The second non-parametric measure of
dependence i.e. Spearmanβs rank correlation coefficient Οs is three times the difference between the probability of concordance and the probability of discordance:
ππ(π1, π2) = 3 (β ((π1β πΜ1)(π2β πΜ2) > 0) β β ((π1β πΜ1)(π2β πΜ2) < 0))β¦β¦β¦β¦5-7 where (π1, π2), (πΜ1, πΜ2) and (πΜ1, πΜ2) are the three independent identically distributed random vectors each with a common joint distribution function πΉ(. , . ) and margins πΉ1 and πΉ2. More specifically, let π΄1 and π΄2 denote the number of concordant and discordant pairs for π observations; and π (πΜ1)and π (πΜ2) denote the ranks of a pair of variables (π1, π2), Kendallβs π and Spearmanβs ππ can then be defined as
π = π΄1 β π΄2
0.5π(πβ1)β¦β¦β¦..β¦β¦β¦.5-8
ππ = β π (πΜ1π)π (πΜ2π)βπ(
π+1 2 )2 ππ=1
β(βππ=1π (πΜ1π)2βπ(π+12 )2)(βππ=1π (πΜ2π)2βπ(π+12 )2)
β¦β¦β¦..5-9
Besides Kendallβs π and Spearmanβs ππ measure of dependence, the traditional Pearsonβs product-moment correlation coefficient π that is a measure of linear dependence between π1 and π2 poses several limitations. Unlike the rank-based dependence measures, the Pearsonβs correlation coefficient is a parametric technique that largely depends upon the marginal distributions of the population. It performs better than rank-based measures only for bivariate elliptical distributions (including bivariate normal distribution). Therefore, considering the advantages of rank-based dependence measures that both are not dependent on the margins, Kendallβs π and Spearmanβs ππ are commonly used to characterize the dependence structures in the copula literature, rather than the traditional Pearsonβs correlation coefficient.
A rich set of copula types have been generated using the method of inversion, geometric methods and algebraic methods (Nelsen, 2006) and each of the copula is characterised by a different dependence structure π on the data. Also it is to be noted that Kendallβs π and Spearmanβs ππ are functions of the respective copula and the dependence parameter, and are not dependent on the marginal distributions. A brief overview of the copula functions, that includes the Gaussian copula, the Farlie-Gumbel-Morgenstern (FGM) copula and various Archimedean copulas like the Gumbel-Hougaard copula, the Frank copula, the Clayton copula, the Ali-Mikhail-Haq (AHM) copula and the Joe copula, and dependence structures of each copula type is presented in Table 5-3.
Table 5-3 Characteristics and association measures of bivariate copulas
Copula type Function πΆπ(π’, π£) π domain Kendallβs π and domain Spearmanβs ππ and domain
Gaussian β« β« 1
2πβ1 β π2
πβ1(π£)
ββ
πβ1(π’)
ββ
exp (βπ 2β 2ππ π‘ + π‘2
2(1 β π2) )ππ ππ‘ π π [β1,1]
2
πsinβ1π π π [β1,1]
6
πsinβ1(π 2) ππ π [β1,1]
F-G-M π’1π’2[1 + π(1 β π’1)(1 β π’2)] π π [β1,1]
2 9π π π [β2
9,2 9]
1 3π ππ π [β1
3,1 3] Gumbel
exp (β[(β ln π’1)π+ (β ln π’2)π]1βπ) π π [β1, β)
1 β1 π π π (0,1)
No closed form
Frank
β1
πln (1 +(exp(βππ’1) β 1)(exp(βππ’2) β 1)
exp(βπ) β 1 ) π π (ββ, β)
1 β4
π[1 β π·πΉ(π)]
π π (β1,1)
1 β12
π (π·1|π) β π·2(π) ππ π [β1,1]
Clayton
(π’1βπ+ π’2βπ β 1)β1 πβ π π (0, β) π π + 2 π π (0,1)
Complicated expression ππ π (β1,1)
A-M-H π’1π’2
1 β π(1 β π’1)(1 β π’2)
π π [β1,1] 3π β 2
3π β2(1 β π)2
3π2 ln(1 β π) π π (β0.182, 0.333)
[[12(1 + π)ππ{ln}(1
β π)
β 24(1 β π) ln(1
β π)]/π2]
β3(π + 12)
ππ π [β0.271, 0.478] π Joe 1 β [(1 β π’1)π+ (1 β π’2)π β (1 β π’1)π(1 β π’2)π]1βπ π π [1, β) 1 +4
ππ·π½(π) π π [0,1)
No simple form ππ π [0,1) π·πΉ(π) = π
ππβ« π‘π
exp(π‘)β1ππ‘, π = 1, 2
π
0 ; ππ{ln}(π₯) = β«1π₯1βπ‘ln π‘ππ‘
In this study, the selection process of copula functions for bivariate and trivariate probability analysis was performed intuitively (Chowdhary et al., 2011). This process resulted from designated acceptable dependence domains and nature of association of the data under consideration. A set of copula functions have also been employed to build the 2-D probability distribution by assessing suitable goodness-of-fit measures whereas for trivariate analysis, the elliptical Gaussian copula has been considered in this study. The trivariate Gaussian copula can be defined as
πΆπ¬(π’, π£, π€) = ππ¬[πβ1(π’), πβ1(π£), πβ1(π€)]β¦β¦β¦5-10 in which
ππ¬[πβ1(π’), πβ1(π£), πβ1(π€)] = β« β« β« 2π3 2β1|π¬|1/2 πβ1(π€)
ββ
πβ1(π£)
ββ ππ₯π (β1
2πππ¬β1π) ππ
πβ1(π’)
ββ
β¦β¦β¦. β¦β¦..5-11
where πΈ = [
1 π12 π13 π12 1 π23 π13 π23 1
] is the symmetrical correlation matrix with πππ π [β1,1]; π° = (w1, w2, w3)π represents the corresponding integral variables.
The primary argument for the choice of Gaussian copula stems from its capability to model all ranges of dependence (β1,1) for multidimensional case. It has a property that as the number of dimensions (n) in a Gaussian copula increases, the number of parameters in the multivariate density function also increases in the order of π2. With a view to model the general positive and negative dependence structures among the three microscopic traffic variables, the trivariate Gaussian copula can therefore be a convenient choice and is hence, employed in this study.
The most difficult problem in implementing copula theory is the estimation of unknown parameters of the copula. Some commonly used approaches that are mostly employed for estimating copula parameters are Inference from Margins (IFM) approach (Joe, 1997), maximum pseudo-likelihood method and approach based on the rank correlation (Genest &
Rivest, 1993). Another straightforward approach for estimating the dependence parameter resorts to the approach based on rank correlation. This approach utilises the relationship between the dependence parameter π and the two rank-based estimators Kendallβs π and Spearmanβs ππ. Kojadinivic & Yan (2010) proved that Kendallβs π performs significantly better than Spearmanβs ππ; also the expression of Kendallβs π is explicit for all copula families. Due to these advantages, inversion of rank correlation coefficient Kendallβs π method is widely used for estimating the dependence parameter π. This method is very simple and the marginal
distributions need not be modelled to obtain an estimate of π. In the current paper, parameters of bivariate Gaussian, FGM and Archimedean copulas are computed using the approach based on Kendallβs π while for the trivariate case, maximum pseudo-likelihood estimator is used.