Models for spatial correlations

Chapter 6 Spatial distribution of malaria problem in three regions of

6.2 Models for spatial correlations

Non-Gaussian spatial problems may be formally analyzed in the context of generalized linear mixed models (GLMMs). This requires specifying the likelihood of the random variable $Y(s)$, where $s$ denotes the location at which the observation is made. As in classical generalized linear models (GLMs), there is a canonical parameter corresponding to the distribution, which is normally a function of the location parameter via the link function $g(\cdot)$ for the distribution.

This function is assumed to be linear in the explanatory variables. In the classical formulation of GLMs containing only fixed effects, $g(\mu) = X\beta$, where $X$ is the matrix of explanatory variables and $\beta$ the vector of regression coefficients (Berridge and Crouchley, 2011, Zuur et al., 2009, Zuur et al., 2007, Fox, 2008, Madsen and Thyregod, 2010). To incorporate a spatial process, we assume that, given $\alpha$, the observations $Y(s)$ are conditionally independent across locations $s$, with conditional mean $E[Y(s) \mid \alpha] = \mu(s)$. The parameter $\alpha$ is used to define the distribution of $Y(s)$. Then, the spatially correlated random effect is incorporated into the linear predictor as

$$g(\mu) = X\beta + Z\alpha \tag{6.1}$$

where $X$ and $Z$ are the design matrices. The error term $\varepsilon$ accommodates over-dispersion relative to the mean-variance relationship implied by the distribution under consideration. For the random effect at location $s$, $\alpha \sim N(0, \Sigma_\alpha(\theta))$ and $\varepsilon \sim N(0, \sigma^2 I)$, with the spatial correlation parameterized by $\theta$ in $\Sigma_\alpha(\theta)$ (Schabenberger and Gotway, 2005). Note that $s$ is just one location, and $\mathbf{s} = (s_1, \ldots, s_n)^T$ denotes a vector of $n$ locations with variance-covariance matrix $\Sigma$.
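To make the structure of (6.1) concrete, the following is a minimal simulation sketch and not the fitting procedure used in this chapter. It assumes a binary outcome with a logit link, a single covariate, $Z$ equal to the identity (one random effect per location), and an exponential form $\sigma^2 \exp(-\text{distance}/\phi)$ for $\Sigma_\alpha(\theta)$. All function names and parameter values are hypothetical.

```python
import math
import random

def exp_cov(locs, sigma2, phi):
    """Assumed covariance: sigma2 * exp(-distance / phi) between locations."""
    n = len(locs)
    return [[sigma2 * math.exp(-math.dist(locs[i], locs[j]) / phi)
             for j in range(n)] for i in range(n)]

def cholesky(a):
    """Lower-triangular factor low with low * low^T = a (a positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def simulate_spatial_glmm(locs, x, beta0, beta1, sigma2, phi, rng):
    """Draw alpha ~ N(0, Sigma), then Y | alpha ~ Bernoulli(mu), logit(mu) = X beta + alpha."""
    n = len(locs)
    low = cholesky(exp_cov(locs, sigma2, phi))
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Correlated random effects: alpha = low * z
    alpha = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    eta = [beta0 + beta1 * x[i] + alpha[i] for i in range(n)]
    mu = [1.0 / (1.0 + math.exp(-e)) for e in eta]   # inverse logit link
    y = [1 if rng.random() < m else 0 for m in mu]   # conditionally independent draws
    return alpha, mu, y

rng = random.Random(1)
locs = [(rng.random(), rng.random()) for _ in range(30)]
x = [rng.gauss(0.0, 1.0) for _ in range(30)]
alpha, mu, y = simulate_spatial_glmm(locs, x, -0.5, 0.8, 1.0, 0.3, rng)
```

Given $\alpha$, the responses are drawn independently, which mirrors the conditional independence assumption stated above. All spatial dependence enters through the covariance of $\alpha$.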

Spatial dependence may be represented by a range of functions (Hengl, 2007).

To describe the spatial correlation of observations, three major functions are used in geostatistics: the correlogram, the covariance, and the semivariogram. The semivariogram is often called simply the variogram. In geostatistics, the variogram is a key function and is used to fit a model for the spatial correlation in the data. The model obtained from the variogram is then used in kriging, an estimation procedure designed to minimize the variance of the prediction error (Goovaerts, 1997). Variogram models are also used to identify the maximum distance of spatial autocorrelation, which can in turn be used to construct search parameters for different interpolation techniques. A variogram represents both structural and random aspects of the data under consideration. The structural part of the variogram model is represented by the range: the variogram values increase with the separation distance until they reach a maximum at a distance known as the “range”. To develop the variogram, assume that the mean $\mu(s)$ is constant, and define

$$\operatorname{Var}[Y(s_1) - Y(s_2)] = 2\gamma(s_1 - s_2). \tag{6.2}$$

In (6.2), the variance of the difference between $Y(s_1)$ and $Y(s_2)$ depends on the locations only through their difference $s_1 - s_2$. A process which satisfies this property is called intrinsically stationary.

The function $2\gamma(\cdot)$ is called the variogram and $\gamma(\cdot)$ the semivariogram.
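As a rough illustration of how the semivariogram in (6.2) can be estimated from data, the sketch below computes a binned method-of-moments estimate: half the average squared difference between pairs of observations whose separation distance falls in each bin. The function name, the binning convention (bin $k$ covers distances in $[k w, (k+1) w)$ for bin width $w$) and the toy data are assumptions made for illustration, not the procedure used in this study.

```python
import math

def empirical_semivariogram(locs, values, bin_width, n_bins):
    """Binned estimate: sum of squared differences / (2 * number of pairs) per bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    n = len(locs)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(locs[i], locs[j])
            k = int(d // bin_width)          # index of the distance bin
            if k < n_bins:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    # Bins without any pair are reported as None
    return [s / (2 * c) if c > 0 else None for s, c in zip(sums, counts)]

# Toy data: four points on a line, one unit apart
toy_locs = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
toy_vals = [1.0, 2.0, 4.0, 7.0]
gamma_hat = empirical_semivariogram(toy_locs, toy_vals, bin_width=1.0, n_bins=3)
```

The factor of 2 in the denominator is what turns the variogram estimate into a semivariogram estimate.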

The other concept here is isotropy. Suppose the process is intrinsically stationary with semivariogram $\gamma(h)$, $h \in \mathbb{R}^d$. If $\gamma(h) = \gamma_0(\|h\|)$ for some function $\gamma_0$, i.e. if the semivariogram depends on its vector argument $h$ only through its length $\|h\|$, then the process is isotropic. A process which is both intrinsically stationary and isotropic is also called homogeneous. Isotropic processes are convenient to deal with because there are a number of widely used parametric forms for the isotropic semivariogram. Writing $\gamma(t)$ for the semivariance at lag distance $t$, with nugget variance $c_0 \ge 0$, structural variance $c_1 \ge c_0$ and range parameter $R$, some examples are:

1. Linear:

$$\gamma(t) = \begin{cases} 0 & \text{if } t = 0, \\ c_0 + c_1 t & \text{if } t > 0. \end{cases}$$

Here $c_0$ and $c_1$ are positive constants. The function tends to $\infty$ as $t \to \infty$ and so does not correspond to a stationary process.

2. Spherical:

$$\gamma(t) = \begin{cases} 0 & \text{if } t = 0, \\ c_0 + c_1 \left\{ \dfrac{3}{2} \dfrac{t}{R} - \dfrac{1}{2} \left( \dfrac{t}{R} \right)^3 \right\} & \text{if } 0 < t \le R, \\ c_0 + c_1 & \text{if } t \ge R. \end{cases}$$

This is valid if $d = 1$, 2 or 3, but for higher dimensions it fails the conditional non-positive-definiteness condition. It is a convenient form because it increases from a positive value $c_0$ when $t$ is small, levelling off at the constant $c_0 + c_1$ at $t = R$. This is of the "nugget/range/sill" form, which is often considered a realistic and interpretable form for a semivariogram.

3. Exponential:

$$\gamma(t) = \begin{cases} 0 & \text{if } t = 0, \\ c_0 + c_1 \left( 1 - e^{-t/R} \right) & \text{if } t > 0. \end{cases}$$

This is simpler in functional form than the spherical case (and valid for all $d$) but without the finite range of the spherical form. The parameter $R$ nevertheless has a similar interpretation to that in the spherical model, fixing the scale of variability.

4. Gaussian:

$$\gamma(t) = \begin{cases} 0 & \text{if } t = 0, \\ c_0 + c_1 \left( 1 - e^{-t^2/R^2} \right) & \text{if } t > 0. \end{cases}$$

5. Exponential-power form:

$$\gamma(t) = \begin{cases} 0 & \text{if } t = 0, \\ c_0 + c_1 \left( 1 - e^{-|t/R|^p} \right) & \text{if } t > 0. \end{cases}$$

Here $0 < p \le 2$. This form generalizes both the exponential and Gaussian forms, and forms the basis for the families of spatial covariance functions introduced by Sacks et al. (1989). However, in generalizing the results from one dimension to higher dimensions, these authors used a product form of covariance function in preference to constructions based on isotropic processes (Gaetan and Guyon, 2010).

6. Rational quadratic:

$$\gamma(t) = \begin{cases} 0 & \text{if } t = 0, \\ c_0 + \dfrac{c_1 t^2}{1 + t^2/R} & \text{if } t > 0. \end{cases}$$

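7. Wave:

$$\gamma(t) = \begin{cases} 0 & \text{if } t = 0, \\ c_0 + c_1 \left\{ 1 - \dfrac{R}{t} \sin\left( \dfrac{t}{R} \right) \right\} & \text{if } t > 0. \end{cases}$$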

8. Power law:

$$\gamma(t) = \begin{cases} 0 & \text{if } t = 0, \\ c_0 + c_1 t^{\lambda} & \text{if } t > 0. \end{cases}$$

Conditional non-positive-definiteness requires $0 \le \lambda < 2$. This generalizes the linear case, and it is an example of a semivariogram that does not correspond to a stationary process (Gaetan and Guyon, 2010).

9. The Matérn class: This class, given by Matérn (1960), was long neglected in favour of simpler analytic forms. Handcock and Stein (1993) and Handcock and Wallis (1994) demonstrated its flexibility in handling a variety of spatial data sets. The class is best defined in terms of the isotropic covariance function:

$$C_0(t) = \frac{1}{2^{\theta_2 - 1} \Gamma(\theta_2)} \left( \frac{2 \sqrt{\theta_2}\, t}{\theta_1} \right)^{\theta_2} K_{\theta_2}\!\left( \frac{2 \sqrt{\theta_2}\, t}{\theta_1} \right),$$

where $\theta_1 > 0$ is the spatial scale parameter, $\theta_2 > 0$ is a shape parameter, $\Gamma(\cdot)$ is the gamma function and $K_{\theta_2}$ is the modified Bessel function of order $\theta_2$.
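The parametric forms above translate directly into code. The sketch below implements three of them (spherical, exponential and Gaussian) using only the standard library. The argument names `c0`, `c1` and `R` (nugget variance, structural variance and range parameter) and the numerical values are illustrative assumptions. The Matérn class is omitted because it needs a modified Bessel function, which the standard library does not provide.

```python
import math

def spherical(t, c0, c1, R):
    """Spherical model: reaches the sill c0 + c1 exactly at t = R."""
    if t == 0:
        return 0.0
    if t <= R:
        r = t / R
        return c0 + c1 * (1.5 * r - 0.5 * r ** 3)
    return c0 + c1

def exponential(t, c0, c1, R):
    """Exponential model: approaches the sill c0 + c1 only asymptotically."""
    return 0.0 if t == 0 else c0 + c1 * (1.0 - math.exp(-t / R))

def gaussian(t, c0, c1, R):
    """Gaussian model: flatter near the origin than the exponential model."""
    return 0.0 if t == 0 else c0 + c1 * (1.0 - math.exp(-(t / R) ** 2))

# Evaluate each model on a small grid of lag distances (hypothetical parameters)
lags = [0.0, 1.0, 2.5, 5.0, 10.0]
table = {
    name: [f(t, 0.2, 1.0, 5.0) for t in lags]
    for name, f in [("spherical", spherical),
                    ("exponential", exponential),
                    ("gaussian", gaussian)]
}
```

All three return 0 at $t = 0$ and jump to at least $c_0$ for any $t > 0$, which is the nugget discontinuity at the origin.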

For most variograms, $\gamma(0) = 0$, but $\gamma$ increases from a non-negative value near $t = 0$ (the nugget) to a limiting value (the sill), which is either attained at a finite value $t = R$ (the range) or approached asymptotically as $t \to \infty$. The semivariograms have the shape presented in Figure 6.1 (Clay and Shanahan, 2011).

Figure 6.1: Idealized form of variogram function, illustrating the nugget, sill and range

A positive nugget may seem paradoxical because it implies a discontinuity in the covariance function at the origin. Nevertheless, it is a well-known feature of spatial data, and various explanations have been proposed. The simplest is that there is some residual white noise over and above any smooth spatial variation (Waller and Gotway, 2004).

To deal with anisotropic processes, there are a number of direct generalizations. The simplest of these is geometric anisotropy. A semivariogram with geometric anisotropy has the form

$$\gamma(h) = \gamma_0(\|Ah\|),$$

where $\gamma_0$ is an isotropic semivariogram and $A$ is a $d \times d$ matrix representing a linear transformation of $\mathbb{R}^d$. If $A$ is the identity, this reduces to the isotropic case; otherwise, the process is isotropic in some linearly transformed space. Furthermore, for a positive definite matrix $A$, the contours of equal covariance are ellipses instead of circles. To generalize the anisotropy, let $Y_1, \ldots, Y_p$ be independent intrinsically stationary processes. Then

$$Y = Y_1 + \cdots + Y_p$$

is also intrinsically stationary, with semivariogram given by

$$\gamma(h) = \gamma_1(h) + \cdots + \gamma_p(h),$$

with $\gamma_1, \ldots, \gamma_p$ denoting the semivariograms of $Y_1, \ldots, Y_p$ respectively. Thus

$$\gamma(h) = \sum_{i=1}^{p} \gamma_0(\|A_i h\|),$$

where $\gamma_0$ is an isotropic semivariogram and $A_1, \ldots, A_p$ are matrices, is a valid semivariogram generalizing geometric anisotropy, which is called zonal anisotropy (Gaetan and Guyon, 2010).
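The two anisotropic constructions can be sketched in a few lines. The example below assumes an exponential isotropic semivariogram and a two-dimensional lag vector. The function names and the matrix values are hypothetical and chosen only for illustration.

```python
import math

def exp_semivariogram(t, c0, c1, R):
    """Isotropic exponential semivariogram used as the base model (an assumption)."""
    return 0.0 if t == 0 else c0 + c1 * (1.0 - math.exp(-t / R))

def transformed_length(A, h):
    """Euclidean length of the linearly transformed lag vector A h."""
    return math.sqrt(sum(sum(a * x for a, x in zip(row, h)) ** 2 for row in A))

def geometric_anisotropy(h, A, gamma0):
    """Semivariogram gamma0(||A h||) for an isotropic gamma0."""
    return gamma0(transformed_length(A, h))

def zonal_anisotropy(h, matrices, gamma0):
    """Sum of gamma0(||A_i h||) over the supplied matrices A_i."""
    return sum(gamma0(transformed_length(A, h)) for A in matrices)

def gamma0(t):
    return exp_semivariogram(t, 0.1, 1.0, 2.0)

identity = [[1.0, 0.0], [0.0, 1.0]]
stretch_x = [[2.0, 0.0], [0.0, 1.0]]   # lags along x count double

east = geometric_anisotropy((1.0, 0.0), stretch_x, gamma0)
north = geometric_anisotropy((0.0, 1.0), stretch_x, gamma0)
zonal = zonal_anisotropy((1.0, 0.0), [identity, stretch_x], gamma0)
```

With `stretch_x`, a unit lag in the x direction gives the same semivariance as an isotropic lag of length 2, so dissimilarity grows faster east-west than north-south.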

Moreover, for some nonlinear function $g(s)$, the process $Y(g(s))$, rather than $Y(s)$, may be a stationary isotropic process. In this way, non-stationary as well as non-isotropic cases can be handled (Sampson and Guttorp, 1992). A spatial covariance or semivariogram function cannot be defined arbitrarily: it has to satisfy positive definiteness. In general, write $\operatorname{Cov}[Y(s_1), Y(s_2)] = C(s_1, s_2)$, without assuming any form of stationarity.

Positive definiteness requires that

$$\sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j C(s_i, s_j) \ge 0.$$

This relation must hold for any finite set of points $s_1, \ldots, s_n$ and arbitrary real coefficients $a_1, \ldots, a_n$, because the left-hand side is the variance of $\sum_i a_i Y(s_i)$. For a $d$-dimensional stationary process, Bochner’s theorem implies that

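$$C(h) = \int \cdots \int \cos(\omega^T h)\, G(d\omega),$$

where the integral is over $\mathbb{R}^d$ and $G$ is a positive bounded spectral measure (Cliff and Ord, 1981). If

$$\int \cdots \int |C(h)|\, dh < \infty,$$

then $G$ is automatically differentiable, so that $G(d\omega) = g(\omega)\, d\omega$ (here $g$ denotes the spectral density, not the link function), and positive definiteness requires $g(\omega) \ge 0$ for all $\omega$. If, in addition, the process is isotropic, so that $C(h) = C_0(\|h\|)$ for some function $C_0$ of a univariate argument, then the spectral representation simplifies to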

$$C_0(t) = \int_{[0, \infty)} \Omega_d(tu)\, \Phi(du),$$

where $\Phi$ is non-decreasing on $[0, \infty)$ with $\int \Phi(du) < \infty$ and

$$\Omega_d(t) = \left( \frac{2}{t} \right)^{(d-2)/2} \Gamma\!\left( \frac{d}{2} \right) J_{(d-2)/2}(t),$$

and $J_\nu(\cdot)$ denotes the Bessel function of the first kind of order $\nu$ (Schabenberger and Gotway, 2005). Moreover, there is a corresponding theory for the variogram. For a second-order stationary process with semivariogram $\gamma(\cdot)$, if $a_1, \ldots, a_n$ are constants with $\sum_i a_i = 0$, then

$$\sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j \gamma(s_i - s_j) \le 0.$$

This is the conditional non-positive definiteness condition (Cressie, 1993).
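The condition can be checked numerically for a given model. The sketch below is an illustration under assumed settings rather than a proof: it draws random locations and random coefficients centred to sum to zero, then evaluates the double sum for an exponential semivariogram with a nugget. The parameter values and the tolerance are arbitrary choices.

```python
import math
import random

def exp_semivariogram(t, c0=0.2, c1=1.0, R=0.5):
    """Exponential semivariogram with nugget c0 (hypothetical parameter values)."""
    return 0.0 if t == 0 else c0 + c1 * (1.0 - math.exp(-t / R))

def quadratic_form(locs, a, gamma):
    """Double sum of a_i * a_j * gamma(||s_i - s_j||) over all pairs (i, j)."""
    n = len(locs)
    return sum(a[i] * a[j] * gamma(math.dist(locs[i], locs[j]))
               for i in range(n) for j in range(n))

def centred_coefficients(n, rng):
    """Random real coefficients shifted so that they sum to zero."""
    raw = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(raw) / n
    return [r - mean for r in raw]

rng = random.Random(42)
results = []
for _ in range(200):
    locs = [(rng.random(), rng.random()) for _ in range(8)]
    a = centred_coefficients(8, rng)
    results.append(quadratic_form(locs, a, exp_semivariogram))

worst = max(results)   # largest value of the double sum over all trials
```

If the zero-sum constraint on the coefficients is dropped, the double sum can become positive, which is why the condition is called conditional.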

From the document: Use of statistical modelling and analyses of malaria rapid diagnostic test outcome in Ethiopia (pages 138-145)
