3.5. Spatial Analysis

3.5.6. General Test for Detecting Spatial Clustering (Moran I Indices and Local Indexes)

Global Moran's I (GMI) is a spatial statistic used to describe the overall spatial pattern of an attribute over a defined geography. In this case, GMI is used to determine the spatial correlation of diabetes prevalence among the entire population across the African continent (Anselin, 1996; Daniella et al., 2016). The goal of GMI in spatial autocorrelation is to summarize the degree to which similar observations tend to occur near each other. That is, we want to check whether the same pattern (of diabetes) or process occurs over the entire geographic area. However, global statistics only suggest that there is clustering; they do not identify where the clusters are. GMI is therefore often used first to determine whether there is evidence of spatial association. The statistic is

$$ I = \frac{n}{S_0} \, \frac{\sum_{i} \sum_{j} w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\sum_{i} (x_i - \bar{x})^2} $$

where 𝑛 is the number of locations, 𝑥_𝑖 is the value at location 𝐵_𝑖, x̄ is the mean of the values, and 𝑆_0 is the sum of all the weights 𝑤_𝑖𝑗. The similarities of the values at locations 𝐵_𝑖 and 𝐵_𝑗 are therefore weighted by the proximity of 𝑖 and 𝑗. The weight 𝑤_𝑖𝑗 defines proximity. The extent of similarity is represented by the weighted average of similarity between the populations.

Thus, GMI is an inferential statistic that provides a test of the null hypothesis of complete spatial randomness in the attribute being studied. That is, under the null hypothesis the attribute value at a location does not depend on the values at neighbouring locations.
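As an illustration of how the global statistic is computed, the following is a minimal sketch in plain Python. The function name `morans_i`, the four-area weight matrix `W` and the `prevalence` values are hypothetical and chosen only for the example; they are not taken from the study data.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]              # deviations from the mean
    s0 = sum(sum(row) for row in weights)         # S_0: sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    ss = sum(d * d for d in dev)                  # sum of squared deviations
    return (n / s0) * cross / ss


# Hypothetical example: four areas in a row; adjacent areas are neighbours.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
prevalence = [2.0, 3.0, 7.0, 8.0]   # made-up values: low on one side, high on the other

print(round(morans_i(prevalence, W), 3))
```

For these made-up values the statistic is positive (about 0.41), reflecting that low values sit next to low values and high values next to high values.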

3.5.6.2. Monte Carlo Simulation

There is currently no single simple and systematic method of significance testing for spatial clustering; the methods in use include normal-theory and permutation tests and Monte Carlo simulation. Monte Carlo simulation was used here in order to minimise and understand error in the map. It is a permutation test for the Moran's I statistic: the statistic is recalculated under random permutations of the data for the given spatial weighting scheme, in order to establish the rank of the observed statistic in relation to the nsim simulated values.

This approach repeats the randomization of the observations a large number of times (for example 𝑁_𝑠𝑖𝑚 = 999). A Moran's I statistic is calculated for each randomization and compared to the observed Moran's I. The function "moran.mc" is used to carry out the test in R.
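The permutation logic described above can be sketched in plain Python as follows. This is an illustrative sketch of the general idea, not a reimplementation of `moran.mc`; the names `moran_permutation_test`, `chain_weights`, `nsim` and `seed`, and the ten-area example, are assumptions made for the illustration. The pseudo p-value is computed from the rank of the observed statistic among the simulated ones, with a one-sided "greater" alternative assumed.

```python
import random


def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


def moran_permutation_test(values, weights, nsim=999, seed=0):
    """Return the observed I and a one-sided pseudo p-value."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    n_greater = 0
    for _ in range(nsim):
        rng.shuffle(shuffled)                 # reassign values to locations at random
        if morans_i(shuffled, weights) >= observed:
            n_greater += 1
    # The observed arrangement counts as one of the nsim + 1 draws.
    return observed, (n_greater + 1) / (nsim + 1)


def chain_weights(n):
    """Hypothetical layout: n areas in a row, adjacent areas are neighbours."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


values = [float(v) for v in range(1, 11)]     # made-up smooth gradient
observed, p_value = moran_permutation_test(values, chain_weights(10), nsim=999, seed=42)
print(observed, p_value)
```

A small pseudo p-value indicates that few random rearrangements of the values produce a Moran's I as large as the observed one.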

3.5.6.3. Moran Correlograms

Spatial correlograms are used to examine patterns in spatial autocorrelation. The correlograms of the Moran's I statistic are used to determine the appropriate number of neighbours or distance. They show how correlated pairs of spatial observations are as the distance (lag) between them increases. In this approach, Moran's I is calculated for each lag 𝑘 in a range of lags (also known as spatial lags).
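A minimal sketch of a Moran correlogram is given below, under the simplifying assumption that the areas lie in a single row so that "lag 𝑘" means "exactly 𝑘 steps apart". The helper names `lag_weights` and `correlogram` and the example values are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


def lag_weights(n, k):
    """Binary weights linking areas that are exactly k steps apart in a row."""
    return [[1 if abs(i - j) == k else 0 for j in range(n)] for i in range(n)]


def correlogram(values, max_lag):
    """Moran's I at lags 1..max_lag."""
    n = len(values)
    return [morans_i(values, lag_weights(n, k)) for k in range(1, max_lag + 1)]


values = [float(v) for v in range(1, 11)]     # made-up smooth gradient
corr = correlogram(values, max_lag=3)
for k, i_k in enumerate(corr, start=1):
    print(k, round(i_k, 3))
```

For this made-up gradient the autocorrelation falls as the lag grows, which is the kind of decay a correlogram is meant to reveal.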

3.5.6.4. Local Indexes of Spatial Autocorrelation (LISA)

Global statistics can establish if there is clustering, but do not identify the areas of particular clusters.

In spatial analysis, the global test is often used first to determine if there is evidence of spatial association. Once that is detected and established, it is necessary to detect local areas of similar values. This is done by estimating local statistics such as Local Indexes of Spatial Autocorrelation (LISA).

The aim of the LISA test is to detect local areas of similar values, which requires local statistics. LISA tests are decompositions of global indicators into the contribution of each individual observation, indicating the extent of significant spatial clustering of similar values located around that observation. LISA can be used to detect clusters: it allows the significant locations to be classified as high-high and low-low spatial clusters and as high-low and low-high spatial outliers. Here, high and low are relative to the mean of the variable and should not be interpreted in an absolute sense. Usually four types of spatial association can be identified. These are:

High-High (denoted as H-H): a cluster region where countries with above-mean prevalence are neighboured by other similar countries.

Low-Low (denoted as L-L): a cluster region where countries with below-mean prevalence are neighboured by other similar countries.

Low-High (denoted as L-H) and High-Low (denoted as H-L): spatial outliers, that is, isolated countries of low (or high) prevalence whose neighbours have high (or low) prevalence.

Similarly, in Local Moran's I analysis (also known as hot spot analysis), attention is given to the spatial concentration of high-low and low-high values, that is, the spatial outliers.
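The four-way classification above can be sketched as follows. Each area is labelled by comparing its own value and the average of its neighbours' values with the overall mean. This sketch assigns the quadrant only; it does not test whether a label is statistically significant, which would require a permutation step. The names `spatial_lag` and `quadrants` and the five-area example are hypothetical.

```python
def spatial_lag(z, weights):
    """Weighted mean of the neighbours' deviations for each area."""
    lags = []
    for row in weights:
        total = sum(row)
        lags.append(sum(w * zj for w, zj in zip(row, z)) / total if total else 0.0)
    return lags


def quadrants(values, weights):
    """Label each area H-H, L-L, L-H or H-L relative to the overall mean."""
    mean = sum(values) / len(values)
    z = [v - mean for v in values]
    labels = []
    for zi, lag_i in zip(z, spatial_lag(z, weights)):
        own = "H" if zi >= 0 else "L"             # the area itself
        neighbours = "H" if lag_i >= 0 else "L"   # average of its neighbours
        labels.append(own + "-" + neighbours)
    return labels


# Hypothetical example: five areas in a row; adjacent areas are neighbours.
W = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]
values = [9.0, 8.0, 1.0, 2.0, 1.0]                # made-up prevalence values
labels = quadrants(values, W)
print(labels)   # ['H-H', 'H-H', 'L-H', 'L-L', 'L-L']
```

In this made-up example the third area is a low value whose neighbours are on average above the mean, so it falls in the L-H (outlier) quadrant.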

As a result, the sum of the LISAs over all observations is proportional to the Global Moran's I, denoted as:

$$ I_t = \sum_{t} \sum_{j=1}^{T} w_{ij} z_{ij} \qquad (3.16) $$

where 𝑧_𝑖𝑗 are the elements of a spatial contiguity matrix. Under the null hypothesis of no spatial association, the moments of the 𝐼_𝑡 statistic can be derived under a randomisation hypothesis.
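The proportionality between the local statistics and the global one can be checked numerically. The sketch below uses a commonly cited form of the local Moran statistic, in which the deviation of area 𝑖 from the mean is multiplied by the weighted sum of its neighbours' deviations and scaled by the second moment of the deviations. This form is an assumption made for illustration and its notation may differ from that of (3.16). The function names and example data are hypothetical.

```python
def local_morans_i(values, weights):
    """Local Moran's I for each area (commonly cited form, assumed here)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n                # second moment of the deviations
    return [z[i] / m2 * sum(weights[i][j] * z[j] for j in range(n))
            for i in range(n)]


def global_morans_i(values, weights):
    """Global Moran's I, used here only for the comparison."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in z)


# Hypothetical example: four areas in a row; adjacent areas are neighbours.
W = [[1 if abs(i - j) == 1 else 0 for j in range(4)] for i in range(4)]
prevalence = [2.0, 3.0, 7.0, 8.0]                 # made-up values

local = local_morans_i(prevalence, W)
s0 = sum(sum(row) for row in W)
print(sum(local), s0 * global_morans_i(prevalence, W))
```

With this scaling, the sum of the local values equals the global statistic multiplied by the sum of the weights, so the constant of proportionality is the sum of the weights.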

Generally, with LISAs, each test gives an indication of the extent of significant spatial clustering of similar values located around the observation of interest. Therefore, the neighbourhood of each observation is identified and defined, and then formalized with the spatial adjacency weight matrix, W. It should be noted that W can be based either on the distance from one location to another or on whether two locations share a border, fully or partially.
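The two ways of defining W mentioned above can be sketched as follows. The helpers `contiguity_weights`, `distance_band_weights` and `row_standardise`, the border list, the coordinates and the threshold are all hypothetical. Row-standardisation is a common additional step and is an assumption here, not something stated in the text.

```python
import math


def contiguity_weights(n, borders):
    """Binary W from a list of (i, j) pairs of areas that share a border."""
    w = [[0.0] * n for _ in range(n)]
    for i, j in borders:
        w[i][j] = w[j][i] = 1.0
    return w


def distance_band_weights(coords, threshold):
    """Binary W linking areas whose centroids lie within the threshold distance."""
    n = len(coords)
    return [[1.0 if i != j and math.dist(coords[i], coords[j]) <= threshold else 0.0
             for j in range(n)] for i in range(n)]


def row_standardise(w):
    """Scale each row to sum to one (rows without neighbours stay zero)."""
    out = []
    for row in w:
        total = sum(row)
        out.append([x / total if total else 0.0 for x in row])
    return out


# Hypothetical example: three areas, where 0 borders 1 and 1 borders 2.
w_border = contiguity_weights(3, [(0, 1), (1, 2)])
w_dist = distance_band_weights([(0, 0), (1, 0), (3, 0)], threshold=1.5)
print(row_standardise(w_border))
print(w_dist)
```

The two definitions can disagree: in this made-up layout areas 1 and 2 share a border but their centroids are farther apart than the chosen threshold, so they are neighbours under contiguity but not under the distance band.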

Waller and Gotway (2004) and Elliott and Wartenberg (2004) describe a spatial outlier as a spatially referenced object whose non-spatial attributes are significantly different from those of other spatially referenced objects in its spatial neighbourhood. The two types of spatial outliers are multi-dimensional space-based outliers, which use Euclidean distances to define their spatial neighbourhoods, and graph-based outliers, which use graph connectivity.

The LISA cluster maps, which are used to identify the outliers (hotspots), are shown in Chapter Four.

3.4.6. Conclusion

In investigating spatial correlation, a spatial autocorrelation test using GMI is suggested. However, because GMI only shows the global pattern, the local Moran's I was additionally used to investigate the spatial pattern at the regional level, which helps in detecting similarity of prevalence among the regions. There is a need to identify the outlier (hotspot) countries in order to help formulate network collaboration in the control of diabetes.

3.6. Distribution for Count data
