CHAPTER 4: IDENTIFICATION AND MAPPING OF ANOMALOUS PATCHES IN COFFEE
4.2 Materials and methods
67
anomalies is mostly likely to produce better representation of crops than using the global means.
In this study, it was hypothesized that age-adjusted Landsat 8 NDVI and LSWI anomalies can be used to identify incongruous patches in coffee plantations as part of development and application of spatially explicit geospatial, agronomic and economic tools to advance productivity of perennial tree crops. In addition to most of the past studies on the use of NDVI and LSWI anomalies being on annual crops, they have not demonstrated the applicability of the determined anomalies in farm and field-level management decisions. The aim of this study was therefore to assess the utility of multi-temporal age group adjusted Landsat 8 derived NDVI and LSWI data in detecting and mapping anomalous growth patches in continuously growing agricultural crops of economic importance, using coffee as an example.
4.2 Materials and methods
68
Figure 4.1: Map of the study area showing the distribution of coffee ages used in the study adapted from Chemura and Mutanga (2016). The insert shows the location of the study area
in Eastern Zimbabwe.
4.2.2 Satellite imagery 4.2.2.1 Data acquisition
Nine Landsat OLI scenes (path 168 and row 74) were obtained for the study area from the USGS-EROS Centre archive (www.earthexplorer.usgs.gov) covering a period from September 2014 to January 2016. This period covers three contrasting seasons for sufficient monitoring of perennial tree growth. The images were acquired as Level 1 terrain corrected (L1T) products with systematic geometric correction, orthorectification with a digital elevation model (DEM) and precision correction assisted by ground control chips (Roy et al., 2014; Peña & Brenning, 2015). The reference period for data collection was December 2014 when field data was available. Three image scenes were selected for each season (October to March is the rainy season and May to September in the dry season) at a temporal spacing of about a month. A spacing of a month was considered sufficient for temporal monitoring of coffee crop health and to provide opportunities for obtaining images with less cloud cover for practical applications. The dates, day of year (DOY), cloud cover percentage and sun elevation angles of all the image scenes are shown in Table 4.1. Image scenes with the lowest percentage cloud cover over the study area were selected.
69
Table 4.1: Detail on Landsat 8 Scenes used to derive NDVI
Scene Date Day of year Cloud cover (%) Sun Elevation Local Season
15 Sept 2014 258 0.00 53.5 Wet
17 Oct 2014 290 7.00 62.5 Wet
04 Dec 2014 338 0.00 64.4 Wet
13 May 2015 133 0.05 41.9 Dry
14 June 15 165 0.28 37.5 Dry
16 July 2015 197 1.70 38.2 Dry
20 Oct 2015 293 7.24 62.9 Wet
23 Dec 2015 357 24.00 102.1 Wet
08 Jan 2016 008 18.00 60.6 Wet
4.2.2.2 Data pre-processing
Since derivation of vegetation indices from satellite imagery is known to be sensitive to atmospheric conditions and also given the diversity of the dates and seasons of the Landsat 8 OLI scenes, atmospheric correction was therefore necessary. Landsat 8 OLI L1T Digital Number (DN) images were first converted to TOA reflectance based on the available Landsat 8 OLI calibration coefficients and standard correction formulas (http://landsat.usgs.gov/Landsat8_Using_Product.php) as described by Chander et al. (2009).
These were then used to perform atmospheric correction of the 9 Landsat 8 OLI image scenes using the Fast Line-of-sight Atmospheric Analysis of Spectral Hypercube (FLAASH) radiative transfer model in ENVI 5.1 Software (Exelis Visual Information Solutions, Inc.;
Boulder, CO, USA). This method takes into consideration the most likely atmospheric conditions for each DOY. Cloud and cloud shadow contamination were then detected and removed using mask layers generated with the “Fmask” method and code (Zhu & Woodcock, 2012; Zhu et al., 2015). Since the period between the first and the last image is only 18 months, there was no need to account for satellite signal decay over time. In order to ensure that there was a spatial match of the image pixels that were acquired at different dates, further geometric co-registration of the stacked Landsat 8 OLI scenes was performed using ten-ground control points. These were collected using a GPS with ~3m accuracy at noticeable places using first order polynomial (Affine) transformation. Root mean square errors (RMSE) obtained from each scene were checked and were considered acceptable, as they were all less than half the pixel dimensions.
70 4.2.2.3 Derivation of vegetation indices
NDVI was derived from Landsat 8 OLI band 5 (NIR, 0.851–0.879µm) and band 4 (Red, 0.636–
0.673µm) as shown in Equation 4.1 (Rouse et al., 1973; Tucker, 1979). LSWI was derived from band 5 (NIR) and band 6 (SWIR1, 1.57–1.65µm) as shown in Equation 4.2 (Xiao et al., 2005; Torbick & Salas, 2014). Both bands had 30m ground sampling distance (GSD) and a radiometric resolution of 12 bits.
NDVI =
𝜌nir − 𝜌red𝜌nir + 𝜌red [4.1]
LSWI =
𝜌nir − 𝜌swir𝜌nir + 𝜌swir [4.2]
4.2.3 Field data and statistical analysis
Field data was collected during a field campaign conducted between the 15th and 16th of December 2014. During field campaigns, only healthy coffee sample locations were marked using a GPS and their age, variety and other general characteristics were also recorded (N=18).
Three age groups which are young (1-4 years), mature (5-8 Years) and old (9 years +) were used for age-adjustment of the anomalies. To increase the number of samples for statistical analysis, the mean and the standard deviation of the 18 sample points were used to obtain pixels that represent healthy coffee as those whose NDVI values lie within two standard deviations of the mean of healthy coffee. Using this approach the number of samples increased to 71 for each age class (N=213). The one-way analysis of variance (ANOVA) was performed in R (R Core Team, 2013) by taking age class as independent variable and pixel values extracted from sample points as dependent (response) variables (α≤0.05). Coffee NDVI and LSWI values extracted from field sample points (N=213) were first tested for the assumptions of normality using the Shapiro-Wilk test before being subjected to ANOVA. Where means were significantly different, ANOVA was followed by the Tukey pairwise comparison test for mean separation. The three coffee classes were selected because of their significance in determining coffee productivity. There is little to no productivity in the first four years of coffee (young) followed by maximum productivity between four and eight years (mature) and dwindling yield thereafter in the old age group (Logan & Biscoe, 1987; Nair, 2010). These groups also correspond to management requirements for coffee as intensive monitoring is required in the young and the old age groups to ensure productivity.
71 4.2.4 Anomaly detection and mapping
In order to identify anomalous coffee patches, a method that uses pixel-based difference for identifying incongruous crop areas was adopted (Lanfredi et al., 2015; Venteris et al., 2015).
The procedure presented here involves a statistical analysis of anomalies to identify incongruent patches in perennial coffee plantations. The analysis was done by identifying and analysing departures from expected mean values, with and without putting into consideration their age classes. The age map was derived from the map generated by Chemura and Mutanga (2016), Chapter 3. The mean coffee NDVI and LSWI from the sample points were determined and used as the global coffee NDVI and LSWI means on which deviation of all coffee pixels was calculated. The global mean is the average coffee NDVI and LSWI for all the points and was computed for each image scene date. In addition, the average coffee NDVI and LSWI values for each age class were determined from the group means in the ANOVA procedure and used as the age- expected values for a particular age class. Using this age-class based mean, the deviation of each pixel from the age-expected mean for each image scene date was calculated using Equation 4.3.
𝐼𝑛𝑑𝑒𝑥𝐷𝑖𝑓𝑓(%) = 𝑋𝑖𝐴−𝑋̅𝐴
𝑋̅𝐴 ∙ 100 [4.3]
where 𝑋𝑖𝐴 is the pixel NDVI or LSWI value in the image and 𝑋̅𝐴 is the mean NDVI or LSWI for a particular age group where the pixel belongs. This produces the distribution of deviation from the mean of NDVI and LSWI values around 0 for each age class and for each Landsat scene. This transformation is important as it filters the amplitude NDVI and LSWI deviation of each age group into values between 0 and -100 (patches that are less healthy than the normal sampled plots), and +100 (higher than the mean representing those pixels that are healthier than the normal sampled plots) around their age-class mean.
To determine the pixels with excessive NDVI and LSWI deviation, focus was put on the left tail where values were less than the mean. We used the inverse cumulative distribution function (ICDF) to obtain areas that are excessively lower in NDVI and LSWI than the mean for all identified coffee pixels. This was done for both the deviation of the global mean and the age- based anomalies. The ICDF gives the value associated with a specific cumulative probability given the mean and the standard deviation of that data. The cumulative distribution function and the inverse ICDF are related by Equation 4.4 and Equation 4.5. The NDVI and LSWI anomalies were tested for normality using Shapiro-Wilk test before application of the ICDF to ensure that the assumptions of normality are met.
72
𝑝 = 𝐹(𝑥) [4.4]
𝑥 = 𝐹−1(𝑝) [4.5]
The threshold probability at 0.05 (outside two standard deviations) was set to be incongruent patches that were not performing well, when compared to their age peers. The area for those pixels was calculated and mapped in ArcGIS 10.3 (ESRI, Redlands, CA, USA). These calculations were done for each age and each scene from September 2014 to January 2016.
4.2.5 Model evaluation
The approach applied in this study was evaluated using the final binary coffee condition map obtained from one independent image scene. Reference data on coffee condition were collected using visual interpretations of very high resolution (VHR) imagery within Google Earth®) as described by (Clark & Aide, 2011). Google Earth VHR is increasingly being used in evaluation of remote sensing approaches particularly for those that involve historical archival data such as in this study (Dong et al., 2014; Landmann & Dubovyk, 2014; Reschke
& Hüttic, 2014; Zhou et al., 2016). The sampled VHR imagery (taken on July 10, 2015) was close to the July 16, 2015 Landsat scene and this was therefore considered appropriate for evaluation. The image provider in the study area on which reference data was obtained is Digital Globe. The interpretation operator applied consistent rules including minimum area, observable crop density, crop stand and soil background cover for identification of a pixel as incongruous or normal. Only areas larger 30m x 30m were considered and interpreted in order to match Landsat 8 resolution. The kml reference data points were exported to ArcGIS 10.3, re-projected and overlaid on the final incongruous/normal map and used to extract the condition for statistical analysis. A total of 47 points were used for validation (34 healthy and 13 unhealthy) from the VHR imagery. Using this a confusion matrix from the reference validation data and classified map was developed.
To evaluate model performance, five overall performance (overall accuracy (OA), prevalence, area under the receiver operating curve (AUC), false positive rate (FPR), false negative rate (FNR) and two class specific (producer’s and user’s accuracy) assessment measures were used (Foody, 2002; Brenning, 2009). The process of identifying healthy and incongruous patches in coffee plantations used in this paper is summarized in Figure 4.2.
73
Figure 4.2: Flowchart of the methodology to detect and map incongruous patches in coffee plantations using time series global and age-adjusted Landsat 8 NDVI and LSWI anomalies.