
CHAPTER 5: MULTISPECTRAL LEVEL REMOTE SENSING OF PLANT WATER CONTENT IN

5.2 Materials and methods

The study was carried out at the Coffee Research Institute (CoRI) in Chipinge, Zimbabwe. CoRI is located at 32°37.523′E, 20°12.474′S, at an altitude of 1100 m.a.s.l. The average annual rainfall is 1180 mm, of which 80% falls in the five months from November to March.

The mean maximum temperature is 20 °C and the mean minimum is 14 °C. Most of the soils in this area are leached, strongly weathered and belong to the Orthoferralitic group, derived from Umkondo quartzite and sandstone (Chemura, 2014). Chipinge is in the main coffee production zone of Zimbabwe.

5.2.2 Experimental materials and design

Healthy six-month-old coffee seedlings (variety Catimor 129) were used in the study. These seedlings were transplanted into black polythene pots (29 cm x 13.5 cm) at four months and were allowed two months to acclimatize to greenhouse conditions before experimentation. This age represents the minimum age at which coffee can be transplanted into the field. The seedlings were grown in the nursery in the growth medium recommended for raising coffee seedlings, and all other routine nursery management followed the recommendations of the Coffee Handbook (Logan & Biscoe, 1987).

5.2.3 Moisture stress treatments

The six-month-old seedlings were subjected to water stress by withholding and varying the water supply to obtain a range of soil moisture levels. Three treatments were used to induce plant water stress, each replicated 20 times (60 plants in total). In the first treatment (not stressed), seedlings received the required irrigation twice a week. In the second (moderate stress), plants were irrigated once a week. In the third (severe stress), plants received no irrigation for two weeks prior to the spectral assessments and plant water content (PWC) measurements. This produced a gradient in plant water content.

5.2.4 Spectral reflectance measurements

On each plant, spectral reflectance was measured on one of the leaves on the third node from the top. This was done to sample mature coffee leaves that are representative of those in the field. Reflectance was measured using an Apogee VIS-NIR spectrometer (Apogee Instruments, Inc., Logan, UT, USA), which has an effective spectral range of 400-900 nm and a spectral resolution of 0.5 nm. Each reading was the average of three spectral scans taken 15 cm above the coffee leaf of interest at a 30° angle. A white polytetrafluoroethylene (PTFE) reflectance standard was used as a reference, and reflectance by wavelength was calculated as the ratio of scene reflectance to the reflectance of the standard. The reflectance was averaged to 10 nm to reduce dimensionality, and a moving Savitzky–Golay filter with a frame size of 5 data points and a 2nd-order polynomial was used to smooth the spectra (Savitzky & Golay, 1964).
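The averaging and smoothing steps can be sketched in plain Python as follows. This is an illustrative sketch only, not the processing code used in the study: the function names, the bin alignment (consecutive 10 nm bins starting at the first wavelength) and the handling of the two end points are assumptions. The five convolution weights are the standard Savitzky–Golay coefficients for a 5-point window and a 2nd-order polynomial.

```python
def bin_spectrum(wavelengths, reflectance, width=10.0):
    """Average reflectance into consecutive bins of `width` nm.

    Assumption: bins start at the first wavelength; the source does not
    say how the 10 nm averages were aligned.
    """
    bins = {}
    start = wavelengths[0]
    for wl, r in zip(wavelengths, reflectance):
        idx = int((wl - start) // width)
        bins.setdefault(idx, []).append(r)
    order = sorted(bins)
    centres = [start + (i + 0.5) * width for i in order]
    means = [sum(bins[i]) / len(bins[i]) for i in order]
    return centres, means


# Savitzky-Golay weights: 5-point window, 2nd-order polynomial.
SG_5_2 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol_5_2(values):
    """Smooth with the 5-point quadratic Savitzky-Golay filter.

    Assumption: the two points at each end are left unsmoothed.
    """
    out = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        out[i] = sum(w * v for w, v in zip(SG_5_2, window))
    return out
```

A quick sanity check on such a filter is that it leaves any exactly quadratic sequence unchanged, since a 2nd-order polynomial fits it without error.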

5.2.5 Leaf water content measurements

The physiological characteristics (leaf length, width, fresh weight and dry weight) of the leaf on each plant that was used in spectral sampling were recorded. Fresh weight (FW) was measured soon after the reflectance measurements, using an electronic balance on site. To determine dry weight (DW), the leaves were placed in tagged containers in an oven set at 70 °C for 8 h, after which their mass was measured using the digital balance. Plant Water Content (PWC) was calculated after Liu et al. (2003) as in Equation 5.1.


$$\mathrm{PWC}(\%) = \frac{FW(g) - DW(g)}{FW(g)} \times 100 \qquad [5.1]$$
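As a worked illustration of Equation 5.1, a minimal Python sketch is given below. The function name and the input checks are assumptions added for the example, and the weights in the usage line are made-up values, not measurements from the study.

```python
def plant_water_content(fresh_weight_g, dry_weight_g):
    """Plant water content (%) from fresh and dry leaf weight, per Equation 5.1."""
    if fresh_weight_g <= 0:
        raise ValueError("fresh weight must be positive")
    if not 0 <= dry_weight_g <= fresh_weight_g:
        raise ValueError("dry weight must lie between 0 and the fresh weight")
    return (fresh_weight_g - dry_weight_g) / fresh_weight_g * 100.0


# Hypothetical leaf: 2.0 g fresh, 0.5 g after oven drying.
example_pwc = plant_water_content(2.0, 0.5)
```

For this hypothetical leaf, 1.5 g of the 2.0 g fresh weight is water, so the PWC is 75%.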

5.2.6 Identification of water stress related wavebands

Hyperspectral data are high dimensional and exhibit a high degree of inter-band correlation, leading to data redundancy that can cause convergence instability in models (Thenkabail et al., 2004a; Blackburn, 2007). Using fewer wavebands is therefore preferable for more stable modelling of plant biophysical parameters and chemistry with hyperspectral data, and is also easier to implement in field applications. Variable selection methods attempt to capture the maximum information present in the original data while ensuring that the selected data remain fit for purpose (Fodor, 2002; Demšar et al., 2013). Many variable selection methods have been reported for use with remote sensing data, each with its own data processing capabilities and potential applications in selecting useful wavebands. The objective of all variable selection methods is to produce a list of features ranked by their discriminatory ability, and so provide a means by which an optimal feature subset can be used in modelling from remote sensing data. Since no single best approach was available to determine the optimal number of bands required for coffee PWC modelling, three variable selection methods were used in this study: cross-correlation threshold, reflectance difference and reflectance sensitivity.

5.2.6.1 Cross-correlation threshold

Cross-correlation is a standard measure of the degree of similarity between two band matrices and is optimized for faster calculation and/or more accurate results (Ahmed et al., 2012). The method depends on calculating the covariance between two wavebands, and can be adjusted for brightness and contrast using normalization (Wang et al., 2008). The higher the correlation between two bands, the more similar their spectral characteristics for a feature of interest, and vice versa. Therefore, areas of the correlation matrix with low coefficient of determination (R²) values correspond to the waveband regions with the least redundancy and the highest information content, and these should be retained (Thenkabail et al., 2004a).

As such, a cross-correlation threshold should be set, beyond which wavebands are considered redundant for modelling. As expected, adjacent bands of the hyperspectral data had very high inter-correlations, meaning that they are redundant. In this study the threshold was set at R² = 0.95, meaning that among wavebands with a coefficient of determination of more than 0.95, only one was retained.
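One way to apply such a threshold is sketched below in plain Python. The source does not state which band of a highly correlated group was kept, so the greedy rule used here (scan bands in ascending wavelength and keep a band only if its R² with every band already kept does not exceed the threshold) is an assumption, as are the function names and the data layout.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant band carries no correlation information
    return (sxy * sxy) / (sxx * syy)


def select_bands(spectra, threshold=0.95):
    """Greedy band selection by cross-correlation threshold.

    spectra maps a band centre (nm) to that band's reflectance across samples.
    Assumption: the first band encountered in each correlated group is kept.
    """
    kept = []
    for band in sorted(spectra):
        if all(r_squared(spectra[band], spectra[k]) <= threshold for k in kept):
            kept.append(band)
    return kept
```

With three hypothetical bands where the second is almost a scaled copy of the first, only the first and third would be retained.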

5.2.6.2 Reflectance difference

The reflectance difference between the stressed and unstressed treatments was also used to quantify the spectral difference between the irrigated, moderately stressed and severely stressed plants. The objective of the reflectance difference was to identify a few important bands based on the peak value of the reflectance difference across the three treatments, not to examine the variation in the peak value itself. The mean reflectance of each water deficit stress treatment was used instead of that of individual replicates. The difference in reflectance due to water stress was calculated at each wavelength after Riedell et al. (2003):

Reflectance Difference (RD) = (Reflectance of Stressed − Reflectance of Unstressed) [5.2]

5.2.6.3 Reflectance sensitivity

The reflectance sensitivity method is an extension of the reflectance difference method in which the difference between the stressed and the non-stressed leaf reflectance is expressed as a proportion of the reflectance of the non-stressed samples. The method therefore normalizes the differences in spectral reflectance to the unstressed reflectance. The wavebands with the highest reflectance sensitivity represented the areas of the spectrum with the most important information and were therefore selected. Reflectance sensitivity was calculated after Riedell et al. (2003):

Reflectance Sensitivity(%) = [(RD/Reflectance of Unstressed) ∗ 100] [5.3]
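Equations 5.2 and 5.3 can be illustrated with a short Python sketch operating on treatment-mean spectra. The function names are hypothetical, the reflectance values in the usage lines are invented for illustration, and ranking bands by the absolute value of the score is an assumption (the text speaks only of peak and highest values).

```python
def reflectance_difference(stressed, unstressed):
    """Equation 5.2 applied band by band to treatment-mean spectra."""
    return [s - u for s, u in zip(stressed, unstressed)]


def reflectance_sensitivity(stressed, unstressed):
    """Equation 5.3: RD as a percentage of the unstressed reflectance."""
    return [(s - u) / u * 100.0 for s, u in zip(stressed, unstressed)]


def top_bands(wavelengths, scores, k=3):
    """Wavelengths of the k largest scores (by absolute value, an assumption)."""
    ranked = sorted(zip(wavelengths, scores), key=lambda p: abs(p[1]), reverse=True)
    return [wl for wl, _ in ranked[:k]]


# Hypothetical treatment means at four band centres (nm).
bands = [550, 680, 720, 800]
unstressed = [0.10, 0.05, 0.30, 0.50]
stressed = [0.12, 0.08, 0.33, 0.51]
rd = reflectance_difference(stressed, unstressed)
rs = reflectance_sensitivity(stressed, unstressed)
```

In this invented example the 680 nm and 720 nm bands have the same absolute difference (0.03), but the sensitivity at 680 nm is about 60% against about 10% at 720 nm, which shows how normalizing by the unstressed reflectance changes the ranking.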

5.2.7 Modelling approach

The Random Forest (RF) algorithm was used for modelling PWC from the selected wavebands. RF is an ensemble machine learning algorithm developed by Breiman (2001) to solve classification and regression problems through a multitude of decision trees. RF employs an iterative bagging (bootstrap aggregation) operation in which a number of trees (ntree) are built independently, each using a random subset of the training samples. Each tree is grown to its maximum size on a bootstrap sample of about two-thirds of the training dataset, and each node is split using the best among a subset of the input variables (mtry). The ensemble then treats the data that are not used in a tree as out-of-bag (OOB) data, and by averaging the OOB error rates from all trees, the RF algorithm gives an error rate called the OOB error for each input variable (Gislason et al., 2004; Breiman & Cutler, 2007). In many applications, this algorithm produces one of the best accuracies to date and has important advantages over other techniques in terms of its ability to handle highly non-linear data, robustness to noise and tuning simplicity (Rodriguez-Galiano et al., 2012; Lebedev et al., 2014). The default number of trees (ntree) of 500 was used, while mtry is automatically determined as the square root of the total number of variables used (Breiman, 2001). The R package randomForest was used for running the RF modelling (Liaw et al., 2009).
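The bootstrap and out-of-bag mechanism described above can be illustrated without any modelling library. The sketch below is not the randomForest implementation; it only draws bootstrap samples and records which training samples each one leaves out. The function names are hypothetical, the seed is arbitrary, and the training-set size of 36 is an illustrative value (60% of 60 plants).

```python
import math
import random


def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; return (in-bag, out-of-bag)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(in_bag)
    oob = [i for i in range(n_samples) if i not in chosen]
    return in_bag, oob


def default_mtry(n_variables):
    """Square-root rule for mtry, as stated in the text."""
    return max(1, math.isqrt(n_variables))


rng = random.Random(42)   # arbitrary seed for reproducibility
n_samples, ntree = 36, 500
unique_fraction = 0.0
for _ in range(ntree):
    in_bag, oob = bootstrap_split(n_samples, rng)
    unique_fraction += len(set(in_bag)) / n_samples
unique_fraction /= ntree  # average share of distinct samples seen per tree
```

Averaged over the 500 draws, each bootstrap sample contains a little under two-thirds of the distinct training samples, and the remaining third or so forms that tree's OOB set.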

5.2.8 Model evaluation

The field data were randomly partitioned 60:40 for model training and validation respectively. Several error indices are commonly used in model evaluation, and some of them were applied to compare RF model performances and to assist in identifying the best performing variable selection methods. All model evaluation metrics were computed on independent data. The correlation coefficient (r) and coefficient of determination (R²) were used to assess the goodness of fit between the predicted and measured PWC values. In terms of performance, the best model was identified as the one with the largest r and R². In addition, mean absolute error (MAE, Equation 5.4), root mean square error (RMSE, Equation 5.5), normalized root mean square error (nRMSE(%), Equation 5.6) and percent bias (pBias(%), Equation 5.7) were used to determine the errors of the model in predicting PWC from the selected variables. For MAE, RMSE and pBias, values of 0 indicate a perfect fit between measured and predicted PWC (Ghini et al., 2011). MAE is the average of the absolute values of the differences between predicted and measured values. RMSE is one of the most commonly used error index statistics, and the lower the RMSE the better the model performance. pBias measures the average tendency of the simulated data to be larger or smaller than their observed counterparts; positive values indicate overestimation whereas negative values indicate model underestimation (Ghini et al., 2011).
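A random 60:40 partition of this kind can be sketched as follows. The source does not say whether the split was stratified by treatment, so this example assumes a simple random shuffle; the function name, the seed and the plant identifiers are hypothetical.

```python
import random


def train_validation_split(items, train_fraction=0.6, seed=0):
    """Shuffle a copy of `items` and cut it into training and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


plant_ids = list(range(1, 61))   # 60 plants, numbered for illustration
train_ids, validation_ids = train_validation_split(plant_ids)
```

With 60 plants this yields 36 training and 24 validation samples, and no plant appears in both sets.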

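$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right| \qquad [5.4]$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2} \qquad [5.5]$$

$$\mathrm{nRMSE} = \frac{\mathrm{RMSE}}{y_{i,\max} - y_{i,\min}} \qquad [5.6]$$

$$\mathrm{pBias} = \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)}{\sum_{i=1}^{n} y_i} \times 100 \qquad [5.7]$$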

where for all cases n is the number of data points, yi is the measured PWC (%) at that data point and ŷi is the model predicted PWC (%) at that data point.
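The four error indices can be computed with a few lines of Python, as sketched below. The function names are hypothetical and the PWC values in the usage lines are invented. Two choices are assumptions: nRMSE is multiplied by 100 to match the nRMSE(%) label used in the text, and pBias uses the measured-minus-predicted order of Equation 5.7 (some sources use the opposite order, which flips the sign).

```python
from math import sqrt


def mae(measured, predicted):
    """Mean absolute error (Equation 5.4)."""
    return sum(abs(p - m) for m, p in zip(measured, predicted)) / len(measured)


def rmse(measured, predicted):
    """Root mean square error (Equation 5.5)."""
    return sqrt(sum((p - m) ** 2 for m, p in zip(measured, predicted)) / len(measured))


def nrmse_percent(measured, predicted):
    """RMSE over the measured range (Equation 5.6), times 100 (assumption)."""
    return rmse(measured, predicted) / (max(measured) - min(measured)) * 100.0


def pbias_percent(measured, predicted):
    """Percent bias with measured minus predicted in the numerator (Equation 5.7)."""
    return sum(m - p for m, p in zip(measured, predicted)) / sum(measured) * 100.0


# Hypothetical measured and predicted PWC (%) for three validation leaves.
measured_pwc = [60.0, 70.0, 80.0]
predicted_pwc = [62.0, 68.0, 83.0]
```

For these invented values the errors are 2, −2 and 3 percentage points, so MAE is 7/3 and RMSE is the square root of 17/3.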
