Chapter 2: Literature Review
2.6 The Use of Textural Analysis in Forestry Classification Applications
2.6.1 Introduction
The purpose of image classification is to assign every pixel in an image to a meaningful class in order to extract information from the image. This is achieved by utilising either the relationships among the spectral (or radiance) properties of the pixels, or the spatial relationships of the pixels, such as texture, shape, size or context (Lillesand and Kiefer, 2000). A third means is based on the temporal relationships between pixels, where the time element is used to distinguish classes; it is this factor on which change detection is based.
2.6.2 Principles of Textural Analysis
While standard optical remote sensing classification techniques are based on the spectral characteristics inherent in imagery, incorporating ancillary information such as texture, context and structural characteristics has been shown to improve classification results (Ouma et al., 2006; Tso and Mather, 2001; Coppin, 1991; Fung and LeDrew, 1987). It is interesting to note that these characteristics preceded the development of spectral analysis techniques, as they form the basis of the original manual image interpretation techniques utilised in extracting information from aerial photography (Tso and Mather, 2001; Janssen, 2000; Coppin, 1991), and are now of increasing importance due to their role in object-orientated segmentation and classification procedures.
Texture refers to the visual roughness or smoothness of features across an image, due to the spatial variability of tonal values, which results in a repetition of patterns across the image (Tso and Mather, 2001). This concept deals with the structure of the object itself, based on the tonal variation within it (i.e. the "within object" variance, also referred to as the microstructure (Jonckheere, 2000)).
Tso and Mather (2001) define context as the probability of occurrence of a group of pixels, based on the nature of the pixels across the whole scene. Context describes the relationship between an object and the rest of the scene (i.e. the "between objects" variance, also referred to as the macrostructure (Jonckheere, 2000)). It is usually applied through a statistical technique such as a majority filter window, which is used to refine a classification.
It should be noted that spectral and textural features are inter-related, and that the information derived from the two sources is complementary rather than duplicated (Coppin, 1991). Both properties are always present in an image (Haralick et al., 1973), but the degree to which one is dominant over the other is a function of the resolution of the one compared to the other. Where spatial resolution is high in relation to the scale of tonal variation, texture can improve class discrimination in a classification. The converse is also true where homogeneous areas within an image are small, as texture is a feature of an area rather than a point (Tso and Mather, 2001).
Textural analysis can be applied using four different theoretical approaches. These are the frequency domain theory (Jensen, 1996); a statistical approach (Jonckheere, 2000); joint grey-level probability density, after Haralick et al. (1973); and fractal theory (Tso and Mather, 2001). All of these methods involve the quantification of texture patterns. Of these four theories, two were considered to be applicable in this study, namely the frequency domain and the statistical approach, and detailed descriptions of these procedures are given in Chapter 5.1.3, Methods. A Fourier Transform technique, used to measure elements in the frequency domain, and several statistical procedures, utilising convolution filter masks as well as the variance and skewness functions, were tested.
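As an illustration of the statistical procedures mentioned above, the following minimal Python sketch computes variance and skewness texture values in a moving window. It is not the procedure used in this study (which is described in Chapter 5.1.3). The function name `moving_window_texture`, the 3 x 3 default window, the use of population moments and the choice to leave edge pixels at zero are all assumptions made for the example.

```python
def moving_window_texture(image, size=3):
    """Return (variance, skewness) texture images for a 2-D list of tonal values.

    Assumption: population moments are used, and pixels whose window would
    extend past the image edge are left at 0.0.
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    variance = [[0.0] * cols for _ in range(rows)]
    skewness = [[0.0] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            # Collect the tonal values inside the window centred on (r, c).
            values = [
                image[rr][cc]
                for rr in range(r - half, r + half + 1)
                for cc in range(c - half, c + half + 1)
            ]
            n = len(values)
            mean = sum(values) / n
            m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
            m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
            variance[r][c] = m2
            skewness[r][c] = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return variance, skewness


# Hypothetical 3 x 3 patches: one tonally uniform, one with strong tonal variation.
smooth = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
rough = [[0, 20, 0], [20, 0, 20], [0, 20, 0]]

var_smooth, _ = moving_window_texture(smooth)
var_rough, skew_rough = moving_window_texture(rough)
```

The centre pixel of the uniform patch receives a variance of zero, while the centre of the variable patch receives a high variance, which is the sense in which such a window output quantifies roughness.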
2.6.3 Statistical Approach
When using a statistical approach for the determination of textural characteristics, a fundamental principle on which this is based is that of spatial autocorrelation, i.e. things closer together are more likely to be similar than things further apart (Meisel and Turner, 1998). The basis on which this is measured is determined by the size of the convolution window applied. Therefore it is imperative that the optimal window size is determined prior to any analyses being undertaken. This can be done using semivariograms (Tso and Mather, 2001; Jonckheere, 2000; St-Onge and Cavayas, 1997).
2.6.3.1 Semivariograms
The semivariogram is a mathematical function that correlates the dissimilarity, or semivariance, of points within a data set to the distance between them, and when viewed graphically, describes the spatial correlation between all data points in the data set (Johnston et al., 2001). It is a dissimilarity function because the variance of the difference increases with distance.
Mathematically, the semivariance function is defined as follows:
$$\lambda(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} (x_i - y_i)^2 \qquad \text{(Equation 2.1)}$$
where:
$\lambda(h)$ = semivariance at lag distance $h$;
$N(h)$ = number of data point pairs separated by $h$;
$x_i$ = value at the start of the pair;
$y_i$ = value at the end of the pair.
(Meisel and Turner, 1998)
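The semivariance function can be sketched in a few lines of Python. The example below is illustrative only: it assumes a one-dimensional transect of values sampled at a regular spacing, so that the lag is expressed as a whole number of sample steps, and the function name `semivariance` is chosen for the example.

```python
def semivariance(values, lag):
    """Semivariance of a regularly spaced 1-D transect at an integer lag.

    Implements half the mean squared difference over all point pairs
    separated by `lag` sample steps.
    """
    if lag < 1:
        raise ValueError("lag must be a positive number of sample steps")
    # Each pair holds the value at the start and at the end of the separation.
    pairs = [(values[i], values[i + lag]) for i in range(len(values) - lag)]
    if not pairs:
        raise ValueError("lag is too large for this transect")
    return sum((x - y) ** 2 for x, y in pairs) / (2 * len(pairs))


# Hypothetical transect with a steady trend: dissimilarity grows with lag.
transect = [1, 2, 3, 4, 5]
gamma_1 = semivariance(transect, 1)  # 4 pairs, each differing by 1
gamma_2 = semivariance(transect, 2)  # 3 pairs, each differing by 2
```

Evaluating the function over a series of lags and plotting the results against lag distance gives the discrete points of a semivariogram.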
Graphically, a typical semivariogram (a plot of the semivariance λ(h) against lag distance) is as follows (Figure 2.1):
[Figure placeholder: semivariogram plot with Lag (m) on the horizontal axis; the Sill and the Range are marked]
Figure 2.1 Illustration of a Typical Semivariogram
Four critical measures are derived from the semivariogram: the Lag, the Sill, the Range and the Nugget.
By fitting a recognised model (e.g. spherical, linear, circular or exponential) to the discrete points of the semivariogram, a continuous line can be obtained, from which the range, sill and nugget can then be measured (Johnston et al., 2001; Meisel and Turner, 1998).
The Lag is the distance between the data point pairs. To reduce the large number of possible combinations, these are usually grouped into distance classes, in a process called "binning".
The Sill is the maximum value of the semivariance λ(h), i.e. where the graph reaches a plateau.
The Range is the lag distance at which the graph levels off, and is usually set at the point where the semivariance λ(h) reaches 95% of the sill. Beyond this point there is little or no autocorrelation between the variables. It is this factor that provides the means to determine the optimal window size (Tso and Mather, 2001).
The Nugget is the measurement or independent error parameter, and is the distance from the origin to the point where the graph intercepts the semivariance λ(h) axis.
(Johnston et al., 2001; Jonckheere, 2000)
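The way in which the sill and range are read off a semivariogram can be sketched as follows. This is a deliberately simplified illustration and not the model-fitting procedure described above: it assumes the binned semivariances have already reached a plateau, estimates the sill as the mean of the last few points, and takes the range as the first lag at which the semivariance reaches 95% of that sill. The function name, the `plateau_points` parameter and the sample values are hypothetical.

```python
def estimate_sill_and_range(lags, semivariances, plateau_points=3, fraction=0.95):
    """Estimate (sill, range) directly from binned semivariogram points.

    Assumption: the last `plateau_points` values lie on the plateau, so their
    mean approximates the sill. No model is fitted.
    """
    if len(lags) != len(semivariances):
        raise ValueError("lags and semivariances must have the same length")
    if len(semivariances) < plateau_points:
        raise ValueError("not enough points to estimate the plateau")
    sill = sum(semivariances[-plateau_points:]) / plateau_points
    # The range is the first lag whose semivariance reaches the chosen
    # fraction of the sill.
    for lag, value in zip(lags, semivariances):
        if value >= fraction * sill:
            return sill, lag
    return sill, None


# Hypothetical binned semivariogram, lag in metres.
lags_m = [10, 20, 30, 40, 50, 60]
gammas = [2.0, 5.0, 8.0, 9.75, 10.0, 10.25]
sill, range_m = estimate_sill_and_range(lags_m, gammas)
```

For these sample values the estimated range is 40 m, which would then guide the choice of window size once converted to pixels for the image resolution in question.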
Two assumptions on which semivariance analyses are based are those of stationarity (i.e. any variation is due to separation distance alone) and isotropy (i.e. no directional trends occur in the data). However, these are not always met when working with natural phenomena (Johnston et al., 2001; Meisel and Turner, 1998). An advantage of using semivariance is that it tends to be insensitive to variations in contrast across consecutive images (St-Onge and Cavayas, 1997), which is an important consideration in a study such as this one, which is based on repeat images.
Various authors have applied semivariance analyses in studies of forest stands and attributes. Woodcock et al. (1988) calculated variograms for aerial photography and TM imagery of forest stands. Cohen et al. (1990) applied variograms in their studies on forest canopies. Other forestry applications of semivariance analyses have been reported by Wulder et al. (2000), St-Onge and Cavayas (1997) and Hyppanen (1996).
2.6.4 Frequency Domain Approach
Textural analyses may also be carried out using techniques that have their theoretical bases in the frequency domain. In image processing, this is usually achieved through the application of a Fourier Transform. Fourier analysis is the mathematical technique of transforming an image's spatial components into its frequency components (Jensen, 1996). The frequency spectrum is called the magnitude of the Fourier Transform, and is displayed as a two-dimensional image.
This image represents the magnitude and direction of the different frequency components of the input spatial image (Jensen, 1996), and is symmetrical about its centre. These images are in the form of a disc, with low frequency information close to the centre of the disc, i.e. the co-ordinate origin, and high frequency information further out to the extremity of the image (Fisher et al., 2003; Tso and Mather, 2001).
The square of the amplitude spectrum is known as the Fourier power spectrum, and the angular distribution of the power spectrum values is sensitive to structural directionality present in the spatial domain (Tso and Mather, 2001).
If there are many edges or strong linearity at a specific angle in the spatial input image, the power spectrum image will display high values concentrated around a direction perpendicular to the angle in the spatial domain. They appear as bright dots, and a line connecting these dots to the centre of the image is always orthogonal to the orientation of these lines in the input image. Thus the frequency and orientation of the lines can be determined (Jensen, 1996). This phenomenon was of particular interest to this study. However, these bright dots could also be the result of other features such as noise, and a filtering operation has to be done in order to separate noise and other linear features from the particular linear features of interest. Textural roughness can also be extracted using the power spectrum functionality, as the radial distribution of the power spectrum values is sensitive to this roughness or smoothness (Tso and Mather, 2001). Smooth structures produce high power spectrum values away from the origin, while coarse structures produce high values close to the power spectrum origin.
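The orthogonal relationship between line orientation and the position of the bright dots can be demonstrated with a small synthetic image. The sketch below is an assumption-laden illustration: it computes the power spectrum with a direct, unoptimised discrete Fourier transform written in plain Python (a fast Fourier transform routine would normally be used), the 8 x 8 striped test image is hypothetical, and the spectrum is left unshifted so that the origin sits at index (0, 0) rather than at the image centre.

```python
import cmath
import math


def power_spectrum(image):
    """Squared magnitude of the 2-D discrete Fourier transform (direct form)."""
    rows, cols = len(image), len(image[0])
    power = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):        # vertical frequency index
        for v in range(cols):    # horizontal frequency index
            total = 0j
            for r in range(rows):
                for c in range(cols):
                    phase = -2.0 * math.pi * (u * r / rows + v * c / cols)
                    total += image[r][c] * cmath.exp(1j * phase)
            power[u][v] = abs(total) ** 2
    return power


# Hypothetical 8 x 8 test image: vertical stripes repeating every 4 columns.
stripe = [1.0, 0.0, -1.0, 0.0]
image = [[stripe[c % 4] for c in range(8)] for _ in range(8)]

power = power_spectrum(image)

# Locate the strongest component, ignoring the zero-frequency origin.
candidates = [(u, v) for u in range(8) for v in range(8) if (u, v) != (0, 0)]
peak_u, peak_v = max(candidates, key=lambda uv: power[uv[0]][uv[1]])
```

The peak falls on the row `u = 0`, that is, along the horizontal frequency axis, so the line joining it to the origin is perpendicular to the vertical stripes in the input image. Its position along that axis corresponds to two cycles across the eight columns, which matches the four-column stripe period.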
The filtering of power spectrum images is generally done using wedge or ring filters, where the angle and radius values of these filters control the width of the filter's effective area (Tso and Mather, 2001). Wedge filters are used to highlight directionality, while ring filters highlight textural roughness (Jonckheere, 2000).
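A minimal sketch of how ring and wedge filters summarise a power spectrum is given below. It assumes a power spectrum image that has already been centred (origin at the middle pixel) and simply sums the values falling inside a ring or a wedge. The function names, the angle convention (degrees measured from the horizontal axis and folded into the range 0 to 180, since the spectrum is symmetrical about its centre) and the sample values are assumptions made for the example.

```python
import math


def ring_energy(power, r_min, r_max):
    """Sum of power values whose distance from the centre lies in [r_min, r_max)."""
    rows, cols = len(power), len(power[0])
    cy, cx = rows // 2, cols // 2
    total = 0.0
    for y in range(rows):
        for x in range(cols):
            if r_min <= math.hypot(y - cy, x - cx) < r_max:
                total += power[y][x]
    return total


def wedge_energy(power, angle_min, angle_max):
    """Sum of power values whose direction from the centre lies in
    [angle_min, angle_max) degrees, with directions folded into [0, 180)."""
    rows, cols = len(power), len(power[0])
    cy, cx = rows // 2, cols // 2
    total = 0.0
    for y in range(rows):
        for x in range(cols):
            dy, dx = y - cy, x - cx
            if dy == 0 and dx == 0:
                continue  # direction is undefined at the origin
            # Fold opposite directions together (spectrum is symmetrical).
            if dy < 0 or (dy == 0 and dx < 0):
                dy, dx = -dy, -dx
            angle = math.degrees(math.atan2(dy, dx))
            if angle_min <= angle < angle_max:
                total += power[y][x]
    return total


# Hypothetical centred 5 x 5 power spectrum.
power = [[0.0] * 5 for _ in range(5)]
power[0][2] = power[4][2] = 3.0  # vertical direction, radius 2
power[2][1] = power[2][3] = 1.0  # horizontal direction, radius 1

inner_ring = ring_energy(power, 0.5, 1.5)
outer_ring = ring_energy(power, 1.5, 2.5)
vertical_wedge = wedge_energy(power, 80, 100)
horizontal_wedge = wedge_energy(power, 0, 20)
```

Comparing the wedge sums indicates the dominant direction of the spectral energy, while comparing the ring sums indicates whether that energy lies close to or far from the origin.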
While textural analyses can be done using the features of the frequency domain, for general image processing applications spatial domain filtering is more cost-effective (Jensen, 1996), and it is certainly less complicated. Fourier methods become more appropriate when the filter functions required to perform the image processing become very large. There are also some specialised filter functions that are better done using Fourier techniques than spatial domain techniques (Jensen, 1996).
Textural analysis is generally done using a single band (Coppin, 1991; Fung and LeDrew, 1987), but Coburn and Roberts (2004) describe an alternative multiscale methodology using several bands of textural information with different window sizes, which resulted in a 40% improvement in the classification of forest stands.