CHAPTER 3 DATA AND METHODOLOGY
3.4 Data Processing and Derivation
3.4.11. Land use land cover map
3.4.11.2 Ground Truthing, Training Sites and Signatures Generation
Ground truth for image classification is the actual LULC information for each LULC category. In this research, ground truth was obtained from a topographical map, visual interpretation of the Landsat and SPOT 5 images, and a field survey using GPS. The field survey was carried out along the main road, and a picture of each LULC category was recorded using a digital camera. Fig. 3.34 portrays samples of the LULC categories.
The ground truth information gained from these sources was used to assist in digitizing homogeneous areas on the Landsat image using polygon tools. These polygons of homogeneous areas are called training sites or training areas. Training areas for each LULC Level II category were digitized over the Landsat image based on visual interpretation of recognizable homogeneous areas. The classified image resulting from unsupervised classification was also used to help identify homogeneous areas for use as training areas in the supervised classification. Visual interpretation of the SPOT 5 image was a useful aid for clustering objects because of the high clarity and resolution of this image. Digitization of the training areas was done in ArcGIS. A polygon shape layer was first prepared and assigned the RSO projection system. All the training areas are shown in Fig. 3.35.
In order to distinguish among the training areas, each of them was assigned a specific code as follows: 0-forest, 1-lake/river, 2-urban, 3-small vegetation/cropland, 4-cut slope/barren land, 6-open land. Before classification was carried out, spectral signatures were derived from each training area. The spectral signatures were used to examine the level of separation between each pair of classes. Signature separability affects the accuracy of the classification.
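The derivation of signatures from coded training areas can be illustrated with a minimal sketch. It assumes that a signature is summarized by the per-band mean and variance of the training pixels of each class; the pixel values and the function name `class_signatures` are hypothetical and are not taken from the study data or from ArcGIS.

```python
from statistics import mean, variance

# Hypothetical training pixels: (class code, (band 5, band 4, band 3) values).
# The numbers are invented for illustration, not taken from the study data.
training_pixels = [
    (0, (60, 90, 25)), (0, (62, 94, 27)), (0, (58, 92, 26)),
    (2, (120, 80, 95)), (2, (126, 84, 99)), (2, (123, 76, 97)),
]

def class_signatures(pixels):
    """Return {class code: (per-band means, per-band sample variances)}."""
    by_class = {}
    for code, bands in pixels:
        by_class.setdefault(code, []).append(bands)
    signatures = {}
    for code, rows in by_class.items():
        columns = list(zip(*rows))  # one tuple of values per band
        signatures[code] = (
            [mean(col) for col in columns],
            [variance(col) for col in columns],
        )
    return signatures

signatures = class_signatures(training_pixels)
```

Classes whose means lie close together relative to their variances are the ones expected to show low separability.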
Fig. 3.34 Various ground truth for LULC classification
Fig. 3.35 Training areas for image classification of the Landsat image
[Figure panel labels: forest, cut slope; cropland; tea plantation; river, lake; urban, built up; barren/open land, crop land; tea plantation, forest; cut slope, thin vegetation]
The signature separability of the classes is presented in a dendrogram. ESRI [63] describes a dendrogram as a diagram that shows the attribute distances between each pair of sequentially merged classes. The diagram is arranged so that the members of each pair of classes to be merged are neighbors in the diagram, which avoids crossing lines. In a dendrogram, the classes or clusters in a signature file are arranged relative to one another using the multidimensional distance separating the classes in attribute space. The signature file of the Landsat image was used as the input for the dendrogram function. The output of the dendrogram is shown in Table 3.12.
The dendrogram is read as follows. Firstly, urban/built up (2), open land (6), and cut slope/barren land (4) have low separability: urban/built up and open land merge at a distance of 1.380, and urban/built up and cut slope/barren land merge at a distance of 1.490. Small vegetation/cropland (3) is separated from urban (2) by a larger distance of 3.612. Meanwhile, forest (0) can be easily separated from urban (2) and lake/river (1) because the distances are larger, 3.824 and 11.985 respectively. The term distance here refers to the distance between pairs of classes, calculated from their means and variances, which can be found in the output file of the signature generation. It can also be viewed as a distance in multidimensional space.
Low separability between pairs of classes reduces the accuracy of image classification. The low separability of urban/built up (2), open land (6) and cut slope/barren land (4) was overcome by merging open land and cut slope/barren land into one class. The urban/built up class was clearly distinguishable from open land/barren land/cut slope (Fig. 3.36), where it is shown in red and the surrounding areas are open land and barren land. Based on these signatures, the final classification was expected to have five LULC categories: urban/built up, forest, lake/river, cropland/thin vegetation/small vegetation, and open land/barren land/cut slope.
Table 3.12 Dendrogram of the Landsat image Signatures
Distances between Pairs of Combined Classes (in the sequence of merging)
Remaining Class    Merged Class    Between-Class Distance
2                  6               1.379886
2                  4               1.489691
2                  3               3.611925
0                  2               3.823692
0                  1               11.985283
Line width of Dendrogram: 78
Dendrogram of d:\_arcgi~1\main_d~1\creates_le7_20sep.gsg
[Dendrogram plot: classes listed from top to bottom as 6, 2, 4, 3, 0, 1; horizontal axis DISTANCE from 0 to 12.0 with ticks at 0, 1.3, 2.7, 4.0, 5.3, 6.7, 8.0, 9.3, 10.7, 12.0]
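The merge sequence of Table 3.12 can also be read programmatically. The following is a minimal sketch that groups the classes merging below a chosen distance; the cut-off of 2.0 and the function name `groups_below` are hypothetical and were not used in the study.

```python
# Merge sequence copied from Table 3.12: (remaining class, merged class, distance).
merges = [
    (2, 6, 1.379886),
    (2, 4, 1.489691),
    (2, 3, 3.611925),
    (0, 2, 3.823692),
    (0, 1, 11.985283),
]

def groups_below(merges, threshold):
    """Group the classes whose merge distance is below `threshold`."""
    parent = {}

    def find(c):
        parent.setdefault(c, c)
        while parent[c] != c:
            c = parent[c]
        return c

    for remaining, merged, distance in merges:
        root_r, root_m = find(remaining), find(merged)
        if distance < threshold:
            parent[root_m] = root_r  # the merged class joins the remaining class

    groups = {}
    for c in list(parent):
        groups.setdefault(find(c), []).append(c)
    return sorted(sorted(g) for g in groups.values())

THRESHOLD = 2.0  # hypothetical cut-off, not a value used in the study
low_separability = groups_below(merges, THRESHOLD)
```

With this cut-off, classes 2, 4 and 6 fall into one group while classes 0, 1 and 3 remain separate. Note that the study did not merge all three: only open land (6) and cut slope/barren land (4) were combined, because urban/built up (2) could still be distinguished visually.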
3.4.11.3 Image Classification and Raster Generalization
The next step after collecting the object signatures and analyzing their separability was to undertake supervised classification. This process assigns each cell of the satellite image of the study area to a known class using the signature statistics of each class. The signature file contains the multivariate statistics of each class or cluster that are necessary to conduct image classification. The expected result is a map that partitions the study area into known LULC classes.
A supervised image classification was applied to the Landsat image acquired on 20 September 2001, using the reflectances of bands 5, 4 and 3. It was conducted using the maximum likelihood classifier. The resulting LULC map comprises five classes, namely urban/built up, forest, lake/river/water-like surface, cropland/thin vegetation/small vegetation, and open land/barren land/cut slope (Fig. 3.37).
The pixel size of the map is 30 m x 30 m.
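The principle of maximum likelihood classification can be illustrated with a minimal sketch. A full maximum likelihood classifier normally uses the covariance matrix of each class; to stay short, this sketch assumes uncorrelated bands (per-band variances only) and equal prior probabilities, so it is a simplification and not the ArcGIS implementation. The signature values and function names are hypothetical.

```python
import math

# Hypothetical class signatures: per-band means and variances for bands 5, 4, 3.
# Values are invented for illustration; real signatures come from the training areas.
signatures = {
    "forest": ([60.0, 92.0, 26.0], [4.0, 4.0, 1.0]),
    "urban/built up": ([123.0, 80.0, 97.0], [9.0, 16.0, 4.0]),
}

def log_likelihood(pixel, means, variances):
    """Gaussian log-likelihood assuming uncorrelated bands (diagonal covariance)."""
    total = 0.0
    for x, mu, var in zip(pixel, means, variances):
        total += -0.5 * math.log(2.0 * math.pi * var) - (x - mu) ** 2 / (2.0 * var)
    return total

def classify(pixel, signatures):
    """Assign the pixel to the class with the highest log-likelihood."""
    return max(signatures, key=lambda name: log_likelihood(pixel, *signatures[name]))

# Two hypothetical pixels, one close to each class mean.
labels = [classify(p, signatures) for p in [(61, 93, 27), (120, 82, 95)]]
```

Each pixel is assigned to the class under which its band values are most probable, which is why poorly separated signatures lead directly to misclassified pixels.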
Fig. 3.36 Separability of urban from the surroundings as seen on the classified image and the corresponding satellite image and topographic map (panels: satellite image, classified image, topographic map)
Fig. 3.37 Land Use Land Cover Map
The classified image contains a large number of small clusters of pixels, usually called 'noise pixels'. Before the classified image was used further, raster generalization was applied to it, either to clean up small erroneous pixels in the raster data or to generalize the data by removing or smoothing unnecessary detail for a more general analysis. The erroneous pixels may be unclassified data originating from the satellite image. Raster generalization was performed using the Majority Filter method. This method replaces cells in a raster based on the majority of their contiguous neighboring cells.
Contiguous means sharing an edge for a kernel of EIGHT and sharing a corner for a kernel of FOUR. This work used a kernel of EIGHT. A detailed explanation of this subject is given in ESRI [63].
An illustration of raster generalization is given in Fig. 3.38a. The input raster contains small clustered pixels such as pixels with values of 6, 0 -3, and 2. After Majority Filter generalization is applied, these erroneous pixels are replaced in the output raster by the majority value of their contiguous neighboring cells. Fig. 3.38b portrays the result of generalization applied to the classified image for part of the study area. The left picture is the original classified image, while the right picture displays the image after generalization. Erroneous pixels within the delineated lines (left picture) were removed (right picture).
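The effect of a majority filter can be sketched in a few lines. This is a simplified illustration and not the ArcGIS Majority Filter tool: it uses the eight neighbors of each cell, replaces the cell when more than half of them share one value, leaves edge cells unchanged, and omits the contiguity test. The raster values and the function name `majority_filter` are hypothetical.

```python
from collections import Counter

def majority_filter(grid):
    """Replace a cell when more than half of its 8 neighbors share one value.

    Simplified: edge cells are left unchanged and the contiguity test is omitted.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            neighbors = [
                grid[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            ]
            value, count = Counter(neighbors).most_common(1)[0]
            if count > 4:  # majority of the eight neighbors
                out[r][c] = value
    return out

# Hypothetical 5 x 5 classified raster: isolated pixels 6 and 2 inside classes 0 and 3.
classified = [
    [0, 0, 0, 3, 3],
    [0, 6, 0, 3, 3],
    [0, 0, 0, 3, 3],
    [0, 0, 0, 2, 3],
    [0, 0, 0, 3, 3],
]
smoothed = majority_filter(classified)
```

In this example the isolated pixel 6 is replaced by 0 and the isolated pixel 2 by 3, while the boundary between classes 0 and 3 is preserved.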
The area of each class was calculated after raster generalization was completed. The number of pixels of each class was multiplied by the pixel area of 30 m x 30 m, and the result is shown in Table 3.13. Forest dominates the study area, occupying 74% of the area, followed by cropland/small vegetation, which occupies 20%. The remaining small part is occupied by open land, river/lake and urban, with respective shares of 5%, 2% and less than 1% of the total area. The area values of all classes have been corrected for the effect of cloud and its shadow; the actual image contains cloud cover over around 9% of the total area. The procedure for fixing the cloud cover problem is discussed in section 3.4.15.
Table 3.13 Statistics of LULC map derived from the Landsat image
Land Use Type                              Number of pixels    Area (km2)    %
Cropland, Bushes, Thin vegetated area      197,954             178.2         20
Forest                                     738,129             664.3         74
River, Lake, Water-like surface            17,502              15.8          2
Urban, Built up                            713                 0.6           0
Open land, Cut slope                       45,685              41.1          5
Total Area                                                     900
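The area calculation can be checked with a short sketch. The pixel counts below are paired with the classes so that count x 900 m2 reproduces the areas reported in Table 3.13; the variable names are illustrative only.

```python
PIXEL_AREA_M2 = 30 * 30  # one 30 m x 30 m Landsat pixel

# Pixel counts paired with classes so that count x 900 m2 matches the
# areas reported in Table 3.13.
pixel_counts = {
    "Cropland, bushes, thin vegetated area": 197_954,
    "Forest": 738_129,
    "River, lake, water-like surface": 17_502,
    "Urban, built up": 713,
    "Open land, cut slope": 45_685,
}

# Convert pixel counts to km2 (1 km2 = 1,000,000 m2) and to shares of the total.
areas_km2 = {name: n * PIXEL_AREA_M2 / 1_000_000 for name, n in pixel_counts.items()}
total_km2 = sum(areas_km2.values())
shares = {name: 100 * area / total_km2 for name, area in areas_km2.items()}

for name in pixel_counts:
    print(f"{name}: {areas_km2[name]:.1f} km2, {shares[name]:.0f}%")
```

Because each percentage is rounded separately, the rounded shares need not add up to exactly 100.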
Fig. 3.38 a) Illustration of generalization, b) before and after generalization
Only five LULC categories were provided in order to conform to the final landslide susceptibility map, which divides the susceptibility of the study area into five categories: very high, high, moderate, low, and very low. In addition, it was also meant to facilitate derivation of the weighting system.