

5.1.2 Image classification

5.1.2.1 Deterministic Classification

In deterministic classification, a pixel falls into one and only one category, devoting 100% of its information to the class it is assigned. Deterministic classification can be further divided into four categories: (i) manual classification, (ii) semi-automated (supervised) digital classification, (iii) automated (unsupervised) digital classification and (iv) knowledge-based classification.

5.1.2.1.1 Manual Classification

Manual classification of satellite data is done by delineating polygons around pixels that belong to the same class, based on colour, texture, tone, pattern and shape. If the analyst has prior knowledge of the area to be classified as well as knowledge of the reflectance properties of the ground materials, manual classification is considered an effective method of classifying land cover. It can be done through on-screen digitization or by classifying hard copies.

5.1.2.1.2 Automated Supervised Classification

Supervised classification can be defined as the process of using samples of known identity to classify pixels of unknown identity (Kumar and Singh, 2013). Samples of known identity are those pixels located within training areas. In this method, the area of interest is delineated and stored for use in the supervised classification algorithm. The user defines the original pixels that contain similar spectral classes representing a certain land cover class. In other words, in supervised classification, spectral signatures are collected from specified locations (training sites) in the image by digitizing polygons overlaying different land use types. The spectral signatures are then used to classify all pixels in the scene. Generally, supervised classification is followed by knowledge-based expert classification systems that rely on reference maps to improve the accuracy of the classification process (Xioling et al., 2006; Berberoglu et al., 2007). As the "User-Defined Polygon" function involves a high degree of user control, it reduces the chance of underestimating class variance in supervised classification (Moller-Jensen, 1997).

The basic steps involved in a typical supervised classification procedure are:

I. The Training Stage

In this step, training fields, i.e. areas of known identity, are delineated on a satellite image. It is necessary for the analyst to know the correct class for each area. The objective is to identify a set of pixels that accurately represents the spectral variation present within each information region (Kumar and Singh, 2013). The key characteristics of a training area are shape, location, number, placement and uniformity.
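The training stage can be sketched in a few lines of Python. In this illustrative example, boolean masks stand in for the polygons an analyst would digitize, and the per-band mean and standard deviation of the training pixels form each class's spectral signature. The image, the mask positions and the class names are all assumptions made for the sketch, not values from the text.

```python
import numpy as np

# Hypothetical 3-band image, 100 x 100 pixels (values are illustrative only)
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(100, 100, 3)).astype(float)

# Training fields delineated by the analyst, represented here as boolean masks
water_mask = np.zeros((100, 100), dtype=bool)
water_mask[:10, :10] = True
forest_mask = np.zeros((100, 100), dtype=bool)
forest_mask[50:60, 50:60] = True
training = {"water": water_mask, "forest": forest_mask}

# For each class, extract its spectral signature: per-band mean and spread
signatures = {}
for name, mask in training.items():
    pixels = image[mask]                      # shape (n_pixels, n_bands)
    signatures[name] = {
        "mean": pixels.mean(axis=0),          # mean vector, one value per band
        "std": pixels.std(axis=0),            # variation within the training field
    }

for name, sig in signatures.items():
    print(name, sig["mean"].round(1), sig["std"].round(1))
```

The mean vectors and standard deviations computed here are exactly the quantities the decision rules below consume.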

II. Selection of Appropriate Classification Algorithm

Various supervised classification algorithms may be used to assign an unknown pixel to one of a number of classes. Parametric classification algorithms assume that the observed measurement vectors for each class in each spectral band during the training phase of the supervised classification are Gaussian in nature, i.e. they are normally distributed.

Nonparametric classification algorithms make no such assumption. Among the most frequently used classification algorithms are the parallelepiped, minimum distance and maximum likelihood decision rules.

Parallelepiped Classification Algorithm

This is a widely used decision rule based on simple Boolean "and/or" logic. Leica (2002a) describes the basic concept of the parallelepiped decision rule: the data file values of the candidate pixel are compared with the a priori upper and lower limits of every signature in every band. When the data values of a pixel fall between the threshold values in the signature data for each band, that pixel is automatically assigned to the class that corresponds to the signature. As shown in Fig. 5.1, all the pixels inside the rectangle, i.e. the parallelepiped, are members of the same land cover class. Pixels far from the centre and near the corners usually do not have the same attributes as the other members of the class. Apart from that, several pixels may fall outside all parallelepipeds; in such a case, the analyst defines some parametric decision rules to distribute them into the available land cover classes, or they are labelled unclassified. The parallelepiped algorithm is considered a computationally efficient method of classifying remote sensing data. Its disadvantage is that the parallelepipeds sometimes overlap, so an unknown pixel may satisfy the criteria of more than one class; in such a case, it is assigned to the first class for which it meets all the criteria.
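The box test and the "first match wins" rule above can be sketched as follows. The per-band limits are illustrative stand-ins for thresholds that would in practice be derived from training signatures.

```python
import numpy as np

# Per-class band limits (lower, upper) per band -- illustrative values only.
# In practice these come from the minimum/maximum of each training signature.
limits = {
    "water":  (np.array([0, 0, 40]),   np.array([60, 60, 120])),
    "forest": (np.array([20, 60, 10]), np.array([90, 140, 70])),
}

def parallelepiped(pixel, limits):
    """Assign the pixel to the first class whose box contains it in every band,
    otherwise label it unclassified."""
    for name, (lo, hi) in limits.items():
        if np.all((pixel >= lo) & (pixel <= hi)):
            return name            # first match wins when boxes overlap
    return "unclassified"

print(parallelepiped(np.array([30, 30, 80]), limits))     # inside the water box
print(parallelepiped(np.array([255, 255, 255]), limits))  # outside every box
```

Because the loop returns on the first containing box, overlapping parallelepipeds resolve exactly as the text describes: the pixel goes to the first class whose criteria it meets.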

Fig. 5.1: Parallelepiped Classification

Minimum distance to means classification algorithm

Like the parallelepiped algorithm, it requires that the user provide the mean vector for each class in each band from the training data. To perform minimum distance classification, a program must calculate the distance from each unknown pixel to each class mean vector. This distance can be calculated as the Euclidean distance, based on the Pythagorean theorem.
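A minimal sketch of the minimum-distance rule, assuming two hypothetical class mean vectors from the training stage:

```python
import numpy as np

# Class mean vectors from the training stage (illustrative values only)
means = {
    "water":  np.array([30.0, 30.0, 80.0]),
    "forest": np.array([50.0, 100.0, 40.0]),
}

def minimum_distance(pixel, means):
    """Assign the pixel to the class whose mean vector is nearest,
    using the Euclidean distance (Pythagorean theorem in n bands)."""
    distances = {name: np.linalg.norm(pixel - mu) for name, mu in means.items()}
    return min(distances, key=distances.get)

print(minimum_distance(np.array([35.0, 40.0, 75.0]), means))  # nearest to water
```

Note that this rule uses only the class means, ignoring the variance of each class; that limitation is what the maximum likelihood classifier below addresses.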

Fig. 5.2: Minimum distance to mean classification

Maximum likelihood classification algorithm

The basic assumptions of the Maximum Likelihood Classifier (MLC) algorithm are that a pixel has a certain probability of belonging to a particular class, that these probabilities are equal a priori for all classes, and that the input data in each band follow the Gaussian (normal) distribution function (Lillesand et al., 2004). The MLC functions by using the band means and standard deviations from field-collected data in order to project land cover classes as centroids in feature space (Muzein, 2006). The Maximum Likelihood Classification tool considers both the variances and covariances of the class signatures when assigning each cell to one of the classes represented in the signature file. With the assumption that the distribution of a class sample is normal, a class can be characterized by its mean vector and covariance matrix. Given these two characteristics for each cell value, the statistical probability is computed for each class to determine the membership of the cell. When the default EQUAL a priori option is specified, each cell is classified into the class to which it has the highest probability of belonging. The advantage of the Maximum Likelihood method is its use of well-developed probability theory.
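With equal priors, maximizing the Gaussian class probability reduces to comparing log-likelihoods built from each class's mean vector and covariance matrix. A minimal sketch, with illustrative means and covariances standing in for a real signature file:

```python
import numpy as np

# Per-class mean vectors and covariance matrices estimated from training data
# (illustrative values; in practice these come from the signature file)
classes = {
    "water":  (np.array([30.0, 30.0, 80.0]),  np.eye(3) * 25.0),
    "forest": (np.array([50.0, 100.0, 40.0]), np.eye(3) * 100.0),
}

def log_likelihood(pixel, mean, cov):
    """Log of the multivariate Gaussian density; constants shared by all
    classes are omitted since only the ranking matters."""
    diff = pixel - mean
    return -0.5 * (np.log(np.linalg.det(cov))
                   + diff @ np.linalg.inv(cov) @ diff)

def maximum_likelihood(pixel, classes):
    """Assign the pixel to the class with the highest likelihood,
    assuming equal a priori probabilities (the EQUAL option)."""
    scores = {name: log_likelihood(pixel, mu, cov)
              for name, (mu, cov) in classes.items()}
    return max(scores, key=scores.get)

print(maximum_likelihood(np.array([35.0, 40.0, 75.0]), classes))
```

Unlike the minimum-distance rule, the covariance term lets a spectrally broad class claim pixels that lie farther from its mean, which is exactly why the MLC uses both variances and covariances.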

Fig. 5.3: Maximum likelihood classification

5.1.2.1.3 Unsupervised Classification

The unsupervised approach attempts a spectral grouping that may have an unclear meaning from the user's point of view. Unsupervised classifiers do not utilize training data as the basis for classification. Rather, this family of classifiers involves algorithms that examine the unknown pixels in an image and aggregate them into a number of classes based on the natural groupings or clusters present in the image values. It performs very well in cases where the values within a given cover type are close together in the measurement space and data in different classes are comparatively well separated.

The classes that result from unsupervised classification are spectral classes; because they are based solely on the natural groupings in the image values, their identity is not initially known. The analyst must compare the classified data with some form of reference data (such as larger-scale imagery or maps) to determine the identity and informational value of the spectral classes.

There are numerous clustering algorithms that can be used to determine the natural spectral groupings present in a dataset. One common form of clustering, the "k-means" approach, also known as ISODATA (Iterative Self-Organizing Data Analysis Technique), accepts from the analyst the number of clusters to be located in the data. Because the k-means approach is iterative, it is computationally intensive; therefore, it is often applied to image subareas rather than to full scenes.
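The iterative assign-and-update loop of k-means can be sketched as follows, on two synthetic, well-separated spectral groups (all data here are fabricated for illustration). Each pass assigns every pixel to its nearest centre and then recomputes each centre as the mean of its members, which is the source of the computational cost noted above.

```python
import numpy as np

rng = np.random.default_rng(1)
# Two well-separated spectral groups of pixels (2 bands, illustrative data)
pixels = np.vstack([rng.normal(30, 5, (50, 2)),
                    rng.normal(120, 5, (50, 2))])

k = 2  # number of clusters requested by the analyst
centers = pixels[rng.choice(len(pixels), k, replace=False)]

for _ in range(10):
    # Assignment step: label each pixel with its nearest cluster centre
    d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    # Update step: move each centre to the mean of its assigned pixels
    centers = np.array([pixels[labels == j].mean(axis=0) for j in range(k)])

print(centers.round(1))  # the two spectral cluster centres found in the data
```

The resulting labels are spectral classes only; as the text notes, attaching informational meaning to them still requires comparison with reference data.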

Advantages of unsupervised classification

Advantages of unsupervised classification (relative to supervised classification) can be enumerated as follows:

a) No extensive prior knowledge of the region is required
b) The opportunity for human error is minimized
c) Unique classes are recognized as distinct units

Disadvantages of unsupervised classification

Disadvantages and limitations arise primarily from a reliance on "natural" groupings and from difficulties in matching these groups to the informational categories that are of interest to the analyst.

a) Unsupervised classification identifies spectrally homogeneous classes within the data; these classes do not necessarily correspond to the information categories that are of interest to the analyst. As a result, the analyst is faced with the problem of matching the spectral classes generated by the classification to the informational classes required by the ultimate user of the information.

b) Spectral properties of specific information classes will change over time (on a seasonal basis as well as over the years). As a result, relationships between informational classes and spectral classes are not constant, and relationships defined for one image can seldom be extended to others.