isprs archives XLI B7 419 2016

(1)

Remote Sensing Image Classification Applied to the First National Geographical Information

Census of China

YU Xin a,b*, WEN Zongyong a,b, ZHU Zhaorong a,b, XIA Qiang a,b, SHUN Lan a,b a

Beijing Institute of Surveying and Mapping, Beijing, 100038, China, b

Beijing Key Laboratory of Urban Spatial Information Engineering, Beijing, 100038, China - [email protected]

Commission WG VII/4

KEY WORDS: Image Classification, High-resolution, Satellite Image, Pattern Recognition

ABSTRACT:

Image classification will still be a long way in the future, although it has gone almost half a century. In fact, researchers have gained many fruits in the image classification domain, but there is still a long distance between theory and practice. However, some new methods in the artificial intelligence domain will be absorbed into the image classification domain and draw on the strength of each to offset the weakness of the other, which will open up a new prospect. Usually, networks play the role of a high-level language, as is seen in Artificial Intelligence and statistics, because networks are used to build complex model from simple components. These years, Bayesian Networks, one of probabilistic networks, are a powerful data mining technique for handling uncertainty in complex domains. In this paper, we apply Tree Augmented Naive Bayesian Networks (TAN) to texture classification of High-resolution remote sensing images and put up a new method to construct the network topology structure in terms of training accuracy based on the training samples. Since 2013, China government has started the first national geographical information census project, which mainly interprets geographical information based on high-resolution remote sensing images. Therefore, this paper tries to apply Bayesian network to remote sensing image classification, in order to improve image interpretation in the first national geographical information census project. In the experiment, we choose some remote sensing images in Beijing. Experimental results demonstrate TAN outperform than Naive Bayesian Classifier (NBC) and Maximum Likelihood Classification Method (MLC) in the overall classification accuracy. In addition, the proposed method can reduce the workload of field workers and improve the work efficiency. Although it is time consuming, it will be an attractive and effective method for assisting office operation of image interpretation.

* Corresponding author

1. INTRODUCTION

On February 28th, 2013, the State Department of China started a national project, the first national geographical information census, which is to achieve the latest geographical information of land cover and land use all over the nation based on high-resolution remote sensing images from 2013 to 2015. The State Department of China authorized national administration of Surveying, Mapping and Geo-information to implement the project at the national level and at the same time, local government authorized local administration of Surveying, Mapping and Geo-information to implement it at the local level. Last year, firstly local administration of Surveying, Mapping and Geo-information submitted the project results to local government and then the local government submitted it to national administration of Surveying, Mapping and Geo-information. After national administration of Surveying, Mapping and Geo-information checked and accepted, it was finally submitted to the State Department of China. In this project, there are ten first level classes, forty second level classes and seventy seven third level classes of geographical information of land cover. Because different office interpreters in the project are familiar with different classes and master the interpretation features of different classes, it is very difficult for office interpreters to master the interpretation features of all classes and more difficult to exactly determine the class attribute among those easily confused classes based on high-resolution remote sensing images. However, in this case, office interpreters can differentiate the border of among different

classes based on high-resolution remote sensing images, although they are not able to determine the class attribute. Thus, through office operator’s description, we can get some irregular polygons that are not able to determine the corresponding class attribute. In this paper, we vies these irregular polygons on the high-resolution remote sensing images as objects, which are extremely similar to the concept of object achieved automatically through some kind of image segment algorithm in the image processing field. Nevertheless, the object achieved through the description of office operates is more accurate and meaningful than the object achieved automatically through the image segment algorithms. Therefore, this paper puts up a new method of objected image classification based on Bayesian networks and high-resolution remote sensing images to help office interpreters to determine what class the object belongs to. The experimental results show that the proposed method can improve the efficiency of image interpretation for office interpreters and to some extent the method can reduce the workload of field workers.

This paper is organized as follows. In section 2, we review some basic concepts of Tree Augmented Naive Bayesian Networks (TAN) and then we introduce how to apply Bayesian networks to land cover image classification in section 3. Then in section 4 we describe the experiments and test on High-resolution remote sensing images based on TAN. Finally Section 5 draws some conclusions.

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLI-B7, 2016 XXIII ISPRS Congress, 12–19 July 2016, Prague, Czech Republic

This contribution has been peer-reviewed.

(2)

2. BAYESIAN NETWORKS

In this section, we simply introduce some basic concepts about Bayesian Networks and then apply it to land cover classification of High-resolution remote sensing images.

2.1 Basic concepts

The concept of Bayesian networks was put up by Pearl in 1986, but at that time it was not paid attention to until researchers found good performance when Naive Bayesian network applied to image classification (Friedman, 1997). In fact, Bayesian networks are a combination of Bayesian Probability and Graph Theory. On one hand, it qualitatively the dependent relationship among nodes or variables intuitively in the view of graph theory and on the other hand, it quantitatively describe the corresponding dependent relationship among nodes in the view of Bayesian probability.

Figure 1. An example of Bayesian network

In Figure 1 (David Heckerman., 1997), Burgary node, Earthquake node, JohnCalls node, MarryCalls node and Alarm node represent five nodes or five random variables. The directed edge from Burgary node to Alarm node represents the dependent relationship between Burgary node and Alarm node. Furthermore, Burgary node is the parent node of Alarm node and conversely Alarm node is the children node of Burgary node. Again, Alarm node is the Earthquake node of Burgary node and is also the parent node of JohnCalls node and MarryCalls node. Accordingly, the conditional probability in the figure 1 represents the corresponding dependent relationship among those nodes. For example, the table in the middle of figure 1, the second row represents that the probability that Alarm node would happen is equal to 0.95 in the condition that the two nodes, Burgary node and Earthquake node, happen at the same time. In addition in the first condition probability table of figure 1 (left top) represents the prior probability of Burgary node, the probability that Burgary node happens without any other known information.

2.2 Inference Model

There are some kinds of Bayesian networks according to the structure and complexity of Bayesian network, among which the Naive Bayesian network is the simplest one. It assumes that all children node to the parent node are conditional independent although the conditional independent hypothesis is almost not met in the reality. Considering that the features extracting from high-resolution remote sensing images are always relevant, we select Tree Augmented Naive Bayesian Networks (TAN) to study object-oriented image classification and the inference

formula is as follow below. The details of the inference model may reference to the reference (YU Xin, 2007).

1 2 1 2

(

_i

|

,

_n

)

(

,

_n

,

_i

)

P C X X

X



P X X

X C

(1)

Where

C

_i different classes

X

=different features extracting from high-resolution remote sensing images.

3. BAYESIAN NETWORKS APPLIED TO LAND COVER IMAGE INTERPRETATION

3.1 Feature description and extracting

Feature description and extracting play an important role in the image analysis and understanding. Usually, texture features are extracted and they are divided into statistical texture feature and structural texture feature. In the aspect of statistical texture feature, according to Gray-Level Co-occurrence Matrix, this paper extracts angular second moment, entropy and Inverse Difference Moment. The corresponding formulas are below.



 details reference to the reference (YU Xin, 2008).

3.2 Application case

Figure 2 is an example of Bayesian networks applied to land cover image classification.

Figure 2. Bayesian network applied to image classification

In the figure 2, the node C is a attribute class node (or variable) and Xi describes one certain kind of texture feature extracting from some classification unit or object, which may be statistical texture feature or structural texture feature. According to the

definition of Bayesian network,

P C

_a

( )

 

，

oriented image classification, the mainly processing steps are as follows below.

randomly select some samples as train samples from sample database;

(3)

extract the statistical texture feature and structural texture feature based on the train samples;

learn Bayesian network’s parameters based on prior information and train sample information;

randomly select some samples as test samples from sample database and extract the corresponding texture features same as train samples;

calculate the posterior probability of each test sample based on the inference model;

determine the class attribute based on the maximum posterior probability principle;

evaluate the classification results based on confused matrix to achieve classification accuracy.

4. EXPERIMENTS AND ANALYSIS

In this section, we simply introduce experimental data and results and analysis.

4.1 Experimental data

The experimental data in this paper came from the remote sensing images in the first national geographical information census project. There are colourized remote sensing images, which are achieved in fall, 2013. The spatial resolution of remote sensing images in the plain area is 0.19 meters and the spatial resolution of remote sensing images in the mountain area is 0.48 meters. In fact, the three classes, coniferous forest, broad-leaved forest and mixed broadleaf-conifer forest, are most confused and most difficult for office interpreters to distinguish based on coloured high-resolution remote sensing images. Thus, in order to show the performance of the proposed method, we only consider the three classes in the experiments. An example of each class is displayed below. In the figure 3 (An sample of each class), the red scope represents the border of one certain class, which can be manually described by office interpreter.

(a) broad-leaved forest (b) coniferous forest

(c) mixed broadleaf-conifer forest Figure 3. An sample of each class

4.2 Results analysis

During the land cover interpretation, different numbers of train samples and classification methods both have an influence on

classification accuracy to some extent. Thus, in this experiments, we choose maximum likelihood classification method (MLC), which is most usual in the image classification field, and Naive Bayesian network Classification method (NBC), which is the simplest method of Bayesian networks, to compare with the proposed method (Tree Augmented Naive Bayesian Networks ,TAN) based on the classification accuracy. In addition, we test classification accuracy in the condition of different numbers of train samples, which changes from 5 to 30. The experimental results of classification accuracy are listed in the following table 1.

N 5 10 15 20 25 30

MLC 78.94 80 82.35 87.5 86.7 85.7 NBC 82.11 82.22 85.88 87.5 88 87.1 TAN 83.16 83.33 88.24 89.4 92 91.4 Table 1. Comparisons based on classification accuracy

In table 1, N in the first row represents the number of train samples (from 5 to 30) and the first column represents different classification method (MLC, NBC and TAN). From the above table, we can draw some conclusions below.

(1) The proposed method is effective and efficient;

(2) As the number of train sample increases, the classification accuracy of the three method all increase. But when the number of train sample is over 20, the classification of the three method become relatively stable.

(3) Among the three methods, the two methods (MLC and NBC) are almost the same in the view of classification accuracy, but their classification accuracy are not as good as the classification accuracy of the proposed method.

5. CONCLUSIONS

During the national geographical information census in China, considering that it is easy for office interpreters to describe the border among different classes, but it is difficult for office interpreters to determine the class attribute, we regard irregular polygon within the border described manually by office interpreters as one object or one minimum classification unit and put up a new method of object-oriented land cover image classification based on Bayesian networks and high-resolution remote sensing images to help office interpreters to recognize or determine the class attribute. Experiments and analysis show that the proposed method is effective and efficient, and it performs better than MLC and NBC. In addition, the proposed method can reduce the workload of field workers and improve the work efficiency.

ACKNOWLEDGEMENTS

This paper is financially supported by National Natural Science Foundation of China Project (No. 41401499), Beijing Science Foundation of China Project (No. 4154073), 2016 Beijing Nova Program Project and Beijing Key Laboratory of Urban Spatial Information Engineering Project (No. 2015204). The authors wish to thank the anonymous reviewers for the comments and suggestions on this paper.

REFERENCES

Baatz M, Schäpe A. Object-oriented and multi-scale image analysis in semantic networks[C]. Proceedings of the 2nd International Symposium on Operationalization of Remote Sensing, Enschede, ITC, 1999:16-20.

(4)

David Heckerman., 1997. Bayesian Networks for Data Mining. Data Mining and Knowledge Discovery 1, pp.79~119.

Friedman, N., Geiger, D., Goldszmidt, M., 1997. Bayesian Network classifiers, Machine Learning, vol.29, pp.131~163.

Yu Xin, Zheng Zhaobao, LI Linyi, Ye ZhiWei., 2005. Texture Classification of High-resolution satellite images based on PCA-NBC, Proceedings of SPIE - The International Society for Optical Engineering, v 6044, MIPPR 2005: Image Analysis Techniques.

Jiebo Luo, Andreas E.Savakis, Amit Singhal., 2005. A Bayesian network-based framework for semantic image understanding. Pattern Recognition 38, pp.919~934.

Jinn Ho, Wen-Liang Hwang, "Wavelet Bayesian Network Image Denoising[J]," IEEE Transactions on Image Processing, volume 22, number 4, pages 1277 – 1290，April 2013. Jerónimo Hernández-González, Author Vitae, Iñaki Inza Author Vitae, Jose A. Lozano. Learning Bayesian network classifiers from label proportions[J]. Pattern Recognition. Volume 46, Issue 12, December 2013, Pages 3425–3440.

L. Enrique Sucara, Concha Bielzab, Eduardo F. Moralesa, Pablo Hernandez-Leala, Julio H. Zaragozac, Pedro Larrañaga. Multi-label classification with Bayesian network-based chain classifiers[J]. Pattern Recognition Letters. Available online 20 November 2013.

Marina Velikovaa, Peter J.F. Lucasa, b, Maurice Samulskic, Nico Karssemeijer. On the interplay of machine learning and background knowledge in image interpretation by Bayesian networks[J]. Artificial Intelligence in Medicine. Volume 57, Issue 1, January 2013, Pages 73–86.

Ritaban Duttaa, Author Vitae, Anthony G. CohnbAuthor Vitae, Jen M. Muggletonc. 3D mapping of buried underworld infrastructure using dynamic Bayesian network based multi-sensory image data fusion[J]. Journal of Applied Geophysics. Volume 92, May 2013, Pages 8–19.

X Wu, X Wen, J Li, L Yao. A new dynamic Bayesian network approach for determining effective connectivity from fMRI data[J]. Neural Computing and Applications. 2014(24):P.91-97.

Xin YU，Bogang YANG， Pingxiang CHEN， Guangjun JIA，Haitao ZHANG，Texture Classification of High-resolution satellite images Based on Bayesian Network Augmented Naive Bayes[C]，XXII International Society for Photogrammetry & Remote Sensing , Melbourne， 2012.08.25-09.01.

Yu Xin, Zheng Zhaobao, YE Zhiwe, TIAN Liqiao., 2007. Texture Classification Based on Tree Augmented Naive Bayes Classifier, Geomatics And Information Science of Wuhan University, Vol.32, No.4., p. 287~289.

YU Xin, ZHENG Zhaobao, ZHANG Haitao, YE Zhiwei, 2008. Texture Classification of High-resolution satellite images based on Bayesian network augmented naïve Bayes[C]. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B7. Beijing, pp.497-501.