Copyright is owned by the Author of the thesis. Permission is given for
a copy to be downloaded by an individual for the purpose of research and
private study only. The thesis may not be reproduced elsewhere without
the permission of the Author.
UNSUPERVISED CLUSTERING OF SPECTRAL SIGNATURES IN LANDSAT IMAGERY
by Brian C. Clement B.E. (Hons)
December 1977
A Thesis presented in partial fulfilment of the requirements for the degree of
Master of Arts in Computer Science at Massey University
ABSTRACT
This thesis describes an investigation into the automatic recognition of satellite imagery from the LANDSAT Project. Clustering techniques are shown to be the most suitable; of the three clustering algorithms investigated, the k-means algorithm is shown to be the most effective. The need to perform edge detection on the images prior to clustering is also demonstrated, and a suitable algorithm for edge detection is described.
Indexing terms: clustering, LANDSAT Satellite project, pattern recognition, Satellite data
ACKNOWLEDGEMENTS
I wish to express my thanks to my project supervisor, Mr K.J. Hopper, for his encouraging assistance during the preparation of this thesis, and to Dr. B.E. Carpenter for introducing me to this area of Artificial Intelligence.
TABLE OF CONTENTS

1 Introduction ..... 1
    1.1 Data Requirements ..... 1
    1.2 Remote Sensing ..... 2
    1.3 LANDSAT ..... 4
    1.4 The Research Objective ..... 5
2 Pattern Recognition ..... 6
    2.1 General ..... 6
    2.2 Choice of a Technique for LANDSAT Data ..... 8
    2.3 Supervised and Unsupervised Recognition ..... 8
    2.4 Clustering Principles ..... 9
3 Current Satellite Imagery Research ..... 12
    3.1 Traditional Approaches ..... 12
    3.2 Important Aspects ..... 14
4 Problem Approach ..... 15
    4.1 General Data Manipulation ..... 15
    4.2 Edge Detection ..... 15
    4.3 Classification ..... 16
        4.3.1 Shared Near Neighbour ..... 16
        4.3.2 The Divisive Approach ..... 16
        4.3.3 The K-means Algorithm ..... 17
5 General Data Manipulation ..... 20
    5.1 File Handling ..... 20
    5.2 Display Routines ..... 21
    5.3 Analysis Routines ..... 22
6 Edge Detection ..... 28
    6.1 Fuzziness ..... 28
    6.2 Differentiating ..... 28
    6.3 Finding Boundaries ..... 29
    6.4 Correlating Boundary Information ..... 31
7 The Algorithms Used ..... 41
    7.1 Shared Near Neighbour ..... 41
    7.2 MAXFINDER - the Divisive Approach ..... 42
    7.3 CENTREFINDER - the K-means Approach ..... 44
8 Project Assessment ..... 48
    8.1 Practical Problems ..... 48
    8.2 Boundary Points ..... 48
    8.3 Classification ..... 49
    8.4 Overall Success ..... 49
    8.5 Suggestions for Further Work ..... 49
References ..... 50
Bibliography ..... 52
Annex A ..... 53
Annex B ..... 55
Annex C ..... 57
Annex D ..... 59
Annex E ..... 61
Annex F ..... 63
Annex G ..... 65
Annex H ..... 67
Annex I ..... 71
Annex J ..... 77
Annex K ..... 81
Annex L ..... 84
LIST OF ILLUSTRATIONS
Figure ..... Page

1 The LANDSAT Multispectral Scanner ..... 5
2 The K-means Algorithm ..... 18
3a The output produced by ALLLEVELS for Band 7 showing the intensity levels recorded for the entire test area ..... 23
3b Comparison of Image registration ..... 24
4 The output produced by SHADES for Band 7. A shaded version of fig 3a ..... 25
5 The output produced by HISTOGRAMMER for Band 7 ..... 27
6 The output produced by DISTRIBUTION/ANALYSER for Bands 4 and 6 showing the correlation between the two ..... 27
7 The output produced by DIFFERENTIATOR for Band 7 showing the edges detected ..... 30
8a Boundary points detected in Band 4 for T = 10% ..... 32
8b Boundary points detected in Band 5 for T = 10% ..... 32
8c Boundary points detected in Band 6 for T = 10% ..... 33
8d Boundary points detected in Band 7 for T = 10% ..... 33
9a Boundary points detected in Band 4 for T = 15% ..... 34
9b Boundary points detected in Band 5 for T = 15% ..... 34
9c Boundary points detected in Band 6 for T = 15% ..... 35
9d Boundary points detected in Band 7 for T = 15% ..... 35
10a Merged boundaries for T = 10%. Boundary points appearing in at least two of the four files (fig 8) are included ..... 37
10b Merged boundaries for T = 10%. Boundary points appearing in at least three of the four files (fig 8) are included ..... 37
10c Merged boundaries for T = 10%. Only boundary points appearing in all four files (fig 8) are included ..... 38
11a Merged boundaries for T = 15%. Boundary points appearing in at least two of the four files (fig 9) are included ..... 39
11b Merged boundaries for T = 15%. Boundary points appearing in at least three of the four files (fig 9) are included ..... 39
11c Merged boundaries for T = 15%. Only boundary points appearing in all four files (fig 9) are included ..... 40
12a Clustered output from SHAREDNN for k = 12 ..... 43
12b Clustered output from SHAREDNN for k = 13 ..... 43
13a Clustered output from CENTREFINDER ..... 46
13b Clustered output from CENTREFINDER with boundary points classified ..... 46
13c Ground truth for the test area ..... 47
A.1 Structure diagram of FILEMAKER ..... 54
B.1 Structure diagram of FLIPPER ..... 56
C.1 Structure diagram of ALLLEVELS ..... 58
D.1 Structure diagram of SHADES ..... 60
E.1 Structure diagram of HISTOGRAMMER ..... 62
F.1 Structure diagram of DISTRIBUTION/ANALYSER ..... 64
G.1 Structure diagram of DIFFERENTIATOR ..... 66
H.1 Structure diagram of BOUNDARYFINDER (i) ..... 68
H.2 Structure diagram of BOUNDARYFINDER (ii) ..... 69
H.3 Structure diagram of BOUNDARYFINDER (iii) ..... 70
I.1 Structure diagram of BOUNDARYMERGER (i) ..... 73
I.2 Structure diagram of BOUNDARYMERGER (ii) ..... 74
I.3 Structure diagram of BOUNDARYMERGER (iii) ..... 75
I.4 Structure diagram of BOUNDARYMERGER (iv) ..... 76
J.1 Structure diagram of SHAREDNN (i) ..... 78
J.2 Structure diagram of SHAREDNN (ii) ..... 79
J.3 Structure diagram of SHAREDNN (iii) ..... 80
K.1 Structure diagram of MAXFINDER (i) ..... 82
K.2 Structure diagram of MAXFINDER (ii) ..... 83
L.1 Structure diagram of CENTREFINDER (i) ..... 86
L.2 Structure diagram of CENTREFINDER (ii) ..... 87