GPU-based Acceleration of Interval Type-2 Fuzzy C-Means Clustering for Satellite Imagery Land-Cover Classification


Long Thanh Ngo, Dinh Sinh Mai, Mau Uyen Nguyen

Department of Information Systems, Faculty of Information Technology, Le Quy Don Technical University, Hanoi, Vietnam

E-mail: [email protected], [email protected], [email protected]

Abstract—When processing large data such as satellite images, computing speed is a problem that needs to be resolved. This paper introduces a method to improve the computational efficiency of interval type-2 fuzzy c-means clustering (IT2-FCM) on a GPU platform and applies it to land-cover classification from multi-spectral satellite images. GPU-based computation is a high-performance solution that also frees up the CPU. The experimental results show that the GPU implementation is many times faster than the CPU implementation (about 34 to 45 times for 2048×2048 images).

Keywords-graphics processing units, interval type-2 fuzzy c-means clustering, type-2 fuzzy sets, high performance computing.

I. INTRODUCTION

Multi-spectral images typically range in size from a few hundred MB to several GB, so in addition to the quality of the results, processing time is an issue that needs to be considered and resolved.

Graphics Processing Units (GPUs) offer a new way to perform general-purpose computing on hardware that is better suited to complicated fuzzy logic systems, such as those applied to satellite images. However, implementing these systems on GPUs is difficult because many algorithms are not designed in a parallel form conducive to GPU processing. In addition, there may be too many dependencies between stages of an algorithm, which slow down GPU processing.

Type-2 fuzzy logic has been developed in theory and practice and has been successful in real applications [9], [10], [11]. Methods used in the design of interval type-2 fuzzy clustering have also been reviewed.

However, the complexity of IT2-FCM is still high, and much research focuses on reducing it through algorithmic or hardware approaches. Using GPUs for general-purpose computing has recently been reported in many studies as a way to speed up complicated algorithms by parallelizing them for the GPU architecture, especially in applications of fuzzy logic. Anderson et al. [1] presented a GPU solution for Fuzzy C-Means (FCM). This solution used OpenGL and Cg to achieve approximately two orders of magnitude computational speed-up for some clustering profiles using an nVIDIA 8800 GPU. They later generalized the system for the use of non-Euclidean metrics, see Anderson et al. [2]. Further, Sejun Kim et al. [12] describe a method for adapting a multi-layer tree structure composed of fuzzy adaptive units to CUDA platforms. Iurie Chiosa et al. [13] present a framework for mesh clustering implemented solely on the GPU with a new generic multi-level clustering technique. Chia-Feng et al. [15] propose the implementation of a zero-order TSK fuzzy neural network (FNN) on GPUs to reduce training time. Harvey et al. [4] present a GPU solution for fuzzy inference. Again, a speed improvement of over two orders of magnitude can be achieved for this naturally parallel algorithm under particular inference profiles. One problem with this system, as well as the FCM GPU implementation, is that they both rely upon OpenGL and Cg (graphics libraries), which makes the system and its generalization difficult for newcomers to GPU programming. Anderson et al. [3] present a parallel implementation of fuzzy inference on a GPU using CUDA. L.T. Ngo et al. [16] introduced a method to speed up interval type-2 fuzzy logic systems on GPU platforms for collision behaviour in robot navigation.

In this paper, we take advantage of the processing power of the GPU to solve the partitioning problem for massive satellite image data based on the IT2-FCM algorithm. The algorithm must be altered in order to be computed quickly on a GPU. Therefore, we explore the use of nVIDIA's Compute Unified Device Architecture (CUDA) for the implementation of IT2-FCM. CUDA exposes the functionality of the GPU through C/C++, a language that most programmers are familiar with and can easily integrate into applications. Experiments on land cover classification from multi-spectral satellite images are carried out to show the high performance of the approach, especially for large imagery.

The paper is organized as follows: Section II presents an overview of type-2 fuzzy sets and IT2-FCM; Section III describes land cover classification based on IT2-FCM using the GPU; Section IV presents experimental results for the study area; Section V gives the conclusion and future work.

II. PRELIMINARIES

A. Type-2 Fuzzy Sets

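A type-2 fuzzy set in $X$ is denoted $\tilde{A}$, and the membership grade of $x \in X$ in $\tilde{A}$ is $\mu_{\tilde{A}}(x, u)$, $u \in J_x \subseteq [0,1]$, which is a type-1 fuzzy set in $[0,1]$. The elements of the domain of $\mu_{\tilde{A}}(x, u)$ are called primary memberships of $x$ in $\tilde{A}$, and the memberships of the primary memberships in $\mu_{\tilde{A}}(x, u)$ are called secondary memberships of $x$ in $\tilde{A}$.

Definition 2.1: A type-2 fuzzy set, denoted $\tilde{A}$, is characterized by a type-2 membership function $\mu_{\tilde{A}}(x, u)$ where $x \in X$ and $u \in J_x \subseteq [0,1]$, i.e.,

$$\tilde{A} = \{((x, u), \mu_{\tilde{A}}(x, u)) \mid \forall x \in X, \forall u \in J_x \subseteq [0,1]\} \tag{1}$$

or

$$\tilde{A} = \int_{x \in X} \int_{u \in J_x} \mu_{\tilde{A}}(x, u) / (x, u), \quad J_x \subseteq [0,1] \tag{2}$$

in which $0 \le \mu_{\tilde{A}}(x, u) \le 1$.

A type-2 fuzzy set is called an interval type-2 fuzzy set if the secondary membership function $f_x(u) = 1$ $\forall u \in J_x$, i.e., an interval type-2 fuzzy set is defined as follows:

Definition 2.2: An interval type-2 fuzzy set $\tilde{A}$ is characterized by an interval type-2 membership function $\mu_{\tilde{A}}(x, u) = 1$ where $x \in X$ and $u \in J_x \subseteq [0,1]$, i.e.,

$$\tilde{A} = \{((x, u), 1) \mid \forall x \in X, \forall u \in J_x \subseteq [0,1]\} \tag{3}$$

The uncertainty of $\tilde{A}$, denoted FOU (footprint of uncertainty), is the union of the primary memberships, i.e., $FOU(\tilde{A}) = \bigcup_{x \in X} J_x$. The upper and lower bounds of the membership function (UMF/LMF), denoted $\overline{\mu}_{\tilde{A}}(x)$ and $\underline{\mu}_{\tilde{A}}(x)$, are two type-1 membership functions that bound the FOU.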

B. Interval Type-2 Fuzzy Clustering

IT2-FCM is an extension of FCM clustering that uses two fuzziness parameters $m_1$, $m_2$ to build the FOU, corresponding to the upper and lower values of fuzzy clustering. The two fuzzifiers give two different objective functions to be minimized:

$$J_{m_1}(U, v) = \sum_{k=1}^{N} \sum_{i=1}^{C} (u_{ik})^{m_1} d_{ik}^2, \qquad J_{m_2}(U, v) = \sum_{k=1}^{N} \sum_{i=1}^{C} (u_{ik})^{m_2} d_{ik}^2 \tag{4}$$

in which $d_{ik} = \|x_k - v_i\|$ is the Euclidean distance between the pattern $x_k$ and the centroid $v_i$, $C$ is the number of clusters and $N$ is the number of patterns. The upper and lower degrees of membership, $\overline{u}_{ik}$ and $\underline{u}_{ik}$, are determined as follows:

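$$\overline{u}_{ik} = \begin{cases} \dfrac{1}{\sum_{j=1}^{C} (d_{ik}/d_{jk})^{2/(m_1-1)}} & \text{if } \dfrac{1}{\sum_{j=1}^{C} (d_{ik}/d_{jk})} < \dfrac{1}{C} \\[2ex] \dfrac{1}{\sum_{j=1}^{C} (d_{ik}/d_{jk})^{2/(m_2-1)}} & \text{otherwise} \end{cases} \tag{5}$$

$$\underline{u}_{ik} = \begin{cases} \dfrac{1}{\sum_{j=1}^{C} (d_{ik}/d_{jk})^{2/(m_1-1)}} & \text{if } \dfrac{1}{\sum_{j=1}^{C} (d_{ik}/d_{jk})} \ge \dfrac{1}{C} \\[2ex] \dfrac{1}{\sum_{j=1}^{C} (d_{ik}/d_{jk})^{2/(m_2-1)}} & \text{otherwise} \end{cases} \tag{6}$$

in which $i = 1, \dots, C$, $k = 1, \dots, N$.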
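As a rough illustration of Eqs. (5) and (6), the sketch below computes both bounds for a single pattern on the CPU in plain Python. The function name `it2_memberships` and its argument layout are assumptions made for this example and are not taken from the paper's CUDA code; all distances are assumed to be strictly positive.

```python
def it2_memberships(dists, m1, m2):
    """Upper and lower memberships of one pattern x_k to C clusters.

    dists holds the C distances d_ik for a fixed k; all are assumed > 0.
    """
    C = len(dists)
    upper, lower = [], []
    for d_i in dists:
        # Candidate membership for each of the two fuzzifiers.
        u_m1 = 1.0 / sum((d_i / d_j) ** (2.0 / (m1 - 1.0)) for d_j in dists)
        u_m2 = 1.0 / sum((d_i / d_j) ** (2.0 / (m2 - 1.0)) for d_j in dists)
        # Quantity compared against 1/C in Eqs. (5) and (6).
        switch = 1.0 / sum(d_i / d_j for d_j in dists)
        if switch < 1.0 / C:
            upper.append(u_m1)
            lower.append(u_m2)
        else:
            upper.append(u_m2)
            lower.append(u_m1)
    return upper, lower


# With m1 == m2 both bounds collapse to the ordinary FCM membership.
upper, lower = it2_memberships([1.0, 3.0], 2.0, 2.0)
print(upper, lower)
```

Each pattern is processed independently of the others, which is what makes this step a natural candidate for one GPU thread per pattern.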

Because each pattern has a membership interval bounded by the upper $\overline{u}$ and the lower $\underline{u}$, each cluster centroid is represented by the interval between $v^L$ and $v^R$. Cluster centroids are computed in the same way as in FCM:

$$v_i = \frac{\sum_{k=1}^{N} (u_{ik})^m x_k}{\sum_{k=1}^{N} (u_{ik})^m} \tag{7}$$

in which $i = 1, \dots, C$.
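Eq. (7) is a weighted mean of the patterns. A minimal sketch for one cluster follows, assuming patterns are given as equal-length lists of floats; `update_centroid` is a hypothetical helper name, not a function from the paper.

```python
def update_centroid(patterns, memberships, m):
    """Eq. (7) for one cluster i: mean of the patterns weighted by u_ik ** m."""
    dim = len(patterns[0])
    weights = [u ** m for u in memberships]
    total = sum(weights)
    return [
        sum(w * x[a] for w, x in zip(weights, patterns)) / total
        for a in range(dim)
    ]


# Two patterns with equal membership: the centroid is their midpoint.
print(update_centroid([[0.0, 0.0], [2.0, 2.0]], [0.5, 0.5], 2.0))
```

The numerator and denominator are both sums over all N patterns, so on a GPU they would typically be computed as parallel reductions.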

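After obtaining $v_i^R$, $v_i^L$, type-reduction is applied to get the centroid of each cluster as follows:

$$v_i = (v_i^R + v_i^L)/2 \tag{8}$$

For membership grades:

$$u_i(x_k) = (u_i^R(x_k) + u_i^L(x_k))/2, \quad i = 1, \dots, C \tag{9}$$

in which

$$u_i^L = \sum_{l=1}^{M} u_{il}/M, \quad u_{il} = \begin{cases} \overline{u}_i(x_k) & \text{if } x_{il} \text{ uses } \overline{u}_i(x_k) \text{ for } v_i^L \\ \underline{u}_i(x_k) & \text{otherwise} \end{cases} \tag{10}$$

$$u_i^R = \sum_{l=1}^{M} u_{il}/M, \quad u_{il} = \begin{cases} \overline{u}_i(x_k) & \text{if } x_{il} \text{ uses } \overline{u}_i(x_k) \text{ for } v_i^R \\ \underline{u}_i(x_k) & \text{otherwise} \end{cases} \tag{11}$$

Next, defuzzification is performed: if $u_i(x_k) > u_j(x_k)$ for $j = 1, \dots, C$ and $i \ne j$, then $x_k$ is assigned to cluster $i$.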
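The averaging in Eq. (9) and the defuzzification rule can be sketched as follows. The left and right memberships are taken as given inputs here (Eqs. (10) and (11) are not reimplemented), and `defuzzify` is an illustrative name only. Ties are resolved in favour of the lowest cluster index, which is a choice of this sketch and is not specified in the text.

```python
def defuzzify(u_left, u_right):
    """Type-reduce the memberships of one pattern and pick the winning cluster.

    u_left[i] and u_right[i] are the left/right memberships for cluster i.
    Returns (type-reduced memberships, index of the assigned cluster).
    """
    u = [(l + r) / 2.0 for l, r in zip(u_left, u_right)]  # Eq. (9)
    winner = max(range(len(u)), key=lambda i: u[i])       # largest u_i(x_k)
    return u, winner


u, cluster = defuzzify([0.2, 0.6, 0.1], [0.4, 0.8, 0.3])
print(u, cluster)
```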

III. SPEED-UP OF IT2-FCM BASED ON GPU AND CUDA FOR LAND COVER CLASSIFICATION

To implement the IT2-FCM algorithm on the GPU platform, the first task is to select appropriate memory types and sizes. The second is to calculate the two membership values, UMF and LMF.

The first kernel reads the inputs and stores them in global memory. There are 4 inputs, one from each of the 4 channels of the multi-spectral image. These samples are loaded into an array of length N and the properties of each sample are estimated. The kernel is organized into blocks, in part because there is a limit on the number of threads that can be created for each block (currently a maximum of 512 threads per block).

The number of threads and the register and local memory requirements of the kernel are also considered, to avoid memory overflow. This information can be found for each GPU. We limit the number of threads per block to 128.
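To make the block sizing concrete, the number of blocks for a given image can be estimated with a ceiling division. This is only a sketch under the assumption of one thread per sample, which the text does not state explicitly; `launch_config` is a hypothetical helper, and 128 is the threads-per-block limit chosen above.

```python
def launch_config(n_samples, threads_per_block=128):
    """Return (blocks, threads_per_block), assuming one thread per sample."""
    # Ceiling division so that a partially filled last block is still launched.
    blocks = (n_samples + threads_per_block - 1) // threads_per_block
    return blocks, threads_per_block


# A 4096 x 4096 image has 16 777 216 samples per channel.
print(launch_config(4096 * 4096))
```

Threads in the last block whose index is at or beyond N would need to return early inside the kernel.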

Some refinements are made to IT2-FCM to improve the quality of image classification. First, we address how to initialize the centroid matrix V, because the initialization of V affects both the number of iterations and the clustering result. We therefore initialize V from the density of the patterns, using a method called the initialization of centroid algorithm (Algorithm 1), which makes IT2-FCM stable and efficient.

Centroids are placed at samples whose surrounding data density is high. This step has a large effect on the rest of the computation. The statistical standard deviation is used to define the neighbourhood of a data point.

Algorithm 1 Initialization of centroid on GPU

Step 1: Compute the expected pattern $z_i$ on GPU by the following equation:

$$z_i = \frac{1}{N} \sum_{j=1}^{N} x_{ji}$$

and the standard deviation $s_i$ on GPU:

$$s_i = \sqrt{\frac{1}{N} \sum_{j=1}^{N} (x_{ji} - z_i)^2}$$

with $i = 1, 2, \dots, d$; $X = (x_1, \dots, x_N)$, $x \in R^d$.

Next, copy $s_i$ to host memory.

The neighbourhood of each data point is taken to be a $d$-dimensional box whose radius is defined from the standard deviations as $r = \min_{1 \le i \le d} s_i$ (on CPU).

Step 2: Compute the density $D_i$ of pattern $x_i$ on GPU:

$$D_i = \sum_{j=1}^{N} T(r - \|x_j - x_i\|) \tag{12}$$

in which $T(z) = 1$ if $z \ge 0$ and $T(z) = 0$ otherwise.

Step 3: On GPU, find the pattern $x_i$ with $D_i = \max_{1 \le j \le N} D_j$, then set $V_k = V_k \cup \{x_i\}$ and $X = X \setminus \{x_i\}$.

Copy $V_k$, $X$ to host memory.

Step 4: If $X = \emptyset$ then go to Step 5, else go back to Step 2 (on CPU).

Step 5: Output the set of candidate points $V_k$ (on CPU).
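A single pass through Steps 1 to 3 can be sketched in plain Python as below. This is a CPU illustration rather than the GPU kernel: `density_seed` is a hypothetical name, the norm in Eq. (12) is assumed to be Euclidean, and `math.dist` requires Python 3.8 or later.

```python
import math


def density_seed(X):
    """One pass of Steps 1-3: returns (r, densities, index of densest pattern)."""
    n, dim = len(X), len(X[0])
    # Step 1: per-dimension mean z_i and standard deviation s_i.
    z = [sum(x[i] for x in X) / n for i in range(dim)]
    s = [math.sqrt(sum((x[i] - z[i]) ** 2 for x in X) / n) for i in range(dim)]
    r = min(s)
    # Step 2: D_i counts the patterns within distance r of x_i (x_i included).
    D = [sum(1 for xj in X if math.dist(xj, xi) <= r) for xi in X]
    # Step 3: the densest pattern becomes the next candidate centroid.
    best = max(range(n), key=lambda i: D[i])
    return r, D, best


X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
r, D, best = density_seed(X)
print(r, D, best)
```

The density computation compares every pattern with every other pattern, so its cost grows quadratically with N; this is the part of the algorithm that benefits most from being spread over GPU threads.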

If $V_k$ is large, the algorithm can be applied again to reduce the number of candidate clusters. The calculation can be sped up by dividing the input data set into subsets and applying the algorithm to each subset, which gives candidate sets $V_i$. The candidate sets are then merged, $\cup V_i = V$, and the algorithm is applied to the set $V$.

Depending on the particular problem, further measures can be applied to reduce the number of candidate clusters. For example, based on the shape of the clusters, candidates lying on a straight line or on an ellipse can be removed.

The centroid matrix $V$ can then be initialized by choosing patterns in $V_k$ according to the density of the candidates.

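Algorithm 2 Algorithm to find $v^L$ and $v^R$

The steps for finding $v^L$ and $v^R$ on GPU (Notation: $UMF(i) = \overline{\mu}_A(x_i)$, $LMF(i) = \underline{\mu}_A(x_i)$, $N$ = size of data)

Step 1: Calculate $\theta_i$ on GPU as follows:

$$\theta_i = \frac{1}{2} [\overline{\mu}_A(x_i) + \underline{\mu}_A(x_i)] \tag{13}$$

Step 2: Calculate $c'$ on GPU:

$$c' = c(\theta_1, \theta_2, \dots, \theta_N) = \frac{\sum_{i=1}^{N} x_i \theta_i}{\sum_{i=1}^{N} \theta_i} \tag{14}$$

Next, copy $c'$ to host memory.

Step 3: Find $k$ such that $x_k \le c' \le x_{k+1}$ (on CPU).

Step 4: Calculate $c''$ on GPU by the following equation.

In case $c''$ is used for finding $y_l$:

$$c'' = \frac{\sum_{i=1}^{k} x_i \overline{\mu}_A(x_i) + \sum_{i=k+1}^{N} x_i \underline{\mu}_A(x_i)}{\sum_{i=1}^{k} \overline{\mu}_A(x_i) + \sum_{i=k+1}^{N} \underline{\mu}_A(x_i)} \tag{15}$$

In case $c''$ is used for finding $y_r$:

$$c'' = \frac{\sum_{i=1}^{k} x_i \underline{\mu}_A(x_i) + \sum_{i=k+1}^{N} x_i \overline{\mu}_A(x_i)}{\sum_{i=1}^{k} \underline{\mu}_A(x_i) + \sum_{i=k+1}^{N} \overline{\mu}_A(x_i)} \tag{16}$$

Next, copy $c''$ to host memory.

Step 5: If $c'' = c'$ go to Step 6, else set $c' = c''$ and go back to Step 3 (calculated on CPU).

Step 6: Set $v^L = c'$ or $v^R = c'$ (on CPU).

After obtaining $v^R$, $v^L$, the centroid of each cluster is obtained as follows:

$$v_i = (v^R + v^L)/2$$

The membership grades $u_i(x_k)$ are based on formulas (9), (10) and (11).

Next, defuzzification for IT2-FCM is performed: if $u_i(x_k) > u_j(x_k)$ for $j = 1, \dots, C$ and $i \ne j$, then $x_k$ is assigned to cluster $i$ (on CPU).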
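The iteration in Algorithm 2 can be sketched for one-dimensional data as follows. Several points are assumptions of this sketch rather than statements from the paper: the values `x` are sorted in ascending order, all memberships are positive, convergence is tested with a small tolerance instead of exact equality, and the name `km_centroid` is hypothetical.

```python
def km_centroid(x, upper, lower, side, max_iter=100, tol=1e-12):
    """Iterative search for v_L (side='L') or v_R (side='R') on sorted 1-D data."""
    n = len(x)
    theta = [(u + l) / 2.0 for u, l in zip(upper, lower)]        # Step 1
    c = sum(xi * t for xi, t in zip(x, theta)) / sum(theta)      # Step 2
    for _ in range(max_iter):
        # Step 3: k points satisfy x_i <= c; clamp so both groups are non-empty.
        k = sum(1 for xi in x if xi <= c)
        k = min(max(k, 1), n - 1)
        # Step 4: switch between upper and lower memberships at position k.
        if side == "L":
            w = upper[:k] + lower[k:]
        else:
            w = lower[:k] + upper[k:]
        c_new = sum(xi * wi for xi, wi in zip(x, w)) / sum(w)
        if abs(c_new - c) < tol:                                  # Step 5
            return c_new                                          # Step 6
        c = c_new
    return c


x = [1.0, 2.0, 3.0, 4.0]
v_left = km_centroid(x, [0.9] * 4, [0.1] * 4, "L")
v_right = km_centroid(x, [0.9] * 4, [0.1] * 4, "R")
print(v_left, v_right, (v_left + v_right) / 2.0)
```

The weighted sums in Steps 2 and 4 are the parts mapped to the GPU in the algorithm above, while the search for k and the convergence test stay on the CPU.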

Secondly, the primary memberships $\overline{u}_{ik}$ and $\underline{u}_{ik}$ of a pattern depend on the two fuzzifiers $m_1$ and $m_2$, which are chosen heuristically; the best land cover classification results obtained with IT2-FCM are shown in the experiments.

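We define $I_k = \{i \mid 1 \le i \le C, d_{ik} = 0\}$, $k = 1, \dots, N$, where $d_{ik}$ is the Euclidean distance between the pattern $x_k$ and the centroid $v_i$.

In case $I_k = \emptyset$, $\overline{u}_{ik}$ and $\underline{u}_{ik}$ are determined by:

$$\overline{u}_{ik} = \begin{cases} \dfrac{1}{\sum_{j=1}^{C} (d_{ik}/d_{jk})^{2/(m_1-1)}} & \text{if } \dfrac{1}{\sum_{j=1}^{C} (d_{ik}/d_{jk})} < \dfrac{1}{C} \\[2ex] \dfrac{1}{\sum_{j=1}^{C} (d_{ik}/d_{jk})^{2/(m_2-1)}} & \text{if } \dfrac{1}{\sum_{j=1}^{C} (d_{ik}/d_{jk})} \ge \dfrac{1}{C} \end{cases} \tag{17}$$

$$\underline{u}_{ik} = \begin{cases} \dfrac{1}{\sum_{j=1}^{C} (d_{ik}/d_{jk})^{2/(m_1-1)}} & \text{if } \dfrac{1}{\sum_{j=1}^{C} (d_{ik}/d_{jk})} \ge \dfrac{1}{C} \\[2ex] \dfrac{1}{\sum_{j=1}^{C} (d_{ik}/d_{jk})^{2/(m_2-1)}} & \text{if } \dfrac{1}{\sum_{j=1}^{C} (d_{ik}/d_{jk})} < \dfrac{1}{C} \end{cases} \tag{18}$$

Otherwise, if $I_k \ne \emptyset$, $\overline{u}_{ik}$ and $\underline{u}_{ik}$ are determined by:

$$\overline{u}_{ik} = 0 \ \text{if } i \notin I_k, \qquad \sum_{i \in I_k} \overline{u}_{ik} = 1 \ \text{if } i \in I_k \tag{19}$$

$$\underline{u}_{ik} = 0 \ \text{if } i \notin I_k, \qquad \sum_{i \in I_k} \underline{u}_{ik} = 1 \ \text{if } i \in I_k \tag{20}$$

in which $i = 1, \dots, C$, $k = 1, \dots, N$.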
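The special case of Eqs. (19) and (20) guards against division by zero when a pattern coincides with one or more centroids. A small sketch follows; the equal split among the clusters in $I_k$ is an assumption of this example, since the equations only require the memberships in $I_k$ to sum to 1, and `singular_memberships` is a hypothetical name.

```python
def singular_memberships(dists):
    """Memberships of one pattern when some d_ik == 0; None if I_k is empty."""
    zero = [i for i, d in enumerate(dists) if d == 0.0]   # the index set I_k
    if not zero:
        return None  # fall back to the general case, Eqs. (17)-(18)
    share = 1.0 / len(zero)  # assumed: equal split over the clusters in I_k
    return [share if d == 0.0 else 0.0 for d in dists]


print(singular_memberships([0.0, 2.0, 3.0]))
print(singular_memberships([1.0, 2.0]))
```

Under this rule the upper and lower memberships are identical, so the same list can be used for both bounds.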

Algorithm 3 Iterative algorithm IT2-FCM on CUDA

Step 1: Initialization

1.1 On CPU: set the two fuzzifiers $m_1$, $m_2$ ($1 < m_1, m_2$) and the error $\varepsilon$.

1.2 On GPU: initialize the centroids $V = [v_i]$, $v_i \in R^d$, by the initialization of centroid algorithm (Algorithm 1).

Step 2: Compute the fuzzy partition matrix $U$ and update the centroids $V$.

2.1 $j = j + 1$

2.2 On GPU: compute the fuzzy partition matrix $U_{ik}$ by formulas (17)-(20).

2.3 On CPU: assign each pattern to a cluster.

2.4 On GPU: update the cluster centroids $V^j = [v_1^j, v_2^j, \dots, v_c^j]$.

Step 3: On CPU, check the stop condition: if $\max(|J^{(j+1)} - J^{(j)}|) \le \varepsilon$, go to Step 4; otherwise go to Step 2.

Step 4: Output the clustering results.
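The control flow of Algorithm 3 can be summarised by the host-side loop below. It is a skeleton only: the three callables stand in for the GPU kernels and the objective function, and their names and signatures are assumptions of this sketch. The `max_iter` guard is also an addition that does not appear in the algorithm.

```python
def it2fcm_loop(update_memberships, update_centroids, objective, V0, eps,
                max_iter=100):
    """Repeat Step 2 until the objective changes by at most eps (Step 3)."""
    V = V0
    J_prev = None
    for j in range(1, max_iter + 1):          # Step 2.1: j = j + 1
        U = update_memberships(V)             # Step 2.2 (GPU in the paper)
        V = update_centroids(U)               # Step 2.4 (GPU in the paper)
        J = objective(U, V)
        if J_prev is not None and abs(J - J_prev) <= eps:   # Step 3 (CPU)
            break
        J_prev = J
    return U, V, j                            # Step 4


# Toy stand-ins that halve a scalar "centroid" each iteration.
U, V, iters = it2fcm_loop(lambda V: V, lambda U: U / 2.0, lambda U, V: V,
                          1.0, 0.1)
print(U, V, iters)
```

Only the stop test and the bookkeeping stay on the host; each iteration therefore involves a transfer of the objective value (and the cluster assignments in Step 2.3) between device and host memory.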

The detailed algorithm for land cover classification using IT2-FCM for multi-spectral satellite imagery consists of three main steps, described in Algorithm 4.

Algorithm 4 Land cover classification

Step 1: Preprocess the multi-channel satellite image.

Step 2: Perform IT2-FCM on the 4-channel image. By default, the 4 channels are classified into six classes representing six types of land cover:

1) Rivers, ponds, lakes;

2) Rocks, bare soil;

3) Fields, sparse tree;

4) Planted forests, low woods;

5) Perennial tree crops;

6) Jungles.

Step 3: Identify the percentage of the area:

$$S_i = \frac{n_i}{N} \tag{21}$$

$S_i$: area share of the $i$-th region.

$n_i$: number of points in the $i$-th region.

$N$: total number of samples of the 4-channel image.
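Step 3 amounts to counting labels. A minimal sketch of Eq. (21), assuming the classification result is available as a flat list of class labels (the name `area_shares` is hypothetical):

```python
from collections import Counter


def area_shares(labels):
    """Eq. (21): S_i = n_i / N for every class label that occurs."""
    counts = Counter(labels)
    n_total = len(labels)
    return {c: counts[c] / n_total for c in sorted(counts)}


print(area_shares([1, 1, 2, 3]))
```

Multiplying each share by 100 gives a percentage, and multiplying by the ground area covered by the image gives the area of each class.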

The algorithm for land cover classification using IT2-FCM based on GPU for multi-spectral satellite imagery is shown in Fig. 1.

Figure 1. Diagram of land cover classification on GPU

IV. EXPERIMENTS

The performance of the IT2-FCM implementation on the GPU is compared with that of the CPU-based implementation. The implementation is written in C/C++ as a console application built with Microsoft Visual Studio 2005, and it runs on a computer with 64-bit Windows 7 and nVIDIA CUDA support with the following specifications: the CPU is a Core i5-460M at 2.53 GHz, and the system has 2 GB of system RAM (DDR3). The GPU is an nVIDIA GeForce GT 310M graphics card with 16 CUDA cores, 1 GB of texture memory and PCI Express x16.

The test data are LANDSAT-7 images of the study area: the Hanoi area, 21°54′23.11″N, 105°03′06.47″E to 20°55′14.25″N, 106°02′58.57″E.

Figure 2. Study data: LANDSAT7 with size 4096 x 4096. a) Channel 1; b) Channel 2; c) Channel 3; d) Channel 4

In Figure 3, each class is drawn in its own color: rivers, ponds, lakes; rocks, bare soil; fields, sparse tree; planted forests, low woods; perennial tree crops; jungles.

Table I
AREA NORTH OF HANOI: RESULT OF LAND COVER CLASSIFICATION USING IT2-FCM ON GPU FOR 2 CHANNELS

Class   N. of pixels   Percentage (%)   Area (ha)
1       1 080 081      8.606            9 715 240.429
2       886 993        7.068            7 978 429.630
3       2 071 340      16.505           18 631 534.218
4       2 582 211      20.576           23 226 777.161
5       3 376 055      26.901           30 367 339.140
6       2 553 232      20.345           22 966 113.422

Table II
AREA NORTH OF HANOI: RESULT OF LAND COVER CLASSIFICATION USING IT2-FCM ON GPU FOR 4 CHANNELS

Class   N. of pixels   Percentage (%)   Area (ha)
1       1 050 371      8.370            9 448 001.404
2       918 953        7.322            8 265 907.222
3       2 279 130      18.161           20 500 588.311
4       2 386 241      19.014           21 464 043.012
5       3 179 525      25.335           28 599 567.830
6       2 735 692      21.798           24 607 326.231

Figure 3. Result of land cover classification. a) NDVI; b) 256x256; c) 512x512; d) 1024x1024; e) 2048x2048; f) 4096x4096

Table III
CPU/GPU PERFORMANCE RATIO USING IT2-FCM ON STUDY DATA FOR 2 CHANNELS

Test data   256x256   512x512   1024x1024   2048x2048   4096x4096
GPU (s)     0.095     0.297     1.044       4.310       12.309
CPU (s)     0.148     2.809     21.173      146.623     N/A
Rate        1.558     9.458     20.281      34.019      N/A

Table IV
CPU/GPU PERFORMANCE RATIO USING IT2-FCM ON STUDY DATA FOR 4 CHANNELS

Test data   256x256   512x512   1024x1024   2048x2048   4096x4096
GPU (s)     0.117     0.482     1.873       7.435       20.868
CPU (s)     0.321     5.131     46.581      337.231     N/A
Rate        2.744     10.645    24.871      45.357      N/A

Figure 4. Comparisons between the result of GPU and result of CPU for 2 channels

Figure 5. Comparisons between the result of GPU and result of CPU for 4 channels

The study data and the classification results for 4 channels are shown in Fig. 2 and Fig. 3. We take the ratio of CPU time to GPU time as the performance measure: a value below 1 indicates that the CPU performs better, and a value above 1 indicates that the GPU performs better. The CPU/GPU performance ratios for IT2-FCM are given in Table III and Table IV. The speed improvement is about 34.019 times (2 channels) and 45.357 times (4 channels) for image size 2048×2048, and the ratio increases with image size over the sizes tested, which makes it feasible to handle large image processing problems in practice. With image size 4096×4096, the IT2-FCM program runs out of memory on the CPU, whereas the GPU implementation finishes in 12.309 s and 20.868 s for 2 channels and 4 channels respectively (Tables III and IV), so it can be applied to many practical problems. Performance should improve further if the algorithms are run on better GPU hardware.
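The Rate rows of Table III and Table IV are the CPU time divided by the GPU time. The short sketch below recomputes them from the timings listed in the tables; the dictionary layout and the name `speedup` are choices made for this illustration. Because the published timings are rounded to three decimals, a recomputed ratio can differ from the tabulated one in the last digit.

```python
# (CPU seconds, GPU seconds) copied from Table III (2 channels)
# and Table IV (4 channels); the 4096x4096 CPU runs are N/A and omitted.
TIMES = {
    2: {"256x256": (0.148, 0.095), "512x512": (2.809, 0.297),
        "1024x1024": (21.173, 1.044), "2048x2048": (146.623, 4.310)},
    4: {"256x256": (0.321, 0.117), "512x512": (5.131, 0.482),
        "1024x1024": (46.581, 1.873), "2048x2048": (337.231, 7.435)},
}


def speedup(cpu_s, gpu_s):
    """CPU/GPU ratio: above 1 means the GPU run was faster."""
    return cpu_s / gpu_s


for channels, rows in TIMES.items():
    for size, (cpu_s, gpu_s) in rows.items():
        print(channels, size, round(speedup(cpu_s, gpu_s), 3))
```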

V. CONCLUSION

As demonstrated in this paper, IT2-FCM can be implemented on a GPU without the use of a graphics API, in a form that can be used by any researcher with knowledge of C/C++.

For small images the advantage of the GPU over the CPU is modest (a ratio of about 1.6 to 2.7 at 256×256). For larger data, the performance of the approach on the GPU is many times higher than on the CPU (about 34 to 45 times at 2048×2048).

The speed also depends on the computer configuration and on how the data are organized in the program.

The next goal is to extend this research to hyperspectral satellite imagery for environmental classification and assessment of land surface temperature changes.

REFERENCES

[1] Anderson, D., Luke, R., Keller, J., Speedup of Fuzzy Clustering Through Stream Processing on Graphics Processing Units, IEEE Trans. on Fuzzy Systems, Vol. 16(4), pp. 1101-1106, 2007.

[2] Anderson, D., Luke, R., Keller, J., Incorporation of Non-Euclidean Distance Metrics into Fuzzy Clustering on Graphics Processing Units, Proc. IFSA, 41, pp. 128-139, 2007.

[3] Anderson, Parallelisation of Fuzzy Inference on a Graphics Processor Unit Using the Compute Unified Device Architecture, The 2008 UK Workshop on Computational Intelligence, UKCI 2008, pp. 1-6, 2008.

[4] Harvey, N., Luke, R., Keller, J., Anderson, D., Speedup of Fuzzy Logic through Stream Processing on Graphics Processing Units, IEEE Congress on Evolutionary Computation, 2008.

[5] Roberto Sepúlveda, Oscar Montiel, Oscar Castillo, Patricia Melin, Embedding a high speed interval type-2 fuzzy controller for a real plant into an FPGA, Appl. Soft Comput., 12(3), pp. 988-998, 2012.

[6] Oscar Castillo, Patricia Melin, A review on the design and optimization of interval type-2 fuzzy controllers, Appl. Soft Comput., 12(4), pp. 1267-1278, 2012.

[7] Roberto Sepúlveda, Oscar Montiel, Oscar Castillo, Patricia Melin, Modelling and Simulation of the Defuzzification Stage of a Type-2 Fuzzy Controller Using VHDL Code, Control and Intelligent Systems, 39(1), 2011.

[8] Yazmín Maldonado, Oscar Castillo, Patricia Melin, Optimization of Membership Functions for an Incremental Fuzzy PD Control Based on Genetic Algorithms, Soft Computing for Intelligent Control and Mobile Robotics, pp. 195-211, 2011.

[9] N. Karnik, J.M. Mendel and Liang Q., Type-2 Fuzzy Logic Systems, IEEE Trans. on Fuzzy Systems, 7(6), pp. 643-658, 1999.

[10] N. Karnik and J.M. Mendel, Centroid of a Type-2 Fuzzy Set, Information Sciences, 132, pp. 195-220, 2001.

[11] Liang Q. and J.M. Mendel, Interval Type-2 Fuzzy Logic Systems: Theory and Design, IEEE Trans. on Fuzzy Systems, 8(5), pp. 535-550, 2000.

[12] Sejun K., Donald C., A GPU based Parallel Hierarchical Fuzzy ART Clustering, Proceedings of International Joint Conference on Neural Networks, vol. 1, pp. 2778-2782, 2011.

[13] Iurie, C., Andreas, K., GPU-Based Multilevel Clustering, IEEE Trans. on Visualization and Computer Graphics, vol. 17(2), pp. 132-145, 2011.

[14] Sharanyan S., Vasiya K., Arvind K., Online Accelerated Implementation of the Fuzzy C-means Algorithm with the Use of the GPU Platform, ICCCT, 2011.

[15] Chia-F., Teng-C., Wei-Y., Speedup of Implementing Fuzzy Neural Networks With High-Dimensional Inputs Through Parallel Processing on Graphic Processing Units, IEEE Trans. on Fuzzy Systems, vol. 19(4), pp. 717-728, 2011.

[16] Long Thanh Ngo, Dzung Dinh Nguyen, Long The Pham, and Cuong Manh Luong, Speedup of Interval Type 2 Fuzzy Logic Systems Based on GPU for Robot Navigation, Advances in Fuzzy Systems, vol. 2012, 2012.
