Detection Model of Landslide-Potential Areas based on

Local-Learning using Iterative Dichotomiser Three

Algorithm

Thesis

Submitted to

the Faculty of Information Technology

to Obtain the Degree of Master of Computer Science

By:

Yerymia Alfa Susetyo

Student ID (NIM): 972014001

Master of Information Systems Study Program

Faculty of Information Technology

Universitas Kristen Satya Wacana

Salatiga

Preface

Praise and thanks be to the Lord Jesus Christ for the grace He has bestowed, which gave the author the strength and enthusiasm to complete this thesis.

The author would like to thank the following parties who helped with this research:

1. Dr. Dharmaputra Palekahelu, M.Pd., Dean of the Faculty of Information Technology, Universitas Kristen Satya Wacana.

2. Prof. Ir. Danny Manongga, M.Sc, Ph.D, Head of the Master of Information Systems Study Program and first supervisor, for the guidance, direction, and encouragement given throughout the work on this thesis.

3. Prof. Dr. Ir. Wiranto Herry Utomo, M.Kom, second supervisor, for the guidance and input given during the writing of this thesis.

4. Badan Nasional Penanggulangan Bencana (BNPB), the Indonesian National Disaster Prevention Agency, for the landslide incident data it provided, which were valuable for this research.

5. All lecturers and staff of the Faculty of Information Technology, Universitas Kristen Satya Wacana, who gave the author much help during the author's studies at FTI UKSW.

6. All family, relatives, and friends for their support and companionship.

7. All parties who helped with this research, directly or indirectly.

Finally, the author hopes that this research will be useful to the government and the public. The author apologizes for any errors in this research; suggestions and input would be greatly appreciated.

Salatiga, 5 October 2016

Table of Contents

Title Page
Statement of Non-Plagiarism
Statement of Access Approval
Supervisor Approval Sheet
Endorsement Sheet
Preface
Table of Contents
List of Figures
List of Tables
Abstract
I. Introduction
II. Related Works
III. Proposed Method
IV. Result and Test
V. Discussion
VI. Conclusion
References

List of Figures

List of Tables

Table 1 Table Structure of Detection Model of Landslide
Table 2 Discrete Value of Triggering Attributes of Landslide
Table 3 Confusion Matrix
Table 4 Entropy Value and Gain Value in the First Iteration
Table 5 Rules Generated from Model
Table 6 Confusion Matrix Results

(IJCSIS) International Journal of Computer Science and Information Security, Vol. 14, No.09, September 2016


Detection Model of Landslide-Potential Areas based

on Local-Learning using Iterative Dichotomiser Three

Algorithm

Abstract—Landslide is the most destructive natural disaster since it causes very significant environmental and socioeconomic damage. Java, Indonesia, is the most densely populated island in the world. High population density and careless land conversion lead to frequent landslides, and landslide is the most frequent natural disaster in Indonesia. This research aims to develop an early warning model of landslide-potential areas based on local learning that suits local geographical conditions, using the Iterative Dichotomiser Three (ID3) algorithm in Java, the most landslide-prone area in Indonesia. We analyze and map landslide data together with climate and soil characteristics using the ID3 algorithm. We use four landslide-triggering attributes: slope, rainfall, soil type, and land cover. The resulting decision tree has 36 leaf nodes, of which 19 indicate “Landslide-potential” and 17 indicate “Not Landslide-potential”. The accuracy of the model is 92.37%, and land cover is the main attribute that triggers landslides.

Keywords— ID3 Algorithm; Landslide; Land Use; Learning Algorithm; Local Geographic

I. INTRODUCTION

Landslide is the most destructive natural disaster since it causes very significant environmental and socioeconomic damage [1]. It occurs frequently in various parts of the world, especially in developing countries, due to poor land use planning, expanding human settlement areas, careless land conversion practices, and climate change [2]. As a developing country, Indonesia suffers frequent natural disasters, and landslide is one of the most frequent. Data from the Indonesian National Disaster Prevention Agency show that in December 2014 landslide was the most frequent natural disaster: there were 111 landslide incidents, considerably more than floods, the second most frequent disaster (86 incidents). Twelve provinces suffered landslides, with Central Java, West Java, and East Java having the highest number of incidents [3].

Landslide-potential area is defined as the area that shows landslide tendencies. The occurrence of a landslide in a particular area can be related with similarities of area and climate characteristics in other areas with previous landslide incidents. It is then expected that developing an early-warning system in landslide-prone areas helps identify other areas with similar climate and soil physical characteristics as landslide-potential areas [4].

One model that can detect landslide-prone areas is a decision-tree learning algorithm derived from machine learning [5]. The rules of a decision tree are appropriate tools for predicting a condition based on various variables [6]. This method can model relations among variables without depending on assumptions about data distribution or weighting. In addition, it imposes no specific requirements on data format: data can take the form of numbers or scales [7]. Iterative Dichotomiser Three (ID3) is a decision tree algorithm that can handle continuous attributes after they go through a discretization process [8].

II. RELATED WORKS

The soft computing literature has produced mapping and detection models of landslide-potential areas by combining the triggering factors of landslides. Statistical methods are the most commonly used methods to predict landslides. In Trabzon, Turkey, two methods, Multi-Criteria Decision Making (MCDM) and Support Vector Regression (SVR), are combined to predict landslides. In MCDM, one first has to determine the weight of each attribute that triggers landslides. The attribute with the highest weight is the most influential [1]. However, statistical methods have serious flaws in determining the weight of each attribute. More specifically, statistical methods are likely to produce different results depending on the weights chosen [5].

Decision-tree learning algorithm is another model that can be used to detect natural disasters. In a research conducted in Kelantan, Malaysia, this method can predict flood with accuracy level as high as 87%. It also claims that learning algorithm outperforms statistical methods in the sense that learning algorithm does not have to rely on statistical assumptions and can handle differences of weighting scale [5].

Another study of landslide detection, conducted in Penang, Malaysia, compares the accuracy of four types of learning algorithm, i.e. CHAID, Exhaustive CHAID, CRT, and QUEST. These four methods produce considerably high accuracy levels of 74%-82% [4]. Meanwhile, the statistical methods (MCDM, SVR, and LR) in the Trabzon study only produce accuracy levels of 69%-77% [1].

Iterative Dichotomiser Three (ID3) is a form of learning algorithm. Previous research indicates that the ID3 algorithm works well with non-continuous or discrete data [13]. The data on landslide-triggering attributes released by the Indonesian government are discrete or interval spatial data. Based on these arguments, this research aims to develop an early warning model of landslide-potential areas based on local-learning using Iterative Dichotomiser Three (ID3) in Indonesia, especially in Java as the area with the most frequent landslides.

III. PROPOSED METHOD

Figure 1 shows the stages of the Detection Model of Landslide-Potential Areas based on Local-Learning using the ID3 algorithm. This research consists of four stages: (a) collecting spatial data of landslide incidents and of the triggering attributes of landslides; (b) preprocessing to convert the spatial data into discrete data; (c) developing the model that learns landslide-potential areas with ID3; and (d) testing the model accuracy using a confusion matrix.

Fig. 1. Stages of Detection Model of Landslide-Potential Areas based on Local-Learning using ID3 Algorithm

A. Data Collection

Based on the decree of the Minister of Public Work of the Republic of Indonesia No 22/PRT/M/2007 on guidelines for land use planning of landslide-potential areas, we use four variables or attributes that trigger landslides, i.e. land slope, rainfall, land cover, and land type [9]. We obtain the data from the related authoritative agencies in the form of spatial data. Additionally, we use data on landslide incidents in the three Java provinces from 2011 to 2015 from the Indonesian National Disaster Prevention Agency. The spatial data of landslide incidents in the three Java provinces are shown in Figure 2.

B. Data Preprocessing

Fig. 2. Spatial Data of Landslide Incidents in Java from 2011 – 2015 Issued by Indonesian National Disaster Prevention Agency

TABLE 1

[...] converted into interval or discrete labels (discretization). We discretize rainfall and slope data using interval widths of 1000 (mm/year) and 15 (%), respectively. We do not convert land cover data. Meanwhile, we group land type data based on land damage potential according to the Ministry of Environment of the Republic of Indonesia [11]. Table 2 shows the discrete values of the triggering attributes of landslides.
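The discretization described above can be sketched in Python as follows. This is only an illustration, not the authors' implementation: the cut points (2000 and 3000 mm/year for rainfall, 15% and 30% for slope) are taken from the interval widths above and the thresholds quoted in the conclusion, the labels are those used in Tables 4 and 5, and the function names and the treatment of values lying exactly on a boundary are assumptions.

```python
def discretize_rainfall(mm_per_year: float) -> str:
    """Map annual rainfall (mm/year) to a discrete label.

    Assumed cut points: below 2000 is Low, above 3000 is High.
    """
    if mm_per_year < 2000:
        return "Low"
    if mm_per_year <= 3000:
        return "Medium"
    return "High"


def discretize_slope(percent: float) -> str:
    """Map slope (%) to a discrete label.

    Assumed cut points: below 15 is Flat, above 30 is Steep.
    """
    if percent < 15:
        return "Flat"
    if percent <= 30:
        return "Corrugated"
    return "Steep"


# Hypothetical record, not taken from the study's data.
record = {"rainfall": 3400.0, "slope": 22.0}
labels = {
    "Rainfall": discretize_rainfall(record["rainfall"]),
    "Slope": discretize_slope(record["slope"]),
}
print(labels)
```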

The last step of data preprocessing is splitting the data into two groups, i.e. training data and testing data. The training data (80% of all data) are used to build the model, while the testing data (20% of all data) are used to test model accuracy.

C. Detection Model of Landslide-Potential Areas with ID3

Iterative Dichotomiser Three (ID3) is a learning algorithm that builds a classification tree model. It is based on entropy: it evaluates all available attributes to determine how strongly each attribute influences the classification of the data sample, using a measure commonly known as information gain [12].

Entropy is a parameter to measure heterogeneity of a set of data sample. The more heterogeneous a set of sample data, the higher is the entropy value [13]. Mathematically, entropy can be formulated as follows:

$$\mathrm{Entropy}(S) = \sum_{i=1}^{c} -p_i \log_2 p_i \qquad (1)$$

where c is the number of values of the target attribute, and p_i is the proportion of samples in S that belong to class i. After generating the entropy value of a set of sample data, the influence or effectiveness of an attribute in classifying the data can be measured. This effectiveness measure is called information gain [13]. Mathematically, the information gain of an attribute A can be formulated as follows:

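$$\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v) \qquad (2)$$

where Values(A) is the set of possible values of attribute A, S_v is the subset of S for which A has value v, and Entropy(S) is the entropy of the sample data.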
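Formulas (1) and (2) can be sketched in a few lines of Python. The representation of a sample as a list of dictionaries, the function names, and the toy records are assumptions made for illustration; they are not taken from the study.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Entropy of a list of class labels, following (1)."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Information gain of `attribute` over `rows` (a list of dicts), following (2)."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical toy sample, not taken from the study's data.
rows = [
    {"Slope": "Steep", "Landslide": "Yes"},
    {"Slope": "Steep", "Landslide": "Yes"},
    {"Slope": "Flat", "Landslide": "No"},
    {"Slope": "Flat", "Landslide": "Yes"},
]
print(entropy([row["Landslide"] for row in rows]))
print(information_gain(rows, "Slope", "Landslide"))
```

In this toy sample the Steep subset is pure (entropy 0) and the Flat subset is evenly mixed (entropy 1), so splitting on Slope removes part, but not all, of the uncertainty about the class.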

To develop a detection model of landslide-potential areas, the ID3 algorithm can be implemented as a recursive function (a function that calls itself). The ID3 algorithm to detect landslide potential is as follows:

Algorithm 1: Detection model of landslide-potential areas by the ID3 algorithm

ID3 (Z, Attributes, Target)

1. p = createNode ()

2. label (p) = mostCommonClass (Z, Target)

3. IF ∀(x, c(x)) ∈ Z : c(x) = c THEN return (p) ENDIF

4. IF Attributes = ∅ THEN return (p) ENDIF

5. K* = argmax K ∈ Attributes (informationGain (Z, K))

6. FOREACH a ∈ values (K*) DO

      Za = {(x, c(x)) ∈ Z : x|K* = a}

      IF Za = ∅ THEN

         p' = createNode ()

         label (p') = mostCommonClass (Z, Target)

         createEdge (p, a, p')

      ELSE

         createEdge (p, a, ID3 (Za, Attributes \ {K*}, Target))

      ENDIF

   ENDDO

7. return (p)

In this algorithm, steps 1 and 2 initialize a node and its label in the decision tree. Step 3 checks whether all data in Z share the same label (“Landslide-potential” or “non Landslide-potential”); if so, the node is returned as a leaf. Step 4 indicates that if no attribute remains, the decision tree ends with a single node. Otherwise, step 5 seeks the best classifier by finding the attribute with the highest information gain. Step 6 then iterates over the values of the best classifier to form the branches below it. If a value has no samples, the iteration for that branch stops and a final node with its decision (label) is created. Otherwise, ID3 is called recursively on the remaining samples and attributes.
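The recursion in Algorithm 1 can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the tree is represented as nested dictionaries, `domains` (the list of possible values per attribute, needed to create a leaf for an empty branch) and `classify` are hypothetical additions, and the five training records are invented toy data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([row[target] for row in rows]) - remainder


def id3(rows, attributes, target, domains):
    """Return a nested-dict decision tree; leaves are class labels."""
    labels = [row[target] for row in rows]
    majority = Counter(labels).most_common(1)[0][0]  # steps 1-2
    if len(set(labels)) == 1 or not attributes:      # steps 3-4
        return majority
    # Step 5: pick the attribute with the highest information gain.
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in domains[best]:                      # step 6
        subset = [row for row in rows if row[best] == value]
        # An empty branch becomes a leaf labelled with the parent's majority class.
        branches[value] = id3(subset, remaining, target, domains) if subset else majority
    return {best: branches}                          # step 7


def classify(tree, record):
    """Walk the tree until a leaf (a plain label) is reached."""
    while isinstance(tree, dict):
        attribute, branches = next(iter(tree.items()))
        tree = branches[record[attribute]]
    return tree


# Hypothetical toy records, not taken from the study's data.
rows = [
    {"Land Cover": "Settlement", "Rainfall": "High", "Class": "Landslide"},
    {"Land Cover": "Settlement", "Rainfall": "Low", "Class": "Landslide"},
    {"Land Cover": "Settlement", "Rainfall": "Low", "Class": "Landslide"},
    {"Land Cover": "Forest", "Rainfall": "High", "Class": "Landslide"},
    {"Land Cover": "Forest", "Rainfall": "Low", "Class": "No Landslide"},
]
domains = {"Land Cover": ["Settlement", "Forest"], "Rainfall": ["Low", "High"]}
tree = id3(rows, ["Land Cover", "Rainfall"], "Class", domains)
print(tree)
print(classify(tree, {"Land Cover": "Forest", "Rainfall": "Low"}))
```

Passing the attribute domains explicitly is one way to reproduce the empty-subset case of step 6; an implementation that only iterates over values present in the data would never reach that branch.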

D. Testing the Accuracy of Detection Model of Landslide-potential Areas

To test the accuracy of classification and prediction in the model, we first build a confusion matrix [14]. A confusion matrix measures the performance of a two-class decision model, as shown in Table 3. In Table 3, TP (True Positive) is the number of data predicted to be YES that are in fact YES, FN (False Negative) is the number of data predicted to be NO that are in fact YES, FP (False Positive) is the number of data predicted to be YES that are in fact NO, and TN (True Negative) is the number of data predicted to be NO that are in fact NO.

TABLE 3

CONFUSION MATRIX

              Predicted YES   Predicted NO
Actual YES    TP              FN
Actual NO     FP              TN

The formula to determine the accuracy level is as follows [14]:

Accuracy = (TP + TN) / (TP + FP + TN + FN) (3)

Where TP, FP, TN, and FN are generated from confusion matrix.

IV. RESULTS AND TESTS

Our results, as shown in Figure 3, consist of three parts, i.e. data preprocessing results, the decision tree to detect landslide-potential areas with ID3, and the model accuracy test.

Fig. 3. Model of Detection System of Landslide-Potential Areas

[...] two groups, i.e. training data and testing data. We use 471 records from January 2011 to December 2014 (354 with landslides and 117 without landslides) as training data. For testing, we use 20% of the total sample, comprising 118 records (75 with landslides and 43 without landslides), drawn from landslide incident data from January 2015 to June 2015.

We use the training data to develop the decision tree. First, we measure the information gain, or effectiveness level, of each attribute on landslide incidents. We use the entropy value and information gain of each attribute to determine the best classifier, or root of the decision tree, as shown in Table 4. The overall entropy of the training data is computed using (1).
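As a worked instance of (1), and assuming the class counts reported for the training data (354 records with landslides and 117 without, 471 in total) are the ones entering the calculation, the overall entropy would come out at roughly 0.809:

```latex
\mathrm{Entropy}(S)
  = -\frac{354}{471}\log_2\frac{354}{471}
    -\frac{117}{471}\log_2\frac{117}{471}
  \approx 0.809
```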

TABLE 4

ENTROPY VALUE AND GAIN VALUE IN THE FIRST ITERATION

Attribute     Value             Entropy     Gain
Rainfall      Low               0.838008    0.152526
              Medium            0.733302
              High              0.454701
Slope         Flat              0.991526    0.091311
              Corrugated        0.836641
              Steep             0.477429
Soil Type     Light             0.994030    0.024706
              Medium            0.622896
              High              0.852682
Land Cover    Forest            0.439497    0.284734
              Field             0.937963
              Farming Estate    0.739481
              Dry Farming       0.277289
              Settlement        0.543564

As shown in Table 4, Land Cover has the highest information gain (0.284734), which makes it the best classifier and the root of the decision tree. In the second iteration, the best classifiers are placed below the Land Cover node, one for each branch value. Rainfall (gain 0.208) is placed below the Forest value, while Slope (gain 0.177) is placed below Farming Estate. Further, Rainfall (gain 0.121) is placed below the Dry Farming value and Slope (gain 0.701) is placed below the Field value.

After performing the fourth iteration and fully developing the decision tree structure, we generate rules to detect landslides, as can be seen in Table 5. This research produces 36 leaf nodes, of which 19 are “Landslide-potential” and the rest are “non Landslide-potential”.

TABLE 5

RULES GENERATED FROM LANDSLIDE DETECTION MODEL BASED ON LOCAL-LEARNING WITH ID3 ALGORITHM IN JAVA ISLAND

Node  Landslide-Potential Rules Produced

1  IF (Land Cover='Forest' AND Rainfall='High' AND Slope='Corrugated' AND Soil Type='High') THEN 'Landslide-Potential'

2  IF (Land Cover='Farming Estate' AND Slope='Corrugated' AND Rainfall='Medium') THEN 'Landslide-Potential'

3  IF (Land Cover='Farming Estate' AND Slope='Corrugated' AND Rainfall='High' AND Soil Type='Medium') THEN 'Landslide-Potential'

4  IF (Land Cover='Farming Estate' AND Slope='Steep' AND Rainfall='Medium') THEN 'Landslide-Potential'

5  IF (Land Cover='Farming Estate' AND Slope='Steep' AND Rainfall='High') THEN 'Landslide-Potential'

6  IF (Land Cover='Farming Estate' AND Slope='Flat' AND Rainfall='High' AND Soil Type='Medium') THEN 'Landslide-Potential'

7  IF (Land Cover='Farming Estate' AND Slope='Flat' AND Rainfall='High' AND Soil Type='High') THEN 'Landslide-Potential'

8  IF (Land Cover='Settlement' AND Rainfall='Low' AND Slope='Corrugated') THEN 'Landslide-Potential'

9  IF (Land Cover='Settlement' AND Rainfall='Low' AND Slope='Steep') THEN 'Landslide-Potential'

10  IF (Land Cover='Settlement' AND Rainfall='Medium') THEN 'Landslide-Potential'

11  IF (Land Cover='Settlement' AND Rainfall='High') THEN 'Landslide-Potential'

12  IF (Land Cover='Dry Farming' AND Rainfall='Low' AND Soil Type='Light' AND Slope='Steep') THEN 'Landslide-Potential'

13  IF (Land Cover='Dry Farming' AND Rainfall='Low' AND Soil Type='High') THEN 'Landslide-Potential'

14  IF (Land Cover='Dry Farming' AND Rainfall='Medium') THEN 'Landslide-Potential'

15  IF (Land Cover='Dry Farming' AND Rainfall='High' AND Soil Type='Medium') THEN 'Landslide-Potential'

16  IF (Land Cover='Dry Farming' AND Rainfall='High' AND Soil Type='High') THEN 'Landslide-Potential'

17  IF (Land Cover='Field' AND Slope='Corrugated' AND Rainfall='High') THEN 'Landslide-Potential'

18  IF (Land Cover='Field' AND Slope='Steep') THEN 'Landslide-Potential'

19  IF (Land Cover='Field' AND Slope='Flat' AND Rainfall='High' AND Soil Type='High') THEN 'Landslide-Potential'
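The 19 rules of Table 5 can be encoded as data and applied to a discretized area record, as in the Python sketch below. The rule conditions are copied from Table 5; the dictionary representation, the function name, and the choice to treat any record that matches no rule as not landslide-potential (standing in for the 17 leaves the table does not list) are assumptions made for illustration.

```python
# Each rule lists the attribute values that must all match (conditions from Table 5).
RULES = [
    {"Land Cover": "Forest", "Rainfall": "High", "Slope": "Corrugated", "Soil Type": "High"},
    {"Land Cover": "Farming Estate", "Slope": "Corrugated", "Rainfall": "Medium"},
    {"Land Cover": "Farming Estate", "Slope": "Corrugated", "Rainfall": "High", "Soil Type": "Medium"},
    {"Land Cover": "Farming Estate", "Slope": "Steep", "Rainfall": "Medium"},
    {"Land Cover": "Farming Estate", "Slope": "Steep", "Rainfall": "High"},
    {"Land Cover": "Farming Estate", "Slope": "Flat", "Rainfall": "High", "Soil Type": "Medium"},
    {"Land Cover": "Farming Estate", "Slope": "Flat", "Rainfall": "High", "Soil Type": "High"},
    {"Land Cover": "Settlement", "Rainfall": "Low", "Slope": "Corrugated"},
    {"Land Cover": "Settlement", "Rainfall": "Low", "Slope": "Steep"},
    {"Land Cover": "Settlement", "Rainfall": "Medium"},
    {"Land Cover": "Settlement", "Rainfall": "High"},
    {"Land Cover": "Dry Farming", "Rainfall": "Low", "Soil Type": "Light", "Slope": "Steep"},
    {"Land Cover": "Dry Farming", "Rainfall": "Low", "Soil Type": "High"},
    {"Land Cover": "Dry Farming", "Rainfall": "Medium"},
    {"Land Cover": "Dry Farming", "Rainfall": "High", "Soil Type": "Medium"},
    {"Land Cover": "Dry Farming", "Rainfall": "High", "Soil Type": "High"},
    {"Land Cover": "Field", "Slope": "Corrugated", "Rainfall": "High"},
    {"Land Cover": "Field", "Slope": "Steep"},
    {"Land Cover": "Field", "Slope": "Flat", "Rainfall": "High", "Soil Type": "High"},
]


def is_landslide_potential(area):
    """True if the area matches any rule; unmatched areas are assumed not potential."""
    return any(
        all(area.get(attribute) == value for attribute, value in rule.items())
        for rule in RULES
    )


# Hypothetical area records, not taken from the study's data.
settlement = {"Land Cover": "Settlement", "Rainfall": "Medium", "Slope": "Flat", "Soil Type": "Light"}
forest = {"Land Cover": "Forest", "Rainfall": "Low", "Slope": "Flat", "Soil Type": "Light"}
print(is_landslide_potential(settlement))
print(is_landslide_potential(forest))
```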

[...] 0.9237 or 92.37%. We measure our accuracy level based on (3):

Accuracy = (72 + 37) / (72 + 6 + 37 + 3) = 0.9237
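The arithmetic above can be checked with a short Python sketch. The function name is an assumption; the four counts are read off the calculation above in the order of (3), i.e. TP = 72, FP = 6, TN = 37, and FN = 3, which also matches the 75 landslide and 43 non-landslide records in the testing data.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Accuracy as defined in (3): correct predictions over all predictions."""
    return (tp + tn) / (tp + fp + tn + fn)


acc = accuracy(tp=72, fp=6, tn=37, fn=3)
print(f"{acc:.4f}")  # 0.9237
```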

V. DISCUSSION

ID3 is an appropriate algorithm for developing a decision tree to detect landslides because the decision tree it constructs is based on data of the triggering attributes of landslides. For this research, we use discrete and interval data from government agencies. This is consistent with previous research suggesting that the ID3 algorithm works well with non-continuous or discrete data [13].

This algorithm maps previous landslide incidents into a decision tree structure. A landslide incident in a particular area can be related to similar soil or climate characteristics of other areas that experienced landslides previously [4].

Consequently, when the decision tree's decision is “Landslide” based on previous incidents, this research converts the decision into “potential”, which suggests that the area will potentially experience a landslide in the future. Likewise, the decision “no Landslide” is converted into “not potential”, which implies that the area has low landslide potential in the future.

A decision tree can also be used to analyze the relationship between attributes and landslide incidents [4]. The decision tree presents the results of the learning algorithm in a hierarchical structure in which the attribute at the highest level is the most important attribute influencing landslide potential [4] [15]. We determine the attribute at the highest level (the root) by taking the highest information gain in the first iteration [16]. Table 4 shows the information gain, or influence value, of each attribute in the first iteration. Land Cover has the highest influence, as indicated by its information gain of 0.284734, so it is placed at the root. Our results suggest that poor land use planning in Java is the main cause of various natural disasters such as sedimentation, erosion, landslides, and diminishing water availability. Land use planning practices often neglect environmental sustainability [17].

The second iteration of the decision tree produces Rainfall and Slope as the most influential attributes after Land Cover. High rainfall in a particular area considerably influences landslide potential, especially in settlement areas. High rainfall will potentially trigger a landslide if the settlement area is located on a corrugated or steep slope. Meanwhile, in field and farming estate areas, slope is sufficiently influential in explaining landslide incidents, especially on corrugated and steep slopes.

Land Cover is the only attribute that can be controlled by social activities or by government policies through regulation. Globally, many areas frequently experience landslides, especially those located in developing countries. The main causes of these incidents include poor land and spatial use planning, expanding human settlement areas, careless land conversion practices, and climate change [2]. Among the various attributes, land cover is the most sensitive to environmental change and human activities. Therefore, management of landslide-potential areas has to take land cover structure into account [1].

Our results could serve as an early warning for society, government, and the private sector (developers), especially in planning land use in three provinces that are among the most densely populated and most prone to landslides, i.e. West Java, Central Java, and Jogjakarta Special Region [3].

The ID3 algorithm can learn the local conditions of an area. Compared to similar research that relies on decision trees, our results show some differences. Research on landslide modelling using the Chi-square Automatic Interaction Detector (CHAID) decision tree in Penang, Malaysia, suggests that slope is the most influential attribute [4]. These results are understandable since Penang is dominated by areas with corrugated and steep slopes: only 43.28% of the Penang area is flat. Further, Penang has significant portions of forest and (fruit) farming estate areas, much higher than our areas of research. Forest areas can catch a significant amount of water, thus securing the water flow and sustaining slope stability [18]. Other research that uses a decision tree in Hoa Binh, Vietnam, shows that the distance between landslide points and streets is the most influential attribute of landslide disasters [15]. These results are similar to ours since street development is a part of land cover. Street and settlement land covers could disrupt natural topology and affect slope stability [15].

Further, the ID3 algorithm achieves an accuracy level of 92.37%. Table 7 compares the accuracy level of the ID3 algorithm with that of other methods in similar research.

TABLE 7

COMPARISON OF ACCURACY LEVELS OF VARIOUS LANDSLIDE DETECTION METHODS

Method Group         Method Name            Accuracy
Decision Tree        ID3 (this research)    92.37%
                     CHAID                  81.90%
                     Exhaustive CHAID       82.00%
                     CRT                    75.60%
                     QUEST                  74.00%
Non-Decision Tree    GIS-MCDA               77.49%
                     SVR                    75.12%
                     LR                     69.41%

[...] weighting but local-learning of previous incidents in a particular area. In this table, excluding the ID3 method, the Exhaustive CHAID method has the highest accuracy (82%) among the Decision Tree methods [4], while among the Non-Decision Tree methods, the GIS-MCDA method has the highest accuracy of 77.49% [1]. Meanwhile, our research shows that the ID3 method has an accuracy level of 92.37%, the highest in the table, suggesting that this method performs better in developing a landslide detection model.

VI. CONCLUSION

This research models landslide detection based on local-learning using the ID3 algorithm in the three provinces of Central Java, West Java, and Jogjakarta Special Region in Indonesia. The accuracy level of this model is 92.37%, far better than that of other, weighting-based methods. It can then be concluded that previous landslide incidents can be used as a warning to stay alert to potential landslides in the future.

Landslides in different areas may have different triggering factors, and different geographical conditions may explain the differences. The ID3 algorithm, based on local-learning from previous incidents, could accommodate differences in geographical conditions between areas.

This research suggests that Land Cover is the main triggering attribute of landslides in Java, Indonesia. Land use and conversion as a triggering attribute of landslides is heavily influenced by societal dynamics and government policy. Therefore, it is expected that our research could inform governments in land use planning, especially in Java. Planning for settlement areas must also combine other attributes, such as rainfall, slope, and soil type. For settlement in an area without landslide potential, one has to take climate into account, preferring areas with low rainfall (less than 2000 mm/year) and flat slope (less than 15%). Meanwhile, areas with high rainfall (more than 3000 mm/year) and steep slope (more than 30%) should not serve as settlement areas but must be preserved as natural forest.

It is expected that future research could add other landslide-triggering attributes. In addition, one can use a more detailed scale in the discretization process or use another decision tree method such as C4.5. Further, similar research in other areas is welcome since each area has its distinctive geographical characteristics.

ACKNOWLEDGMENT

The author wishes to thank the Indonesian National Disaster Prevention Agency, which provided the landslide data for the analysis.

REFERENCES

[1] T. Kavzoglu, E.K. Sahin, I. Colkesen, “Landslide Susceptibility Mapping using GIS-Based Multi-Criteria Decision Analysis, Support Vector Machines, and Logistic Regression”, Landslides, vol. 11 no. 3, pp. 425-439, June 2014.

[2] C. Yilmaz, T. Topal, M.L. Suzen, “GIS-Based Landslide Susceptibility Mapping using Bivariate Statistical Analysis in Devrek (Zonguldak-Turkey)”, Environmental Earth Sciences, vol. 65, pp. 2161-2178, July 2011.

[3] Indonesia Disaster Information on December 2014, BNPB, Jakarta, 2014.

[4] M.S. Alkhasawneh, U.K. Ngah, “Modeling and Testing Landslide Hazard Using Decision Tree”, Journal of Applied Mathematics, vol. 2014, pp. 1-9, February 2014.

[5] M.S. Tehrany, M.N. Jebur, B. Pradhan, “Spatial Prediction of Flood Susceptible Areas using Rule Based Decision Tree (DT) and a Novel Ensemble Bivariate and Multivariate Statistical Models in GIS”, Journal of Hydrology, vol. 504, pp. 69-79, September 2013.

[6] A.J. Myles, “An Introduction to Decision Tree Modeling”, Journal of Chemometrics, 2004.

[7] R.B. Kheir, “Spatial Soil Zinc Content Distribution from Terrain Parameters: a GIS-Based Decision-Tree Model in Lebanon”, Environmental Pollution, vol. 158, pp. 520-528, 2010.

[8] O.O. Adeyemo, T.O. Adeyeye, “Comparative Study of ID3/C4.5 Decision tree and Multilayer Perceptron Algorithms for the Prediction of Typhoid Fever”, African Journal of Computing and ICT, vol. 8 no. 1, pp. 103-112, March 2015.

[9] The Law of Public Work’s Ministry Republic of Indonesia No. 22/PRT/M/2007 on Disaster Management, Departemen Pekerjaan Umum Republik Indonesia, Jakarta, 2007.

[10] H. Bhalekar, S. Kumbhar, “Pre-processing data using ID3 classifier”, International Journal of Engineering and Techniques, vol. 1 no. 3, pp. 68-73, June 2015.

[11] The Rule of Soil Damage Mapping, Kementerian Lingkungan Hidup Republik Indonesia, Jakarta, 2009.

[12] K. Adhatrao, A. Gaykar, “Predicting Students’ Performance Using ID3 and C4.5 Classification Algorithms”, International Journal of Data Mining and Knowledge Management Process (IJDKP), vol. 3 no. 5, pp. 39-52, September 2013.

[13] M. Slocum, “Decision Making Using ID3 Algorithm”, InSight: River Academic Journal, vol. 8 no. 2, pp. 1-12, 2012.

[14] D.L. Gupta, A.K. Malviya, “Performance Analysis of Classification Tree Learning Algorithms”, International Journal of Computer Applications, vol. 55 no. 6, pp. 39-44, October 2012.

[15] D.T. Bui, “Landslide Susceptibility Assessment in Vietnam Using Support Vector Machines, Decision Tree, and Naive Bayes Models”, Mathematical Problems in Engineering, vol. 2012, pp. 1-26, April 2012.

[16] D.M. Farid, L. Zhang, “Hybrid decision tree and naïve Bayes classifiers for multi-class classification tasks”, Expert Systems with Applications, vol. 41, pp. 1937-1946, 2014.

[17] Maridi, A. Saputra, “Role of Vegetation for Water and Soil Conservation in Watershed: Case Study in 3 Sub-Watershed of Bengawan Solo (Keduang, Dengkeng, dan Samin)”, in National Seminar on Conservation and Utilization of Natural Resources, Surakarta Indonesia, 2015.

AUTHORS’ INFORMATION

Yeremia A. Susetyo is a Master's student in the Faculty of Information Technology, Satya Wacana Christian University, Salatiga. He completed a Bachelor's degree in information technology with a focus on artificial intelligence and Geographic Information Systems.

Daniel F. H. Manongga is a professor and Head of the Master Program in Information Systems, Faculty of Information Technology, Satya Wacana Christian University, Indonesia. He received his B.Eng (Electronics) from Satya Wacana Christian University, Indonesia, his MSc (Information Technology) from Queen Mary College, University of London, and his PhD (Management Sciences) from the University of East Anglia, UK. His research interests include operations research and business intelligence.
