ICGC-RCICT
2010
2-3 March 2010
ISSN: 2086-4868
Department of Electrical Engineering
and Information Technology
Faculty of Engineering
Gadjah Mada University
Jalan Grafika no 2 Kampus UGM
Yogyakarta, 55281, Indonesia
PROCEEDINGS OF
THE FIRST INTERNATIONAL CONFERENCE
ON GREEN COMPUTING
AND
THE SECOND AUN/SEED-NET
REGIONAL CONFERENCE ON ICT
Table of Contents
Inner Cover i
Organizing Committee ii
Foreword iii
Schedule v
Table of Contents ix
Part 1: SPECIAL LECTURE
Lec #1
Energy Efficiency: A Mandatory Driver for a Changing World 1
Aris Dinamika
Lec #2
Experimental Platform in 5-GHz Band Wireless Communication Using an 80-MHz Bandwidth MIMO-OFDM System
13
Singo Yoshizawa, Shinya Odagiri, Yasuhiro Asai, Takashi Gunji, Takashi Saito, Yoshikazu Miyanaga
Lec #3
Virtual Reality and Augmented Reality and their Applications to Medical Diagnosis and Green Computing
17
Kazuhiko HAMAMOTO
Lec #4
Ubiquitous Sensor Networks for Monitoring the Environment: Adaptive Sensing Based on Profiles 23
Yoshiteru Ishida
Lec #5
Hierarchical Consensus for Large Scale Networked Dynamical Systems 28
Shinji Hara
Lec #6
Filter Multicast: A Communication Support for Dynamic Vehicle Platoon Management 34
Hiroshi Shigeno and Ami Uchikawa
Part 2: TECHNICAL PAPERS
CAM #1
Analysis of Single-Phase Passive and Active EMI Filter Performance for Induction Motor Drive 41
Chanthea Khun, Werachet Khan-gnern, Masaaki Kando
CAM #2
Dynamic Modeling and Control of a SEPIC Converter in Discontinuous Conduction Model 45
Vuthchhay Eng and Chanin Bunlaksananusorn
CAM #3
Prototype of Sound-based News on Demand Application in Mutli-languages for partial/fully illiterates 52
Annanda Thavymony RATH
INA #1
Experiences in Implementing General-Purpose Applications on CUDA 56
Achmad Imam Kistijantoro
IND #1
Epitomizing Green Computing 60
P.U. Bhalchandra , Dr.S.D.Khamitkar , N.K.Deshmukh , S.N.Lokhande
JPN #1
Range-Free Localization Algorithm Using Distance Deviation Weighted Factor in Wireless Sensor Network
68
Thida Zin Myint, Tomoaki Ohtsuki
JPN #2
Stereo-Based Ground Plane Estimation: A Reliable Approach for Low Texure Stereo Images 74
Sach Thanh LE, Kazuhiko HAMAMOTO, Kiyoaki ATSUTA, Shozo KONDO
LAO #1
3D Reconstruction from X-ray Fluoroscopy using Exponential Opacity Setting for Differential Volume Rendering
82
MAS #1
Flat Fibre Technology: Towards a Truly Flexible, Distributed Optical Sensor for Environmental Protection and Preservation
88
F.R. Mahamd Adikan, S.R. Sandoghchi
MAS #2
Compact Bismuth-based Erbium-doped Fiber Amplifier With Wide-band Operation 92
S. W. Harun, X. S. Cheng, H. Ahmad
MAS #3
Adopting Variable Strength Interaction Testing: Some Practical Issues 96
Kamal Z. Zamli, Mohammed I. Younis
MAS #4
Optimum Operation of Pump at Water Treatment Plant for Achieving Energy Saving 102
Mohamad Aris Ramlan, Amin Parvizi, Mohamad Rom Tamjis
MYA #1
Sustainable Development of ICT for Rural Areas in Myanmar 107
Khin Mar Soe
MYA #2
Green in ICT Utilization for Sustainable Development of Myanmar 112
Soe Soe Khaing
PHI #1
KineSpell2 A Full VAK Approach in Learning Spelling 117
Rose Ann Sale, Manica Dimaiwat, Ma. Rowena Solamo, Rommel Feria
PHI #2
Event Driven Reconfigurable Architecture for Real-time Multiple Human Motion Tracking and Profiling
121
Cesar A. Llorente
SIN #1
Fair End-to-end Bandwidth Allocation (FEBA) Algorithms Observation and Improvement 126
Nhan H. Truong, Trang T.T. Do, Duong Tran
SIN #2
Cross Directional Rectangle Search for Fast Block-Matching Motion Estimation 132
Binh P. Nguyen, Trang T. T. Do
SIN #3
A High-Accuracy and High-Speed 2-D 8x8 Discrete Cosine Transform Design 135
Trang T.T. Do, Binh P. Nguyen
THA #1
Error Concealment in H.264 Spatial Scalable Video using Improved Motion Estimation 140
Simon Jude Q. Lam, Supavadee Aramvith
THA #2
Evaluation of Blind Modulation Detection in Adaptive OFDM Systems 142
Donekeo LAKANCHANH, Shingo YOSHIZAWA, Suthichai NOPPANAKEEPONG, Yoshikazu MIYANAGA
THA #3
An Algorithm for Q-Factor Evaluation A Case Study on 40 Gbps Hybrid Amplified Optical Transmission System
147
Nayot Kurukitkoson
THA #4
On-line Verification Algorithm for Flexible Interval Representation System 152
Napat Rujeerapaiboon, Athasit Surarerks
THA #5
Active Contour Using Local Region Based Force with Adaptive Length Search Line 157
Sopon Pumeechanya, Charnchai Pluempitiriyawej, Saowapak Thongvigitmanee
THA #6
Running Network Intrusion Detection System on a Recycled Personal Computer 163
Pornchai Korpraserttaworn and Surin Kittitornkun
THA #7
Directional Filtered Fingerprint Images: An Investigation and Evaluation of Minutiae Consistency 167
Keokanlaya Sihalath, Somsat Choomchuay, Kazuhiko Hamamoto
VIE #1
A Proposal of Internet-based Monitoring and Control Systems Using OPC UA 172
VIE #2
Providing Qualified Intensional Answers Using Fuzzy Concept Hierarchies 177
Phong Tuan Ngo, Phuong Hong Nguyen, Anh Kim Nguyen
VIE #3
Credibility Checking-Based Scheduling Schemes for Desktop Grid Computing 186
Son Hong Ngo, Masaru Fukushi, Xiaohong Jiang
VIE #4
New Method to Implement Intelligent Street Lighting System in Vietnam 190
Khoi Phan-Dinh
VIE #5
High-throughput Pattern Matching Engine for Network Intrusion Detection System 194
Tran Ngoc Thinh, Surin Kittitornkun
Bali #1
SMS-Based Technology for Data School Collecting in Bali 199
I Ketut Gede Darma Putra, I Putu Agung Bayupati, AA. Kompiang Oka Sudana
Jaka #1
Design and Implementation of Solar Tracker in Solar Energy Conversion Systems 205
Hadian Satria Utama, Endah Setyaningsih, John Son
Jaka #2
The Assesment of Quality of Service (QoS) in IP/MPLS Network 209
Yulianus, Riri Fitri Sari
Madu #1
Smile Stages Recognition in Orthodontic Rehabilitation Using 2D-PCA Feature Extraction 214
Rima Tri Wahyuningrum, Mauridhi Hery Purnomo, I Ketut Eddy Purnama
Maka #1
The Internet Protocol Design Framework to Real Time Communication Application Development 217
Mingsep Sampebua, Lukito Edi Nugroho, Jazi Eko Istiyanto
Pale #1
Reliability Measurement of Internet Services 225
Deris Stiawan, Abdul Hanan, Mohd. Yazid Idrus
Pale #2
An Integrated Model of Collaborative Learning in Higher Education 230
Rendra Gustriansyah
Papu #1
The Evaluation of Non-fundamental Frequency Apparent Power and Fundamental Unbalanced Load Apparent Power Measurements According to IEEE Standard 1459-2000 in the Big Consumers
237
Maurits A. Paath, Suharyanto, Tumiran, Avrin Nur Widyastuti
Sala #1
Data Authentication in Network Forensic Using MD5 and CRC32 Method 245
Irwan Sembiring, Jazi Eko Istiyanto
Sema #1
Application of Decision Support System to Determine the Level of Drug-Resistance Tuberculosis 253
Sari Wijayanti, Adi Setiya Dwi Grahito, Kuwat Triyana
Sema #2
Image Processing Intelligent System Iris Eye Real-Time Compound Distribution Planning At Eye Center Sultan Agung Hospital
257
Ida Widihastuti, Sari Ayu Wulandari
Sura #1
Energy-Efficient Protocol of Wireless Sensor Network using Carrier Signaling 264
A. Sjamsjiar Rachman, Wirawan, Gamantyo Hendrantoro
Sura #2
Performance Evaluation of Adaptive Coded Modulation and Maximal Ratio Combining in Millimeter-Wave Fixed Cellular System Under the Impact of Rain Attenuation and Interference in Tropical Region
272
S. Tahcfulloh, Suwadi, G. Hendrantoro
Sura #3
Efficiency Comparison on Eclat, FP-Tree, and Top-Down Algorithms in Search L-Itemsets 276
Gunawan, I.K.E Purnama
Sura #4
Collocation Detection for Indonesian 281
Sura #5
Optimization of Size and Location of Capacitor Banks on Distribution Systems Using an Improved Adaptive Genetic Algorithm
285
Patria Julianto, Imam Robandi
Sura #6
Position Control of Manipulator’s Links Using Artificial Neural Network with Backpropagation Training Algorithm
290
Thiang, Handry Khoswanto, Tan Hendra Sutanto
Sura #7
Maximum Power Point Tracker Using Buck-Boost Converter as Variable Resistance 295
Vita Lystianingrum, Hendra Hardiansyah, Mochamad Ashari
Sura #8
Filtering of normal, and laryngectomiced patient voices using ANFIS: A Preliminary evaluation of ANFIS algorithm compared to LPF, median, and average Filter
299
Andy Noortjahja, Tri Arief Sardjono, MH Purnomo
Sura #9
Compression and Reconstruction of Digital Dental Image on Fuzzy Transform 304
Ingrid Nurtanio, I Ketut Eddy Purnama, Mauridhi Hery Purnomo
Sura #10
Simulation of Desicion Support System for Pricing Grid Enterprise in Renderring Farm Using Neural Network
308
Lilik Anifah, Tommy Alma marif, Haryanto, Buddy Setiawan, Darda Insan Alam, I Ketut Edi Purnama, Mochammad Hariadi, Mauridhi Hery Purnomo
Sura #11
Backpropagation Neural Network based Reference Control Modeling for Mobile Solar Tracker on a Large Ship
314
Budhy Setiawan, Mochamad Ashari, Mauridhi Hery Purnomo
Sura #12
Design and Performance Analysis of Speed Controller in Induction Motor with Sliding Mode Control 320
Mardlijah, Lusiana Prastiwi, M Hery Purnomo
Sura #13
High Performance of Nonlinear DC Motor Speed Control using Backpropagation Neural Network 325
Saidah, Mohamad Ashari, Mauridhi Heri Purnomo
AAU #1
Knowledge Sharing in Knowledge-Growing-based Systems 329
Arwin Datumaya Wahyudi Sumari, Adang Suwandi Ahmad, Aciek Ida Wuryandari, Jaka Sembiring
AAU #2
Savings Time Execution Prima Numbers Generator Using Bit-Array Structure 334
Rakhmat Hidayat, A. Rida Ismu Windyarto, Herdjunanto, Samiadji, Himawan, H
ISTA
Design and Implementation of M-Learning for Increasing Flexibility of Learning System 340
Gatot Santoso, Edy Sutanta, Rifianto Suryawan
UNY #1
Providing User Quality of Service Specification for Communities with Low Connectivity 345
Ratna Wardani, F. Soesianto, Lukito Edi Nugroho, Ahmad Ashari
UNY #2
ElectroLarynx, Esopahgus, and Normal Speech Classification using Gradient Discent, Gradient discent with momentum and learning rate, and Levenberg-Marquardt Algorithm
351
Fatchul Arifin, Tri Arief Sardjono, Mauridhi Hery
UNY #3
Analysis of Green Computing Strategy in University: Analytic Network Process (ANP) Approach 358
Handaru Jati
UNY #4
Website Quality Evaluation Comparison: An Empirical Study in Asia 365
Handaru Jati
TETI #11
Adaptive LMS Noise Cancellation of Wideband Vehicle’s Noise Signals 372
Sri Arttini Dwi Prasetyowati, Adhi Susanto, Thomas Sriwidodo, Jazi Eko Istiyanto
TETI #12
Design of Power Plant Boiler Temperature Monitoring Application using Sound Card as ADC and Data Acquisition Device
379
TETI #21
Green Analysis : a System Approach A Case Study: Sensor Fault Detection and Isolation of a web winding system
384
Herdjunanto Samiadji, Soesianto, Susanto Adhi,Widodo Sri T
TETI #22
Computation Time Reduction as a Strategy to Reduce Energy in Green Design A Case Study: Model Parameter Estimation on Unwinding Part of a Web Winding System
380
Herdjunanto Samiadji
TETI #23
Embedded Webserver Applications for Industrial Process Monitoring 396
Risanuri Hidayat, Bambang Sutopo, Astria Nur Irfansyah, Ery Dwi Kurniawan
TETI #24
PLC-based Power Factor Regulator 401
Muh. Isnaeni B. S., Aprilyan Wahyu N., Astria Nur Irfansyah
TETI #25
Implementation of Low Cost Modbus Protocol-Enabled Embedded Interfaces for Industrial Data Communication
405
Astria Nur Irfansyah, Eny Sukani Rahayu
TETI #26
Gesture-based Mouse Cursor Movement Using Template-Matching: An Early Study 409
Paulus Insap Santosa
TETI #27
Low-Powered Wireless Sensor Network for Indoor Localization 412
Widyawan
TETI #28
Developing Multi-applicationed smart card system with ITSO standard 418
Enas Dhuhri Kusuma, Litasari
TETI #29
Image Processing Using ARM9-based Embedded System Case Study : Color based tracking using ARM9 processor and camera module
421
Enas Dhuhri Kusuma, Addin Suwastono, Wahyu Dewanto
TETI #30
QAM Mapper-Demapper for an Adaptive Modulation OFDM: from Learning Model to FPGA-Based Implementation
426
Budi Setiyanto, ‘Imaduddin, Iqbal Prayuda Aditya, Mulyana, Astria Nur Irfansyah, Risanuri Hidayat, Litasari
TETI #31
Information Technology Infrastructure Flexibility and The Enhancement of the Business/IT Strategy Alignment
431
Wahyuni
TETI #32
Analysis of Solar Cell Traffic Light System in Kota Yogyakarta 435
Figure 1. Association Rule Mining Process
Efficiency Comparison on Eclat, FP-Tree, and
Top-Down Algorithms in Search L-Itemsets
Gunawan *,**), I.K.E Purnama *)
Email: [email protected] [email protected]
*) Department of Electrical Engineering, Faculty of Industrial Technology, Institut Teknologi Sepuluh Nopember, Kampus ITS Sukolilo, Surabaya 60111, East Java, Indonesia.
**) Department of Computer Science, Sekolah Tinggi Teknik Surabaya, Ngagel Jaya Tengah 73-77, Surabaya 60284, East Java, Indonesia.
Abstract— We compare the efficiency of three well-known algorithms, Eclat, FP-Tree, and Top-Down, for finding large itemsets in a huge database. Eclat reduces the number of database scans. It starts from the minimal itemsets and proceeds by intersecting items, repeating the process until an infrequent node is found (one whose support is less than the minimum support). The Top-Down algorithm is almost the same as Eclat. The items in the database are joined into one itemset, and new combinations are formed by deleting one item from the itemset in turn. The process is repeated recursively until a frequent node is found. The FP-Tree algorithm mines without generating candidate itemsets. The original database is stored in tree form, and mining is then performed one by one for every frequent item in the database. Each of the three algorithms has its own strengths and weaknesses. Eclat and Top-Down have an advantage in how they create tree branches: Eclat creates a branch only after checking whether the combination exists in the database, whereas Top-Down checks whether the branch has already been created by another node. The advantage of FP-Tree is that, unlike Apriori, it generates very few candidate itemsets. The weakness of Eclat and Top-Down depends on the characteristics of the database: if the database has many frequent items and the minimum support value is small, the combination process wastes time. The weakness of FP-Tree is that it stores the original database in tree form and therefore uses a lot of memory.
Keywords-data mining; association rule mining; Eclat; FP-Tree; Top-Down;
I. INTRODUCTION
Data mining has contributed greatly to the development of various disciplines. It aims to find patterns in large collections of data and plays an important role in decision-making at large-scale companies, whose transaction data is processed to discover such patterns. Before a pattern is found, the data must pass through several steps: data selection, data cleaning and preprocessing, data transformation, data reduction (elimination of unnecessary data), selection of the data mining task and algorithm, post-processing, and finally application of the data mining techniques.
One of the tasks of data mining is to discover frequent itemsets, that is, sets of items that often occur together. Frequent itemsets play an important role in finding specific patterns in a database, such as association rules, correlations, sequences, episodes, classifiers, and clusters. Of these patterns, association rules are among the most prominent. Association rule mining was motivated by the analysis of supermarket transactions, from which customer shopping behavior can be studied. An association rule states how often a product is purchased together with other products.
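To make "how often a product is purchased together with other products" concrete, the following minimal sketch counts the support of an itemset and derives the confidence of one rule. The basket data and the helper name `support_count` are illustrative assumptions, not taken from the paper.

```python
def support_count(transactions, itemset):
    """Return the number of transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))


# Hypothetical supermarket baskets, used only for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

both = support_count(baskets, {"bread", "milk"})   # baskets with bread AND milk
bread = support_count(baskets, {"bread"})          # baskets with bread
confidence = both / bread                          # confidence of bread -> milk
print(both, bread, round(confidence, 2))
```

An itemset is called frequent (large) when its support count reaches the chosen minimum support.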
Figure 1 shows that the data mining process starts with the preparation of data sets. The filtered data is then passed to the next step, the selection of an appropriate algorithm to search for L-itemsets (large itemsets).
Various algorithms can be used to discover large itemsets, but this study discusses only the Eclat, FP-Tree, and Top-Down algorithms.
II. RELATED THEORIES
A. FP-Tree Algorithm
This section discusses frequent pattern search using the Frequent Pattern Tree (FP-Tree) algorithm. The input to the FP-Tree algorithm consists of two fields: an index field and a target field.
The index field is used only to sort the data. The FP-Tree algorithm has no candidate generation step, because candidate generation is a major source of delay. The large itemset search process using the FP-Tree algorithm is shown in Figure 2.
Figure 2. Large Itemset Search Process with FP-Tree
Figure 3. First Step of FP-Tree
FP-Tree is a database mining algorithm that minimizes the candidate search process: only items that appear frequently are searched in the FP-Tree. In addition, FP-Tree uses a tree structure so that it can avoid reading the database repeatedly. This differs from the Apriori algorithm, which reads the database again and again and therefore wastes time.
1) Tree Construction
The FP-Tree data structure must be designed so that it accurately holds all the information required by the mining process. Its design is based on the following observations:
• The database should be scanned only once. This scan must identify the items that belong to frequent itemsets, and the frequency of each item must be stored along with its name, because mining is performed on frequent items only. Infrequent items are removed.
• If the frequent items of each transaction are stored in a suitable structure, repeated scans of the database can be avoided.
• Transactions that have the same set of frequent items can be merged into one, with only the frequency count increased. To determine whether two item sets are the same, the items of each transaction must be sorted in a fixed order, either ascending or descending.
• If two transactions share a common prefix under a particular ordering of frequent items, the shared part can be merged, with the frequency counts accumulated. If the frequent items are sorted in descending order of frequency, shared prefixes are more likely to occur.
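The observations above can be sketched as a single counting pass followed by a reordering of each transaction. This is a minimal illustration, not the authors' implementation. The five transactions are an assumption: Table 1 is not reproduced in this text, so the sketch uses the widely cited FP-growth textbook example, which is consistent with the paths quoted later in this paper. Ties between equally frequent items are broken by first appearance, which is an arbitrary choice.

```python
from collections import Counter

# Assumed example database (Table 1 is not reproduced in this text).
TRANSACTIONS = [
    list("facdgimp"),
    list("abcflmo"),
    list("bfhjo"),
    list("bcksp"),
    list("afcelpmn"),
]
MIN_SUPPORT = 3


def frequent_item_order(transactions, min_support):
    """Count each item once per transaction and keep only the frequent ones."""
    counts = Counter(item for t in transactions for item in dict.fromkeys(t))
    frequent = {item: n for item, n in counts.items() if n >= min_support}
    # Stable sort: descending frequency, ties keep first-appearance order.
    order = sorted(frequent, key=lambda item: -frequent[item])
    return frequent, order


def reorder(transaction, order):
    """Drop infrequent items and sort the rest by the global item order."""
    rank = {item: pos for pos, item in enumerate(order)}
    kept = [item for item in dict.fromkeys(transaction) if item in rank]
    return sorted(kept, key=rank.get)


frequent, order = frequent_item_order(TRANSACTIONS, MIN_SUPPORT)
first_ordered = reorder(TRANSACTIONS[0], order)
print(frequent, first_ordered)
```

Because every transaction is rewritten in the same global order, transactions that share frequent items also share a prefix, which is what allows them to be merged in the tree.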
2) Example of Tree Formation
The following transaction database is used to clarify the construction of the FP-Tree. The database contains the transaction ID and the items purchased, as shown in the first two columns of Table 1, and a minimum support of 3 is used.
The first step of the FP-Tree algorithm is to scan the database to obtain all frequent 1-itemsets, recording the frequency of each item and comparing it with the minimum support. If the frequency of an item is below the minimum support, the item is not used in the subsequent steps.
The frequent items obtained are then sorted in descending order of frequency. Next, the root of the tree is created and initialized with NIL. The database is then scanned a second time, and the items in each transaction are sorted according to the item order produced in the previous step. The sorted transactions are shown in Table 1.
Figure 4. Large Itemset Search Process with Eclat (flowchart boxes: Data set; Search and Sort Frequent Item; Frequent Item; Combination Process for each Frequent Item; Large Itemsets)
... of all rows equal to one. The FP-Tree after the first transaction has been inserted is shown in Figure 3. The same process is repeated through the fifth transaction.
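The tree formation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `FPNode`, the function `build_fp_tree`, and the list `ORDERED` are hypothetical, and the ordered transactions are assumed from the paths quoted in this paper (f, c, a, m, p and c, b, p), since Table 1 is not reproduced here.

```python
class FPNode:
    """One node of the tree: an item, its count, and links to parent/children."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(ordered_transactions):
    """Insert each ordered transaction as a path starting at the NIL root."""
    root = FPNode(None, None)      # the root is initialized with NIL
    header = {}                    # header table: item -> nodes holding that item
    for transaction in ordered_transactions:
        node = root
        for item in transaction:
            child = node.children.get(item)
            if child is None:      # no shared prefix here: open a new branch
                child = FPNode(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1       # shared prefix: only the count increases
            node = child
    return root, header


# Assumed frequency-ordered transactions (frequent items only).
ORDERED = [list("fcamp"), list("fcabm"), list("fb"), list("cbp"), list("fcamp")]

root, header = build_fp_tree(ORDERED)
print(root.children["f"].count, len(header["p"]))
```

After only the first transaction is inserted, every node on its path has a count of one; later transactions either increase counts along a shared prefix or open new branches.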
3) Mining Process with FP-Tree
In general, mining with the FP-Tree can be divided into four steps:
1. Mining starts from the item with the smallest support in the header table.
2. Find the frequent patterns that contain the item being mined.
3. Find the k-conditional pattern base for each item k being mined.
4. Build the conditional FP-Tree from the conditional pattern base formed in the previous step.
4) Conditional Pattern Base Search
Once the tree has been built, the next step in mining the FP-Tree is to find the conditional pattern base of each item in the example of Table 1. The following example mines the p-conditional pattern base.
Mining in this example starts from node p. The header table is consulted to follow the head link of node p from its first node to its last. This yields the frequent patterns containing p, namely (f: 4, c: 3, a: 3, m: 2, p: 2) and (c: 1, b: 1, p: 1). The first path shows that (f, c, a, m, p) appears twice in the database. The path (f, c, a) appears three times and the path (f) itself appears four times, but they appear together with p only twice. Therefore, the path counted for p is (f: 2, c: 2, a: 2, m: 2, p: 2), and its prefix path is (f: 2, c: 2, a: 2, m: 2).
For the path (c: 1, b: 1, p: 1), the prefix path used is (c: 1, b: 1). These two prefix paths form the sub-pattern base of p, called the p-conditional pattern base.
The p-conditional pattern base is defined as the sub-pattern base that occurs under the condition that p exists.
The support of each item contained in the prefix paths obtained for node p is then counted. The resulting support of each item is shown in Table 2.
The minimum support for the example of Table 1 is 3. Table 2 therefore shows that only item c qualifies, so c is the only frequent item. The next step is to build the p-conditional FP-Tree. Because only item c meets the requirement, the p-conditional FP-Tree consists of a single branch.
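The p-conditional pattern base and the support counts described above can be reproduced with a short sketch. As a simplification, it reads the prefix paths directly from the frequency-ordered transactions instead of following head links in the tree; both give the same prefix paths and counts. The function names and the list `ORDERED` are illustrative assumptions, with the transactions assumed from the paths quoted in the text.

```python
from collections import Counter

# Assumed frequency-ordered transactions (frequent items only).
ORDERED = [list("fcamp"), list("fcabm"), list("fb"), list("cbp"), list("fcamp")]
MIN_SUPPORT = 3


def conditional_pattern_base(ordered_transactions, item):
    """Collect the prefix path that precedes `item`, with its occurrence count."""
    base = Counter()
    for transaction in ordered_transactions:
        if item in transaction:
            prefix = tuple(transaction[:transaction.index(item)])
            if prefix:
                base[prefix] += 1
    return base


def conditional_frequent_items(base, min_support):
    """Sum the support of each item over all prefix paths and keep frequent ones."""
    support = Counter()
    for prefix, count in base.items():
        for item in prefix:
            support[item] += count
    return {item: n for item, n in support.items() if n >= min_support}


p_base = conditional_pattern_base(ORDERED, "p")
p_frequent = conditional_frequent_items(p_base, MIN_SUPPORT)
print(dict(p_base), p_frequent)
```

Only c reaches the minimum support of 3 inside the prefix paths of p, which is why the p-conditional FP-Tree has a single branch.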
B. Eclat Algorithm
The Eclat algorithm was first introduced by Mohammed Javeed Zaki in 1997. Its emergence cannot be separated from Apriori. Unlike the Apriori algorithm, Eclat does not read the database repeatedly.
The input to the Eclat algorithm consists of two fields, an index field and a target field. The large itemset search process using the Eclat algorithm is shown in Figure 4.
1) Mining Process with Eclat
Mining with the Eclat algorithm is basically performed by intersecting the items that satisfy the minimum support value. Intersecting only the frequent items speeds up the mining process. The longer the frequent itemset, the fewer intersection operations are performed, so fewer join operations are needed as well.
The Eclat algorithm does not require a complex hash-tree structure, so it does not use much memory. Eclat also allocates memory well, because it only needs a linear scan of the itemsets. In the Eclat algorithm, the smaller the minimum support value, the faster the frequent itemsets are found. As a result, the mining process is also faster, which is beneficial because it reduces operational costs.
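The intersection-based mining described above can be sketched as follows. This is a minimal illustration of the general Eclat idea (a vertical layout of transaction-ID sets that are intersected to extend itemsets), not the authors' program. The function name `eclat` and the example transactions are assumptions, since Table 1 is not reproduced in this text.

```python
def eclat(transactions, min_support):
    """Return {itemset (tuple): support} for all frequent itemsets."""
    # Vertical layout: each item maps to the set of transaction ids containing it.
    tidsets = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    result = {}

    def extend(prefix, candidates):
        for idx, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            result[itemset] = len(tids)
            # Intersect with the remaining items; stop a branch once it is infrequent.
            next_candidates = []
            for other, other_tids in candidates[idx + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    next_candidates.append((other, shared))
            if next_candidates:
                extend(itemset, next_candidates)

    extend((), start)
    return result


# Assumed example database.
TRANSACTIONS = [
    list("facdgimp"),
    list("abcflmo"),
    list("bfhjo"),
    list("bcksp"),
    list("afcelpmn"),
]
found = eclat(TRANSACTIONS, 3)
print(len(found), found[("a", "c", "f", "m")])
```

The database is read once to build the transaction-ID sets; every later support value comes from a set intersection rather than another database scan.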
C. Top-Down Algorithm
The previous sections discussed the FP-Tree and Eclat algorithms. This section discusses the last algorithm, Top-Down, and how it mines frequent itemsets from a given database. The large itemset search process using the Top-Down algorithm is shown in Figure 5.
In the Top-Down algorithm, mining starts from the maximal itemset, that is, an itemset that is not part of any other itemset. First, the database is searched for the items that satisfy the minimum support. All the frequent items found are merged into one itemset, and the process of combining items one by one begins.
Figure 5. Large Itemset Search Process with Top-Down (flowchart boxes recovered: Data set; Group and Search Frequent Item)
1) Mining Process with Top-Down
Mining with the Top-Down algorithm is basically performed by forming combinations of the itemset being mined. A combination is formed by removing one item from the itemset, each item in turn. Each new itemset produced is mined again, repeatedly, until a frequent itemset is discovered or the search reaches the last level.
The principle of the Top-Down algorithm is that mining is performed recursively until the node being traced is found to be frequent. After a frequent itemset is found, the next step is to form its possible combinations; the new nodes are automatically frequent itemsets. This principle is very efficient, because when forming combinations the Top-Down algorithm always checks the other branches to see whether the combination has already been formed.
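The search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' program: it walks the levels with a loop instead of recursion so that larger itemsets are always tested before their subsets, it uses a Python set per level to play the role of the check on other branches, and it returns only the topmost frequent itemsets, whose subsets are all frequent automatically. The function name `top_down` and the example transactions are hypothetical.

```python
def top_down(transactions, min_support):
    """Return the maximal frequent itemsets, found from the largest itemset down."""
    sets = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(1 for t in sets if itemset <= t)

    all_items = {item for t in sets for item in t}
    # Merge all frequent single items into one starting itemset.
    start = frozenset(i for i in all_items if support(frozenset([i])) >= min_support)
    if not start:
        return []

    tops = []            # frequent itemsets not contained in a larger frequent one
    level = {start}
    while level:
        next_level = set()   # a set, so a combination made by two branches is kept once
        for itemset in level:
            if any(itemset <= top for top in tops):
                continue     # part of a known frequent itemset: frequent automatically
            if support(itemset) >= min_support:
                tops.append(itemset)
            elif len(itemset) > 1:
                for item in itemset:
                    next_level.add(itemset - {item})   # remove one item in turn
        level = next_level
    return tops


# Assumed example database.
TRANSACTIONS = [
    list("facdgimp"),
    list("abcflmo"),
    list("bfhjo"),
    list("bcksp"),
    list("afcelpmn"),
]
tops = top_down(TRANSACTIONS, 3)
print(sorted("".join(sorted(t)) for t in tops))
```

The sketch also shows the weakness discussed in this paper: when many items are frequent and the minimum support is small, the number of combinations generated before a frequent itemset is reached grows very quickly.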
III. TESTING AND RESULT
This section compares the time and memory used when mining with the Eclat, FP-Tree, and Top-Down algorithms. The tests use several databases, each mined by all three algorithms with several different minimum support values.
The tests were run on a computer with a 1.8 GHz Pentium 4 processor and 256 MB of memory, running Microsoft Windows 2000. The databases used are BMS-WebView-1, which has a total of 22,225 transactions, and the Movie database, which has a total of 5888 transactions.
This section is divided into two subsections: the first compares mining time and the second compares the memory used by the three algorithms.
A. Time Comparison
The time needed for mining varies between algorithms. On some databases Eclat may mine faster than FP-Tree, while on others Top-Down may be faster than the other algorithms. Mining time is also influenced by the minimum support used.
TABLE III. USING BMS-WEBVIEW-1 DATABASE (HH:MM:SS)
Min. Support | Eclat | FP-Tree | Top-Down
0.02 | 00:00:53 | 00:01:49 | 00:02:43
0.03 | 00:00:33 | 00:01:40 | 00:54:39
0.04 | 00:00:24 | 00:01:48 | 00:00:34
0.05 | 00:00:12 | 00:01:55 | 00:00:01
In Table 3, the Top-Down algorithm has the fastest time compared with Eclat and FP-Tree when the minimum support is 0.04 or 0.05. At those supports, Eclat has a better mining time than FP-Tree. However, the smaller the minimum support used on the BMS-WebView-1 database, the worse the mining time of Top-Down becomes relative to the other algorithms. In this case FP-Tree has the most stable mining time of the three algorithms.
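TABLE IV. USING MOVIE DATABASE (HH:MM:SS)
Min. Support | Eclat | FP-Tree | Top-Down
0.03 | 00:00:30 | 00:01:24 | 12:11:05
0.04 | 00:00:10 | 00:00:51 | 02:41:54
0.05 | 00:00:02 | 00:00:09 | 00:00:00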
In Table 4, the minimum supports used in the experiments on the Movie database are 0.3, 0.4, and 0.5. At a support of 0.5, the Top-Down algorithm has the fastest mining time of the three algorithms. However, as the minimum support is reduced, the time Top-Down needs for mining grows considerably. On the Movie database, Eclat has a short and stable mining time compared with FP-Tree and Top-Down.
B. Memory Comparison
The Eclat, FP-Tree, and Top-Down algorithms are similar in that each builds a tree that is used to mine the database. Building the tree uses the computer's available memory. This section compares the memory used during mining by the Eclat, FP-Tree, and Top-Down algorithms.
... has the most stable change in the amount of memory used among the three algorithms.
TABLE V. USING BMS-WEBVIEW-1 DATABASE
Min. Support | Eclat | FP-Tree | Top-Down
0.02 | 0.2M | 7.3M | 3.8M
0.03 | 0.07M | 5.5M | 2.6M
0.04 | 0.05M | 5.0M | 0.04M
0.05 | 0.02M | 4.6M | 0.002M
TABLE VI. USING MOVIE DATABASE
Min. Support | Eclat | FP-Tree | Top-Down
0.03 | 0.02M | 10.0M | 20.0M
0.04 | 0.007M | 6.3M | 4.7M
0.05 | 0.002M | 1.9M | 0.0008M
Table 6 shows that mining with the Eclat algorithm requires less memory than the other algorithms. Meanwhile, the Top-Down algorithm requires much more memory at a minimum support of 0.4 than at 0.5, and the same holds for the difference in memory required between supports of 0.3 and 0.4.
IV. CONCLUSION
From this study, the authors draw the following conclusions:
• Removing infrequent items makes FP-Tree construction faster. The elimination produces a new table containing only frequent items, which speeds up tree construction: building the tree no longer accesses the original database, only the new table.
• FP-Tree mines quickly even on larger datasets. Its weakness is that it uses a large amount of memory.
• The Top-Down algorithm does not need to perform pruning as FP-Tree does. Its weakness appears when there are many item types and many frequent items and the minimum support is small; it then wastes time because it keeps searching until it finds a frequent itemset.
• When forming branches, the Eclat algorithm always checks in advance whether the combination exists in the database, which avoids creating branches that are not needed. Its weakness is that it keeps calculating support even for itemsets that are already frequent, because it stops only when it finds an infrequent itemset.
REFERENCES
[1] R. Agrawal, R. Srikant, “Fast Algorithms for Mining Association Rules”, San Jose: IBM Almaden Research Centre.
[2] R. Agrawal, R. Srikant, “Mining sequential patterns: Generalization and performance improvements”, In Proceedings 1st International Conference Knowledge Discovery and Data Mining, 1993.
[3] C. Borgelt, “Efficient Implementation of Apriori and Eclat”, Unpublished, Magdeburg: Department of Knowledge Processing and Language Engineering, School of Computer Science, Otto-von-Guericke-University of Magdeburg.
[4] B. Goethals, “Memory Issues in Frequent Itemset Mining”, Unpublished, Helsinki: Department of Computer Science, University of Helsinki.
[5] J. Han, J. Pei, Y. Yin, “Mining Frequent Pattern Without Candidate Generation”, Unpublished, School of Computer Science, Simon Fraser University.
[6] J. Han, M. Kamber, “Data Mining: Concepts and Techniques”, 1999.
[7] S. Parthasarathy, M.J. Zaki, M. Ogihara, W. Li, “New Algorithms for Fast Discovery of Association Rules”, New York: Computer Science Department, University of Rochester, July 1997.
[8] M.J. Zaki, “Scalable Data Mining for Rules”, New York: Department of Computer Science, University of Rochester, 1998.