Extreme Learning Machine Weight Optimization Using Particle Swarm Optimization to Identify Sugar Cane Disease


Journal of Information Technology and Computer Science Volume 4, Number 2, September 2019, pp. 127-136

Journal Homepage: www.jitecs.ub.ac.id


Mukhammad Wildan Alauddin¹, Wayan Firdaus Mahmudy², Abdul Latief Abadi³

¹,²Faculty of Computer Science, Brawijaya University

³Faculty of Agriculture, Brawijaya University

{[email protected], [email protected], [email protected]}

Received 21 May 2019; accepted 09 September 2019

Abstract. Sugar cane disease is a major factor in reducing sugar cane yields. Because experts rarely visit the field to check the condition of the crop, sugar cane disease tends to be handled slowly. This problem can be addressed by embedding expert knowledge of sugar cane into an expert system. This study proposes classifying sugar cane disease with the Extreme Learning Machine (ELM). ELM is expected to deliver high classification accuracy in a short time because its training computation is simple and requires no iteration. However, ELM alone is not sufficient to classify the multilabel, multiclass disease data used in this study. Therefore, the weights of the hidden neurons in ELM are optimized using Particle Swarm Optimization (PSO). The experimental results show that classification using ELM alone reaches an accuracy of 71%. After the hidden neuron weights were optimized with PSO, the accuracy rose to 79.92%, an increase of 8.92 percentage points.

1. Introduction

Sugar cane disease is the most detrimental factor for sugar cane farmers because it reduces crop yield [1]. Without a quick response from sugar cane experts on the plantations, the spread of disease is difficult to control.

Efforts to cure diseased plants are still not optimal because no attempt is made to detect disease as early as possible. This has been the situation for the last five years in the City of Pasuruan, East Java Province.

The Pasuruan City Government has made several efforts to overcome the decline in sugar cane yields, including public outreach on handling sugar cane diseases and the distribution of fertilizer subsidies. These efforts have not significantly increased crop yields. The situation is made worse by how rarely experts survey sugar cane fields affected by disease. This can be overcome by embedding expert knowledge of sugar cane into a computer-based application system.

Several earlier studies have used expert systems to detect sugar cane disease. However, these systems cannot detect diseases that were never encountered where they were developed but are found in other areas. One example is CaneDES, the expert system studied by Hasan (2014): although it has advantages in visualizing symptoms and in the variety of pests and diseases it detects, it cannot currently be used to detect sugar cane diseases in Indonesia [2]. Defining the diseases to be detected is therefore very important before designing the knowledge base of an expert system.

A good expert system produces a high level of accuracy in detecting plant diseases. A single sugar cane plant is not always attacked by only one disease; it can be attacked by two or more diseases at once. This case is called multilabel data [3]. In addition, more than two types of disease can be found in the field. This case is referred to as multiclass data [4]. The more multilabel and multiclass cases there are, the more difficult it becomes to identify and classify the disease, and computation time grows with the complexity of the problem. Based on these considerations, this study proposes a classifier built on the Extreme Learning Machine (ELM).

One disadvantage of ELM is that the quality of its classification results is strongly influenced by the weight values of its hidden neurons [5]. The initial weight values used for classification are not necessarily the best ones, which can lead to a high number of misclassifications. This study addresses the weight problem with a metaheuristic optimization method, Particle Swarm Optimization (PSO). The ELM-PSO combination is expected to produce classifications with a high degree of accuracy without requiring a long processing time.

2. Observation

The focus of this research is based on conditions in the field, specifically in Pasuruan City, East Java Province. Observations conducted from November 2018 to April 2019 found 10 types of disease in Pasuruan City: Pokkahboeng, Fire Injuries, Mosaic and Striped Mosaic, Ratoon Stunting, Blendok, Chlorosis Line, Yellow Stain, Red Stain, Orange and Rust Karat. In some cases, a single sugar cane stem showed several symptoms, indicating that the plant had been attacked by more than one disease.

The observations were carried out with researchers from the Indonesian Sugar Plantation Research Center (P3GI) in Pasuruan City. Because the observation period was limited, the primary data cover only diseases that attack sugar cane in the rainy season. Data on diseases that attack during the dry season were taken from data collected by P3GI in 2014. During the observations, data were acquired using a list of questions addressed to sugar cane farmers, prepared on the basis of discussions with experts. In addition to the list of questions, there was a data collection table for the symptoms that appeared. This list of symptoms was compiled from the symptoms described in a reference book on sugar cane plant diseases in Indonesia [6].

Based on the observational data, 37 kinds of symptoms were tabulated as data to be processed. The symptoms are: rolled leaves, leaves that look cut off, odorless metal, faster growth, slow growth, leaves do not develop, leaves stop growing, dead leaves blacken, mosaic pattern on leaves, mosaic pattern only on the upper surface of the leaves, mosaic pattern appears on young leaves, insects around plants, inner stems are red, leaf tissue dies, stems die, chlorosis lines on leaves, chlorosis lines parallel to leaf veins, chlorosis lines dry faster than normal leaves, chlorosis lines are 1 cm wide, yellow stains on the leaves, yellow stains seen from the top and bottom of the leaves, red stains on the leaves, stains of irregular shape, stain diameter around 10 mm, round and oval stains, stains measuring around 5 mm, stains can be observed from above the leaves, stains can be observed from under the leaves, dry land, wetlands, occurs in the rainy season, occurs in the dry season, spreads at night, wilted plants, and dwarf plants.

The symptom data are then weighted according to the expert's degree of belief that a symptom indicates a given disease. The higher the expert's belief value, the stronger the relationship between a symptom and a particular disease.

3. Method

3.1 Dempster-Shafer

The Dempster-Shafer (DS) method is also known as the theory of belief functions [7]. Dempster-Shafer theory was introduced by Arthur Pentland Dempster in 1968 and extended by Glenn Shafer in 1976, and it has since been adapted for use in expert systems [8]. In this study, DS is used to measure the expert's level of confidence in a sugar cane disease based on the symptoms that appear. The resulting DS weights are the input values processed by the proposed method.

Belief is a measure of the strength of evidence supporting a set of propositions [9]. A value of 0 (zero) indicates that there is no evidence, and a value of 1 indicates certainty. The belief function is formulated as follows:

$Bel(X) = \sum_{Y \subseteq X} m(Y)$

Plausibility is formulated as follows:

$Pls(X) = 1 - Bel(X') = 1 - \sum_{Y \subseteq X'} m(Y)$

where:

Bel(X) = belief in X
Pls(X) = plausibility of X
X' = complement of X
m(X) = mass function of X
m(Y) = mass function of Y

Plausibility also ranges from 0 to 1. If X' is certain, then Bel(X') = 1, so from the formula above Pls(X) = 0. Dempster-Shafer theory also uses a frame of discernment (FOD), denoted by Θ. The FOD is the universe of discourse of a set of hypotheses, so it is often called the environment.
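The two formulas can be checked on a small example. The following is a minimal sketch, not the weighting procedure used in this study: the disease names in the frame and the mass values are made up for illustration, and the function names `belief` and `plausibility` are assumptions.

```python
def belief(mass, x):
    """Bel(X): sum of m(Y) over every focal set Y that is a subset of X."""
    return sum(m for y, m in mass.items() if y <= x)


def plausibility(mass, x, frame):
    """Pls(X) = 1 - Bel(X'), where X' is the complement of X in the frame."""
    return 1.0 - belief(mass, frame - x)


# Hypothetical frame of discernment and mass function (values sum to 1).
frame = frozenset({"mosaic", "rust", "chlorosis"})
mass = {
    frozenset({"mosaic"}): 0.6,
    frozenset({"mosaic", "rust"}): 0.1,
    frame: 0.3,  # mass left on the whole frame expresses ignorance
}

bel_mosaic = belief(mass, frozenset({"mosaic"}))             # only {mosaic} qualifies
pls_rust = plausibility(mass, frozenset({"rust"}), frame)    # 1 - Bel({mosaic, chlorosis})
print(bel_mosaic, pls_rust)
```

Here Bel({mosaic}) is 0.6, while Pls({rust}) is 1 - 0.6 = 0.4, because only the mass on {mosaic} lies entirely inside the complement of {rust}.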

3.2 Extreme Learning Machine

Extreme Learning Machine (ELM) is a type of artificial neural network first proposed by Huang et al. in 2004 [10]. ELM is a single-hidden-layer feedforward neural network (SLFN): its architecture has only one hidden layer and uses forward propagation without any iterative training loop [11]. Because the required mathematical calculations are simple, this architecture processes data for forecasting and classification very quickly [12]. The ELM architecture is illustrated in Figure 1.

Figure 1. Extreme Learning Machine Architecture [13]

Extreme Learning Machine is widely used for identification problems because it computes very quickly and still gives good results. Figure 2 shows the general flow of the Extreme Learning Machine method, which proceeds as follows:

1. Inputs to this process are training data, test data, and results of weight value optimization from Particle Swarm Optimization.

2. Conduct ELM training on training data.

3. Obtain a matrix of output weights β resulting from ELM training to be processed into testing.

4. Conduct ELM testing based on the weight matrix that has been obtained from the ELM training for test data.

5. The output of this method is the accuracy achieved by the system.
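The flow above can be sketched with NumPy. This is an illustration under stated assumptions rather than the implementation used in this study: the sigmoid activation, the bias vector `b`, the 0.5 decision threshold, and the randomly generated data are all assumptions, and the hidden weights `W` are drawn at random here, whereas the proposed method takes them from PSO.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def elm_train(X, T, W, b):
    """Compute the output weights beta in a single step, without iteration."""
    H = sigmoid(X @ W + b)            # hidden-layer output matrix
    return np.linalg.pinv(H) @ T      # Moore-Penrose pseudo-inverse solution


def elm_predict(X, W, b, beta):
    """Forward pass: hidden layer followed by the linear output layer."""
    return sigmoid(X @ W + b) @ beta


rng = np.random.default_rng(0)
n_inputs, n_hidden, n_labels = 37, 15, 10   # symptoms, hidden neurons, diseases

# Synthetic stand-in data (assumption): 13 training and 137 test samples.
X_train = rng.random((13, n_inputs))
T_train = rng.integers(0, 2, size=(13, n_labels)).astype(float)
X_test = rng.random((137, n_inputs))
T_test = rng.integers(0, 2, size=(137, n_labels)).astype(float)

W = rng.uniform(-1.0, 1.0, (n_inputs, n_hidden))   # hidden weights (random here)
b = rng.uniform(-1.0, 1.0, n_hidden)               # hidden biases (assumption)

beta = elm_train(X_train, T_train, W, b)           # step 2 and 3: training
Y = elm_predict(X_test, W, b, beta)                # step 4: testing
pred = (Y >= 0.5).astype(float)                    # one yes/no decision per label
accuracy = float((pred == T_test).mean())          # step 5: label-wise accuracy
print(beta.shape, accuracy)
```

Because the targets here are random, the printed accuracy carries no meaning; the sketch only shows that training reduces to one pseudo-inverse computation.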


Figure 2. Flow Diagram of the Extreme Learning Machine

3.3 Particle Swarm Optimization

Particle Swarm Optimization (PSO) is a metaheuristic algorithm from evolutionary computation, inspired by the social behavior of swarms. It was first introduced by Kennedy and Eberhart in 1995 [14].

PSO models the search for the best solution in a search space: the position of each particle in the solution space holds the optimization variables of a candidate solution [15]. Each position is associated with an objective value, referred to as its fitness value.

PSO differs from many other optimization algorithms in that it does not use gradient information when searching for solutions, so the objective function is not required to be continuous [16]. According to Sedighizadeh and Masehian [17], several terms are commonly used in the PSO algorithm, including:

1. Swarm: the population used by the algorithm.

2. Particle: a member of the swarm; each particle represents a candidate solution to the problem to be solved.

3. Pbest (Personal Best): the best position a particle has ever achieved.

4. Gbest (Global Best): the best position achieved by any particle in the swarm.

5. Velocity: a vector that drives the optimization process by determining the direction in which a particle moves to improve on its current position.

6. Inertia weight: controls how much a particle's previous velocity influences its current velocity.

7. Acceleration coefficient: affects the maximum distance a particle can move in one iteration.

The PSO algorithm is implemented in the following steps:

1. Initialize a population of particles with random positions and velocities in the search space.

2. Evaluate the fitness function for each particle.

3. Compare each particle's fitness with its Pbest. If the new value is better than Pbest, set it as the new Pbest, and update Gbest if it is better than the best value found so far.

4. Update the velocity and position of each particle.

5. Return to step 2 until the stopping criterion is met, usually a sufficiently good fitness value or a maximum number of iterations [14].
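The steps above can be sketched as follows. This is a minimal illustration and not the implementation used in this study: the function name `pso`, the parameter values (w = 0.7, c1 = c2 = 1.5), the initial ranges, and the toy fitness function are assumptions. It uses the standard velocity update with an inertia weight, moves each particle by adding its velocity to its position, and maximizes the fitness.

```python
import numpy as np


def pso(fitness, dim, n_particles=10, n_iter=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize `fitness` over a `dim`-dimensional search space."""
    rng = np.random.default_rng(seed)

    # Step 1: random initial positions and velocities.
    x = rng.uniform(-1.0, 1.0, (n_particles, dim))
    v = rng.uniform(-0.1, 0.1, (n_particles, dim))

    # Step 2: evaluate the initial fitness of every particle.
    pbest = x.copy()
    pbest_fit = np.array([fitness(p) for p in x])
    g = int(np.argmax(pbest_fit))
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    for _ in range(n_iter):
        # Step 4: velocity update, then position update.
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v

        # Steps 2 and 3: re-evaluate, then update Pbest and Gbest.
        fit = np.array([fitness(p) for p in x])
        improved = fit > pbest_fit
        pbest[improved] = x[improved]
        pbest_fit[improved] = fit[improved]
        g = int(np.argmax(pbest_fit))
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    # Step 5: stop after the maximum number of iterations.
    return gbest, gbest_fit


# Toy fitness (assumption): highest, with value 0, at the point (0.5, 0.5, 0.5).
best, best_fit = pso(lambda p: -np.sum((p - 0.5) ** 2), dim=3)
print(best, best_fit)
```

In the proposed method the fitness of a particle would be the classification quality of the ELM built from that particle's weights, rather than the toy function used here.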

Each particle maintains its position and the fitness evaluated at that position. In addition, each particle remembers the best fitness value it has achieved during the run, and the candidate solution that achieved it is called the particle's best position (Pbest). The PSO algorithm also maintains the best fitness value found by the whole swarm, whose position is called Gbest. The velocity update in PSO is given by the following equation:

$v_i^{k+1} = W v_i^k + c_1\,rand_1 \times (P_{best} - x_i^k) + c_2\,rand_2 \times (G_{best} - x_i^k)$

with:

$v_i^k$ : velocity of agent i at iteration k
$W$ : inertia weight
$c_1, c_2$ : acceleration coefficients (weighting factors)
$rand_1, rand_2$ : random values between 0 and 1
$x_i^k$ : current position of agent i at iteration k
$P_{best}$ : best position found by agent i
$G_{best}$ : the best Pbest value in the swarm

3.4 Solution Representation

Optimizing the neuron weights of the Extreme Learning Machine with Particle Swarm Optimization aims to provide the best weight values for the ELM process [5]. In Particle Swarm Optimization, each particle is a representation of a candidate solution to the problem. The solution representation used here is shown in Table 1.

The length of a solution representation equals the number of neurons used, and each value in it represents the weight of one neuron. Table 1 gives an example of a solution representation for 9 neurons: the first to ninth values represent the weights of the first to ninth neurons.

Table 1. Example of a Solution Representation

N1    N2    N3    N4    N5    N6    N7    N8    N9

0.6   0.2   0     0.1   -0.5  -0.2  0.9   0.1   -0.4

Based on this solution representation, the optimization is carried out using Particle Swarm Optimization. The process of optimizing the hidden neuron weights of ELM is illustrated in the flow diagram in Figure 3.


Figure 3. Flowchart of Optimization and Classification Process

In the classification stage, the performance of the proposed algorithm, ELM-PSO, is measured with three metrics: accuracy, precision, and recall.

1. Accuracy

Accuracy measures how close the predicted results are to their actual values. It is calculated as follows:

$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$

2. Precision

Precision measures the proportion of the classifier's positive predictions that are actually correct. It is calculated as follows:

$Precision = \frac{TP}{TP + FP}$

3. Recall

Recall, or sensitivity, measures the proportion of actual positives that the system predicts correctly. It is calculated as follows:

$Recall = \frac{TP}{TP + FN}$

with:

TP = True Positive
TN = True Negative
FP = False Positive
FN = False Negative
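The three formulas translate directly into code. The sketch below is illustrative only: the function name `classification_metrics` and the confusion counts are made up, and returning 0.0 when a denominator is zero is an assumption about how to handle that edge case.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # guard: no positive predictions
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # guard: no actual positives
    return accuracy, precision, recall


# Hypothetical counts for 100 predictions.
accuracy, precision, recall = classification_metrics(tp=40, tn=45, fp=10, fn=5)
print(accuracy, precision, recall)
```

With these counts, accuracy is 85/100 = 0.85, precision is 40/50 = 0.8, and recall is 40/45, or about 0.889.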

4. Experiment Result and Discussion

The initial test classifies the training data using only the Extreme Learning Machine (ELM) method, varying the number of hidden neurons from 1 to 20. The results of testing for the best number of hidden neurons are shown in Figure 4.

Figure 4. Test results for the number of hidden neurons

Based on the test of the number of hidden neurons, the best ELM architecture for classifying the data in this problem uses 15 hidden neurons. With 15 hidden neurons, classification accuracy reached 71%, the highest of the 20 tests with different numbers of hidden neurons. This shows that steadily adding hidden neurons does not always increase accuracy. In addition, this model is not necessarily the best model for other classification problems with different data; each problem, with its own data characteristics, has its own best ELM model.
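Selecting the architecture amounts to taking the hidden-neuron count with the highest accuracy. A minimal sketch, assuming the twenty accuracy values are the data labels of Figure 4 read in order for 1 to 20 hidden neurons (the variable names are illustrative):

```python
# Accuracy (%) for 1..20 hidden neurons, assumed to be in plotting order.
accuracy_by_neurons = [15, 23, 27, 29, 25, 31, 39, 35, 44, 56,
                       55, 59, 63, 69, 71, 55, 62, 63, 64, 66]

# Index of the highest accuracy; neuron counts start at 1, so add 1.
best_index = max(range(len(accuracy_by_neurons)), key=accuracy_by_neurons.__getitem__)
best_neurons = best_index + 1
best_accuracy = accuracy_by_neurons[best_index]
print(best_neurons, best_accuracy)  # 15 71
```

The drop from 71% at 15 neurons to 55% at 16 neurons illustrates that accuracy does not grow monotonically with the number of hidden neurons.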

The next test uses ELM whose weights have been optimized using PSO. This test was carried out 10 times, starting with 13 training data and adding 1 in each subsequent test, to determine the effect of additional training data on accuracy, precision, and recall. The accuracy over the 10 tests is then averaged to assess the quality of the classification results of the proposed method. Based on earlier tests of the number of particles and the number of iterations, ELM-PSO testing was carried out using PSO with 10 particles and 100,000 iterations.

[Figure 4 data labels. Accuracy (%) for 1 to 20 hidden neurons, in order: 15, 23, 27, 29, 25, 31, 39, 35, 44, 56, 55, 59, 63, 69, 71, 55, 62, 63, 64, 66. Horizontal axis: number of hidden neurons; vertical axis: accuracy, 0 to 80.]

The results of the ELM-PSO test can be seen in Table 2.

Table 2. Results of ELM-PSO testing

Test      Accuracy   Precision   Recall    Training data   Test data
1         75.24%     68.72%      70.03%    13              137
2         76.61%     69.07%      76.21%    14              137
3         77.29%     68.21%      70.33%    15              137
4         79.29%     64.07%      86.21%    16              137
5         80.14%     68.21%      86.21%    17              137
6         81.12%     71.07%      75.21%    18              137
7         80.31%     75.07%      75.33%    19              137
8         81.84%     71.21%      82.54%    20              137
9         81.32%     73.21%      75.55%    21              137
10        80.29%     80.07%      72.09%    22              137
Average   79.92%     78.55%      79.04%    17.5            137

Based on Table 2, classification accuracy increased with ELM-PSO, which achieved an average accuracy of 79.92%, compared with 71% for classification using ELM alone. This indicates that PSO found a better set of weight values for the ELM hidden neurons. In addition, the test results in Table 2 show that adding one training datum in each test did not significantly affect classification accuracy. This is likely because the split between training and test data, roughly 10% to 90%, leaves too little training data for one more sample to have much influence. Further tests should add substantially more training data while keeping the number of test data fixed, to determine the effect of the amount of training data on the performance of the proposed method on the test data.

5. Conclusion

Based on the research that has been done, it can be concluded that an artificial neural network using the Extreme Learning Machine (ELM) model can classify sugar cane disease from multiclass and multilabel symptom data with an accuracy of 71% using 15 hidden neurons. After the hidden neuron weights of ELM were optimized using Particle Swarm Optimization (PSO), the average accuracy over 10 tests rose to 79.92%, an increase of 8.92 percentage points. This shows that PSO found hidden neuron weight values better than those of the original ELM. In future experiments, the amount of training data will be increased and its effect on ELM-PSO performance will be analyzed.

References

[1] W. Li et al., “Identification of Resistance to Sugarcane streak mosaic virus (SCSMV) and Sorghum Mosaic Virus (SrMV) in New Elite Sugarcane Varieties/Clones in China,” Crop Prot., vol. 110, no. March, pp. 77–82, 2018.

[2] S. S. Hasan et al., “CaneDES: A Web-Based Expert System for Disorder Diagnosis in Sugarcane,” Sugar Tech, no. November, 2015.

[3] Y. Cheng, D. Zhao, Y. Wang, and G. Pei, “Multi-label Learning with Kernel Extreme Learning Machine AutoEncoder,” Knowledge-Based Syst., 2019.

[4] D. Silva-Palacios, C. Ferri, and M. J. Ramirez-Quintana, “Probabilistic Class Hierarchies for Multiclass Classification,” J. Comput. Sci., vol. 26, pp. 254–263, 2018.

[5] A. N. Alfiyatin, A. M. Rizki, W. F. Mahmudy, and C. F. Ananda, “Extreme Learning Machine and Particle Swarm Optimization for Inflation Forecasting,” vol. 10, no. 4, pp. 473–478, 2019.

[6] H. Semangun, Penyakit-penyakit Tanaman Perkebunan di Indonesia, Second Edi. Yogyakarta: Gadjah Mada University Press, 1989.

[7] T. H. Saragih, W. F. Mahmudy, and Y. P. Anggodo, “Optimization of Dempster-Shafer’s Believe Value Using Genetic Algorithm for Identification of Plant Diseases Jatropha Curcas,” Indones. J. Electr. Eng. Comput. Sci., vol. 1, no. 12, 2018.

[8] T. Denœux, “Logistic Regression, Neural Networks and Dempster-Shafer Theory: a New Perspective,” pp. 54–67, 2019.

[9] S. Vijayabalaji and A. Ramesh, “Belief Interval-valued Soft Set,” Expert Syst. Appl., vol. 119, pp. 262–271, 2019.

[10] G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew, “Extreme Learning Machine: Theory and Applications,” Neurocomputing, vol. 70, no. 1–3, pp. 489–501, 2006.

[11] G. Huang, G.-B. Huang, S. Song, and K. You, “Trends in extreme learning machines: A review,” Neural Networks, 2015.

[12] M. Bucurica, R. Dogaru, and I. Dogaru, “A comparison of Extreme Learning Machine and Support Vector Machine classifiers,” 11th IEEE Int. Conf. Intell. Comput. Commun. Process. ICCP 2015, pp. 471–474, 2015.

[13] A. Bueno-Crespo, P. J. Garcia-Laencina, and J.-L. Sancho-Gomez, “Neural architecture design based on Extreme Learning Machine,” 2013.

[14] J. Kennedy, R. C. Eberhart, and Y. Shi, “Chapter Seven - The Particle Swarm,” in Swarm intelligence, Elsevier Inc., 2001, pp. 287–325.

[15] G. A. Alfarisy, W. F. Mahmudy, and M. H. Natsir, “Optimizing Laying Hen Diet Using Particle Swarm Optimization with Two Swarms,” J. Telecommun. Electron. Comput. Eng., vol. 10, no. 1–6, pp. 113–119, 2018.

[16] N. Nouaouria and M. Boukadoum, “Particle Swarm Classification for High Dimensional Data Sets,” in International Conference on Tools with Artificial Intelligence (ICTAI), 2010.

[17] D. Sedighizadeh and E. Masehian, “Particle Swarm Optimization Methods, Taxonomy and Applications,” vol. 1, no. 5, pp. 486–502, 2009.