Journal of Information Technology and Computer Science Volume 4, Number 3, December 2019, pp. 274-283
Journal Homepage: www.jitecs.ub.ac.id
Wood Species Identification using Convolutional Neural Network (CNN) Architectures on Macroscopic Images
Anindita Safna Oktaria1, Esa Prakasa2, Efri Suhartono3, Bambang Sugiarto4, Dicky R Prajitno5, Riyo Wardoyo6
1,2,3,4,5,6 Computer Vision Research Group, Research Center of Informatics, Indonesian Institute of Sciences, Bandung, Indonesia
1,3 Telecommunication Engineering Department, Faculty of Electrical Engineering, Universitas Telkom, Bandung, Indonesia
{[email protected], [email protected], [email protected]}
Received 13 November 2019; accepted 02 December 2019
Abstract. Indonesia is a country that is very rich in tree species growing in its forests. Around 4,000 wood species grow in Indonesia, with different names and characteristics. These differences determine the quality and proper use of each type of wood. The standard identification procedure is currently still carried out through visual observation by wood anatomists. The identification process therefore depends heavily on the availability of wood anatomists, and their limited number affects both the quality of the result and the time needed for identification. This study presents an identification system that classifies wood by species name from macroscopic wood images, using the Convolutional Neural Network (CNN) method as the classification algorithm. The supporting architectures used are AlexNet, ResNet, and GoogLeNet. These architectures are then compared to a simple CNN architecture built for this study, named Kayu30Net. The Kayu30Net architecture reaches a precision of 84.6%, recall of 83.9%, F1 score of 83.1%, and accuracy of 71.6%. In the wood species classification system using CNN, AlexNet is found to be the best architecture, with a precision of 98.4%, recall of 98.4%, F1 score of 98.3%, and accuracy of 96.7%.
1 Introduction
Wood is one of the most valuable forest resources and is needed by humans. According to the study of Kartasujana & Martawijaya (1979), around 4,000 species of wood grow in Indonesia, with different names and characteristics based on their anatomical structure [1]. Hundreds of them are traded. These differing characteristics are used to determine the exact quality and usefulness of each wood species.
Identification of species, especially by texture, is the initial stage in wood processing.
The Wood Anatomy Laboratory of the Forest Product Research Center carries out macroscopic and microscopic identification of wood from the Xylarium Bogoriense 1915, which has collected samples from all regions of Indonesia since 1914 [1]. The Xylarium holds the largest wood collection in the world, with around 193,000 wood samples [2]. Accurate identification requires data on both macroscopic and microscopic features, and the identification itself demands a high level of accuracy.
Until now, the identification process can only be carried out by trained and experienced wood anatomists. The outcome and the duration of identification depend strongly on the individual anatomist, so the result tends to be ineffective in terms of accuracy and time.
Beyond these limitations, identification errors can also cause financial losses.
To make wood identification reliable and accurate, a wood identification system based on digital image processing is needed. Many previous studies have addressed wood identification with various methods: the study in [3], using the LBP method, found that four species of wood achieved good accuracy rates of 87.5%-100%, while two other species did not reach the expected accuracy. Another study, using the SVM method, reported an accuracy of 95.83% [4].
Convolutional Neural Network (CNN) is a deep learning method that can be used for object image classification. Deep learning is a branch of machine learning driven by strong computational resources, large datasets, and techniques for training deeper networks [5]. CNN can be applied to various image resolutions, computes features in fine detail so that the error rate is small, and handles data of high complexity with many trainable parameters; in particular, it can identify both known and unknown patterns in data [5]. CNN models typically reach accuracy levels above 90% on average; the study in [6] classified characters with 95.4% accuracy.
2 Methodology
2.1 Convolutional Neural Network (CNN) Algorithm
CNN is commonly used for image classification: grouping images according to similarity and performing object recognition. CNN consists of several layers and is designed for effective recognition of complex images. The main advantage of CNN is that it automatically detects important features by learning, extracting, and translating features from datasets such as images, videos, or text [7].
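The feature-extraction idea described above can be illustrated with a minimal convolution-activation-pooling pipeline. The paper's experiments were run in MATLAB; the NumPy sketch below, with an assumed Sobel-like 3×3 filter and a toy 8×8 input, is purely illustrative:

```python
import numpy as np

def conv2d(x, k):
    """Valid 2-D convolution (cross-correlation, as CNN frameworks use), stride 1."""
    kh, kw = k.shape
    H, W = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def relu(x):
    return np.maximum(x, 0)

def max_pool(x, s=2):
    """Non-overlapping s x s max pooling."""
    H, W = x.shape
    x = x[:H - H % s, :W - W % s]
    return x.reshape(H // s, s, W // s, s).max(axis=(1, 3))

img = np.random.rand(8, 8)                                    # toy grayscale patch
edge = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], float)  # assumed edge filter
feat = max_pool(relu(conv2d(img, edge)))                      # 8x8 -> 6x6 -> 3x3
```

In a real CNN the filter weights are not fixed like this edge detector; they are learned during training, which is what "automatically detects important features" means in practice.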
In CNN-based identification and classification, many layers are used to obtain accurate results. The stacked layers form an architecture that is then used to recognize an object. This study mainly explores three CNN architectures (AlexNet [8], GoogLeNet [9], and ResNet [10]) with different training parameter values, and also builds a simple CNN, called Kayu30Net, to compare identification results.
a. AlexNet: The AlexNet architecture, published in [8] and created by Alex Krizhevsky, showed very significant results on the test data, with a test error of 15.31% [8]. These results were considered remarkable because the images in the dataset used are very complex and numerous. AlexNet has five convolution layers, three pooling layers, and three fully-connected layers, with about 60 million free parameters.
Fig. 1. AlexNet Architecture [8]
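As a rough sanity check on the 60-million-parameter figure, layer parameter counts follow simple formulas: a convolution layer has (kh·kw·Cin + 1)·Cout parameters and a fully-connected layer (Nin + 1)·Nout. The sketch below applies them to AlexNet's well-known first convolution (96 filters of size 11×11 over 3 channels) and one of its 4096-unit fully-connected layers, showing that the bulk of the parameters sits in the fully-connected part:

```python
def conv_params(kh, kw, c_in, c_out):
    # weights per filter (kh * kw * c_in) plus one bias, times number of filters
    return (kh * kw * c_in + 1) * c_out

def fc_params(n_in, n_out):
    # one weight per input per unit, plus one bias per unit
    return (n_in + 1) * n_out

first_conv = conv_params(11, 11, 3, 96)   # AlexNet's first convolution layer
big_fc = fc_params(4096, 4096)            # one 4096 -> 4096 fully-connected layer
```

A single 4096-to-4096 fully-connected layer alone contributes roughly 16.8 million parameters, several hundred times more than the first convolution layer.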
b. GoogLeNet: GoogLeNet, introduced in [9], won ILSVRC 2014 with an error rate of around 6.67% [9]. GoogLeNet is also called Inception-v1, since newer versions v2, v3, and v4 exist. The architecture is significantly more complex and deeper than earlier CNN architectures. GoogLeNet introduces a new module called "inception"
that combines filters of various sizes [11], as shown in Figure 3. The module consists of 3x3, 5x5, and 1x1 filters applied in parallel; their outputs are concatenated to form the input of the next stage.
Fig. 2. GoogLeNet Architecture [9]
Fig. 3. Illustration of the Inception Layer of GoogLeNet
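The parallel-filter idea of the inception module can be sketched as follows: each branch uses "same" padding so that all outputs keep the input's spatial size and can be concatenated along the channel axis. This NumPy sketch uses assumed toy branch widths, not GoogLeNet's actual configuration:

```python
import numpy as np

def same_conv(x, k):
    """'Same'-padded 2-D convolution: x is (H, W, Cin), k is (kh, kw, Cin, Cout)."""
    kh, kw, _, cout = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw), (0, 0)))
    H, W = x.shape[:2]
    out = np.zeros((H, W, cout))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.tensordot(xp[i:i + kh, j:j + kw], k,
                                     axes=([0, 1, 2], [0, 1, 2]))
    return out

def inception(x, f1, f3, f5):
    """Run 1x1, 3x3, and 5x5 branches in parallel, concatenate channel-wise."""
    return np.concatenate([same_conv(x, f1),
                           same_conv(x, f3),
                           same_conv(x, f5)], axis=-1)

rng = np.random.default_rng(0)
x = rng.random((8, 8, 3))                 # toy input feature map
out = inception(x,
                rng.random((1, 1, 3, 4)), # assumed branch widths: 4 + 4 + 2
                rng.random((3, 3, 3, 4)),
                rng.random((5, 5, 3, 2)))
```

Because every branch preserves the 8×8 spatial size, the result is a single 8×8 map with 4 + 4 + 2 = 10 channels, which the next stage consumes as its input.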
c. ResNet: The ResNet (Residual Network) model, proposed in [10], uses residual connections in the network. This architecture won the ILSVRC 2015 competition with an error rate of 3.57%. ResNet is considered "very deep", and depth is generally correlated with better results. However, [10] observed that beyond a certain point, increasing the number of layers began to decrease accuracy; since this degradation also appeared on the training data, it can be concluded that the problem is not overfitting of the network.
Fig. 4. ResNet Architecture
If a CNN has reached its accuracy limit at some number of layers, all subsequent layers should ideally learn identity transformations; in practice, however, deep networks struggle to learn these [10]. To help the network, [10] proposes introducing a shortcut connection, see Figure 5. The shortcut connection passes features directly from a previous layer, and the analyses in [12][13] show that ResNet can be considered an ensemble of smaller residual networks whose effective depth increases during training.
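The effect of the shortcut connection can be illustrated with a toy residual block (dense layers are used here for brevity; the dimensions are assumptions). When the residual branch outputs zeros, the shortcut carries the input through unchanged, which is why extra residual layers need not hurt accuracy:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def residual_block(x, W1, W2):
    """y = relu(F(x) + x), with residual branch F(x) = relu(x @ W1) @ W2."""
    return relu(relu(x @ W1) @ W2 + x)

x = np.array([1.0, 2.0, 3.0])
W_zero = np.zeros((3, 3))
# Zeroed residual branch: the shortcut alone reproduces the (positive) input.
y = residual_block(x, W_zero, W_zero)
```

The branch only needs to learn a correction on top of the identity, rather than the whole identity mapping itself.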
Fig. 5. Shortcut Connection [14]
d. Kayu30Net: This CNN architecture was designed in this study and named Kayu30Net. As shown in Figure 6, it has two convolution layers, and the pooling stages use max pooling. There are three fully-connected layers. A ReLU activation function is placed after each convolution to accelerate convergence. At the end of the network, the SoftMax activation function is used for multiclass classification.
Fig. 6. Kayu30Net Architecture for 30 Classes
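The SoftMax layer at the end of Kayu30Net turns the 30 raw output scores into class probabilities. A numerically stable sketch in plain Python (the toy logits below are assumptions; the paper does not specify the network's internal layer sizes):

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

scores = [0.1 * i for i in range(30)]   # toy logits for 30 wood species
probs = softmax(scores)
predicted = max(range(30), key=lambda i: probs[i])  # index of the top class
```

The predicted species is simply the class with the highest probability; subtracting the maximum logit before exponentiating avoids overflow without changing the result.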
3 Result and Discussion
3.1 Dataset
In this research, the methods are implemented using MATLAB R2018b. The dataset contains the data to be processed in the next step: images of the wood species used for the classification process. CNN performs well when trained with a large amount of data. The dataset consists of surface images of 30 wood species. The wood samples were obtained from the Xylarium Bogoriense 1915 collection, administered by the Research and Development Center for Forest Products, Ministry of Environment and Forestry, Republic of Indonesia. Each wood species is divided into two groups according to the development process, i.e. training and test data. Images are acquired by combining a loupe of 60x magnification with the 3.5x digital magnification provided by the capture application, giving a total magnification of 210x. In the preprocessing step, the dimensions of the input data are resized to fit the required input size of each CNN architecture.
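The split into training and test data and the resizing preprocessing can be sketched as below. The paper's pipeline was implemented in MATLAB; this pure-Python sketch uses an assumed 20% test fraction and the 227×227 input size commonly associated with AlexNet, neither of which is stated in the paper:

```python
import random

def train_test_split(items, test_frac=0.2, seed=0):
    """Shuffle reproducibly, then split into training and test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = int(len(items) * test_frac)
    return items[n_test:], items[:n_test]

def resize_nn(img, out_h, out_w):
    """Nearest-neighbour resize of a list-of-lists image."""
    in_h, in_w = len(img), len(img[0])
    return [[img[i * in_h // out_h][j * in_w // out_w]
             for j in range(out_w)] for i in range(out_h)]

train, test = train_test_split(range(100), test_frac=0.2)
patch = [[r * 10 + c for c in range(10)] for r in range(10)]   # toy 10x10 image
resized = resize_nn(patch, 227, 227)   # e.g. AlexNet-style input dimensions
```

Fixing the shuffle seed keeps the split reproducible across runs, which matters when comparing architectures on the same training and test partitions.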
Fig. 7. Macroscopic Wood Image Samples (specimen codes: 12672, 17498, 23673, N4127)
3.2 Testing Wood Classification System
In this section, an evaluation of the various CNN architecture models is presented. The desired level of accuracy is close to 98%-99%, with the classification results matching the correct wood species.
3.3 System Performance
Performance measurements are presented to compare the classification performance of the different architectures. The metrics used are accuracy, precision, recall, and F1 score, computed from the confusion matrix. The confusion matrix summarizes the decisions obtained during testing.
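The four measures can be computed directly from the confusion matrix; in the sketch below rows are taken as true classes and columns as predicted classes (an assumed convention, since the paper does not state one). A class with no correct predictions yields a zero numerator, which is how 0% minimum values such as those at Epoch 4 arise:

```python
import numpy as np

def metrics_from_confusion(cm):
    """Per-class precision/recall/F1 and overall accuracy from a confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                    # correctly classified counts per class
    precision = tp / cm.sum(axis=0)     # correct / all predicted as that class
    recall = tp / cm.sum(axis=1)        # correct / all truly in that class
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()      # trace over total sample count
    return precision, recall, f1, accuracy

# toy 2-class example: 9 of 12 samples classified correctly
p, r, f1, acc = metrics_from_confusion([[5, 1], [2, 4]])
```

If a class is never predicted at all, the precision denominator is zero; a production implementation would guard against that division.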
3.4 Analysis Test Results
The authors conducted two scenarios. The first scenario uses the Kayu30Net architecture to develop a classification model from the simplest starting point. In the second scenario, well-known architectures (AlexNet, GoogLeNet, and ResNet) are applied to the wood dataset.
3.4.1 First Scenario
Table 1. Test Result of Kayu30Net Architecture

Kayu30Net Architecture
           Epoch 4               Epoch 7               Epoch 10
           Min    Avr    Max     Min    Avr    Max     Min    Avr    Max
Precision  0%     74.4%  100%    57.7%  84.6%  100%    44.7%  80.8%  100%
Recall     0%     72.4%  100%    42.9%  83.9%  100%    26.7%  73.5%  100%
F1 Score   0%     71.1%  100%    54.5%  83.1%  100%    40%    73.4%  100%
Accuracy   0%     55.6%  95.3%   29.2%  71.6%  99.7%   16.8%  66.1%  99.9%

In the first scenario, the minimum values of precision, recall, F1 score, and accuracy at Epoch 4 are 0%, because some wood species are not classified as the correct species at all. The average accuracy values obtained at Epochs 4, 7, and 10 are 55.6%, 71.6%, and 66.1% respectively. Kayu30Net does not reach the desired level of accuracy; this is partly due to overfitting, since the network has no dropout layer to prevent it. The best results are obtained at Epoch 7, with an average precision of 84.6%, recall of 83.9%, F1 score of 83.1%, and accuracy of 71.6%.
Although Epoch 7 gives the highest performance, the results are still below the specified target. Therefore, to obtain a classification of wood species with accuracy close to 98%-99% and the best performance, the evaluation is repeated using architectures commonly used with CNN: AlexNet, ResNet, and GoogLeNet.

3.4.2 Second Scenario
The second scenario evaluates and compares the performance of AlexNet, ResNet, and GoogLeNet. After knowing the results
of the performance of each architecture, the one that classifies wood species with the highest accuracy and performance will be chosen.
a. AlexNet
Table 2. Test Result of AlexNet Architecture

AlexNet Architecture
           Epoch 4               Epoch 7               Epoch 10
           Min    Avr    Max     Min    Avr    Max     Min    Avr    Max
Precision  71.4%  95.3%  100%    84.2%  98.2%  100%    88.2%  98.4%  100%
Recall     50%    95.4%  100%    85.7%  98.5%  100%    85.7%  98.4%  100%
F1 Score   66.7%  94.7%  100%    91.4%  98.3%  100%    92.3%  98.3%  100%
Accuracy   38.4%  90.7%  100%    77.2%  96.3%  100%    84.5%  96.7%  100%

The percentages shown for the AlexNet architecture are good. The maximum values of precision, recall, F1 score, and accuracy reach 100% at Epochs 4, 7, and 10, which shows that, on average, the wood species are successfully classified according to their species. The average accuracy values obtained at Epochs 4, 7, and 10 are 90.7%, 96.3%, and 96.7% respectively. The best results are obtained at Epoch 10, with an average precision of 98.4%, recall of 98.4%, F1 score of 98.3%, and accuracy of 96.7%.

b. GoogLeNet
Table 3. Test Result of GoogLeNet Architecture

GoogLeNet Architecture
           Epoch 4                 Epoch 7                 Epoch 10
           Min     Avr     Max     Min     Avr     Max     Min     Avr     Max
Precision  77.1%   94.3%   100%    73.7%   96.1%   100%    66.7%   94.7%   100%
Recall     73.7%   93.2%   100%    77.8%   95.8%   100%    68.8%   93.75%  100%
F1 Score   81.3%   93.3%   100%    82.4%   95.7%   100%    73.3%   93.9%   100%
Accuracy   61.39%  86.28%  99.64%  59.69%  90.41%  99.93%  63.82%  90.41%  99.89%

The percentages shown for the GoogLeNet architecture are also good. The maximum values of precision, recall, and F1 score reach 100% at Epochs 4, 7, and 10, which shows that almost all wood species are successfully classified according to their species. The average accuracy values obtained at Epochs 4, 7, and 10 are 86.28%, 90.41%, and 90.41% respectively. The best results are obtained at Epoch 10, with an average precision of 94.7%, recall of 93.75%, F1 score of 93.9%, and accuracy of 90.41%.

c. ResNet
Table 4. Test Result of ResNet Architecture

ResNet Architecture
           Epoch 4               Epoch 7               Epoch 10
           Min    Avr    Max     Min    Avr    Max     Min    Avr    Max
Precision  88.2%  98.4%  98%     84.2%  98.6%  98%     88.2%  99.1%  98.7%
Recall     85.7%  98.2%  100%    89.5%  98.5%  100%    85.7%  99%    100%
F1 Score   92.3%  98.2%  100%    91.4%  98.5%  100%    92.3%  99%    100%
Accuracy   69.1%  91.4%  99.4%   79.3%  94.5%  99.9%   83.5%  95.8%  99.9%
The percentages shown for the ResNet architecture are reasonably good. The best results are obtained at Epoch 10, with an average precision of 99.1%, recall of 99%, F1 score of 99%, and accuracy of 95.8%. The maximum values of recall and F1 score reach 100% at Epochs 4, 7, and 10, which shows that some wood species are classified perfectly according to their species. The average accuracy values obtained at Epochs 4, 7, and 10 are 91.4%, 94.5%, and 95.8% respectively.
4 Comparison of Best Accuracy of Each Architecture
Table 5. Accuracy of Architecture Based on Species

No  Species Code  Trade Name    Accuracy (%)
                                Kayu30Net  AlexNet  ResNet  GoogLeNet
1 906 Surian 97.4 99.7 98.8 98.92
2 7012 Jabon/Cadamba 47.6 99.7 83.5 83.37
3 18212 Tembesu 84.8 100 99.1 96.47
4 2055 Jati 86.4 99.7 97.2 98.49
5 2059 Salimuli 75.7 100 96.6 86.24
6 4426 Ampupu/Leda 79.2 95.2 99.2 97.09
7 4991 Kupang 43.6 98.4 96.5 90.96
8 6867 Terap 75.2 93.5 90.7 88.47
9 7125 Ketapang 90.7 99.8 99.7 91.88
10 7532 Tepis 96.9 99.8 99.9 97.87
11 8104 Membacang/Machang 95.1 100 99.6 99.08
12 8110 Sindur 53.7 98.0 87.4 63.82
13 8313 Bongin 29.2 94.2 96.4 96.85
14 9552 Pasang 63.5 99.5 95.8 98.37
15 12672 Menjalin 92.6 100 98.8 99.89
16 14734 Cempaka 66.9 93.5 97.3 80.01
17 17498 Meranti Putih 66.9 98.9 99.0 97.65
18 19081 Mindi 57.5 88.2 90.9 67.63
19 21268 Kecapi 40.4 94.2 87.4 94.06
20 23255 Sonokembang 85.1 100 99.3 99.01
21 23673 Balau 90.7 84.5 83.8 87.08
22 28163 Mersawa 99.7 96.5 99.8 94.41
23 32145 Gmelina 30.9 85.5 91.4 85.65
24 34207 Bayur 56.9 95.6 96.4 64.02
25 N4127 Balsa 91.8 100 98.7 99.12
26 N4528 Ekaliptus 92.5 99.1 97.4 91.59
27 MRB001 Merbau 34.7 86.3 95.5 98.26
28 N4932 Meranti Merah 51.0 100 99.7 92.71
29 SNG001 Sengon 90.7 100 99.5 86.49
30 34013 Bipa 80.5 99.7 99.1 91.03
Based on the test results for the thirty species shown in Table 5, AlexNet achieves good accuracy (> 90%) for 19 species, the highest count among the architectures compared. In several cases AlexNet's accuracy is lower than that of the other architectures; however, the accuracy difference compared to the other architectures is not higher than 2.59%. AlexNet classifies the species Tembesu, Salimuli, Membacang, Menjalin, Sonokembang, Balsa, Meranti Merah, and Sengon with perfect accuracy, i.e. 100%.
For the Kayu30Net architecture, a notable species is Balau, with an accuracy of 90.7%: for this species, Kayu30Net achieves the highest accuracy among all the architectures. Of the thirty species tested, only ten reach an accuracy above 90% with Kayu30Net.
Fig. 8. Example Types of Classification Errors in Bongin Wood (Bongin - 8313, Balau - 23673)
For example, Figure 8 shows Bongin wood, which at 29.2% has the lowest accuracy under Kayu30Net; in the classification process it is detected as another species, specifically Balau. When the wood images are inspected visually, several anatomical structures are similar between the two species, which leads to errors in classifying Bongin. The observable similarities are that the vessels in both species show the shape of paratracheal parenchyma and that their spacing is narrow.
The ResNet architecture achieves the highest accuracy among the architectures for 8 species, although none of these results reaches the perfect value of 100%. The GoogLeNet architecture achieves the highest accuracy for two species.
4.1 Final Comparison
Fig. 9. Accuracy and F1 Score Comparison Charts (accuracy / F1 score: Kayu30Net 72% / 83%, AlexNet 97% / 98%, ResNet 96% / 99%, GoogLeNet 90% / 94%; averages 77.4%, 97.2%, 97.4%, and 92.2%)
* Training at HPC LIPI
Table 6. Comparison Based on Training Time
Table 7. Comparison Based on Numbers of Layers
5 Conclusion
Based on the results of the research and discussion, classification using a Convolutional Neural Network can classify wood by species. The average of accuracy and F1 score is 66.60% for Kayu30Net, 97.15% for AlexNet, 97.40% for ResNet, and 92.16% for GoogLeNet. ResNet has the best average of accuracy and F1 score, at 97.40%, although this is very close to AlexNet's average (a difference of about 0.25 percentage points). ResNet also has the highest F1 score, which combines precision and recall, and in both of these metrics ResNet indeed scores higher than AlexNet. As Table 5 shows, only four species fall below 90% accuracy under ResNet, which is why its precision and recall are the highest.
Training time and the number of layers also influence the choice of architecture. Compared to ResNet, AlexNet trains faster and uses fewer layers. Although its accuracy does not reach 98%-99%, each wood species is classified according to its species, and AlexNet achieves the highest accuracy for 19 species, more than any other architecture. Therefore, the best-performing architecture for wood species classification using CNN in this study is AlexNet, with an average precision of 98.4%, recall of 98.4%, F1 score of 98.3%, and accuracy of 96.7%. From this analysis, AlexNet can be considered the preferred architecture and is able to classify wood with an overall accuracy of about 97%, although the training time required is quite long.
Acknowledgement
The computational work for the algorithm implementation was conducted using the High Performance Computing facilities at the Indonesian Institute of Sciences (LIPI). The data
Epoch Training Time (hh:mm:ss)
Kayu30Net AlexNet ResNet GoogLeNet
4   02h 23m 47s  08h 45m 14s  12h 30m 35s  19h 06m 46s
7   04h 15m 13s  16h 42m 48s  21h 20m 20s  32h 53m 12s
10  05h 44m 22s  23h 35m 36s  27h 50m 14s  25h 08m 34s*
Architecture Number of Layers
Kayu30Net 14
AlexNet 25
ResNet 177
GoogLeNet 144
collection activities were supported by The Ministry of Research, Technology and Higher Education, Republic of Indonesia, under Research Grant INSINAS, from 2017 to 2018. The authors would like to thank Ratih Damayanti and Listya Mustika Dewi (Forest Products Research and Development Center, Ministry of Environment and Forestry, Republic of Indonesia) for their support during the data collection.
References
[1] I. Kartasujana and Suherdie, “4000 Jenis Pohon di Indonesia dan Index 4000 Jenis Kayu Indonesia (Berdasar Nama Daerah).” Badan Penelitian dan Pengembangan Kehutanan, 1993.
[2] T. Pulungan, “Terbesar di Dunia, Koleksi Kayu Perkuat Pangkalan Data Cadangan Karbon,” 2018. [Online]. Available: https://nasional.sindonews.com/read/1360730/15/terbesar-di-dunia-koleksi-kayu-perkuat-pangkalan-data-cadangan-karbon-1544111835.
[3] E. Prakasa, H. F. Pardede, Y. Rianto, R. Damayanti, Krisdianto, and L. M. Dewi, “Development of Computer Vision Methods for Wood Identification,” September 2017.
[4] A. G. R. Gunawan, S. R. I. Nurdiati, and Y. Arkeman, “Identifikasi Jenis Kayu Menggunakan Support Vector Machine Berbasis Data Citra [Wood Type Identification Using Support Vector Machine Based on Image Data],” J. Ilmu Komput. Agri Inform., vol. 3, pp. 1–8, 2014.
[5] I. Goodfellow, Y. Bengio, and A. Courville, “Deep Learning,” An MIT Press B., 2016.
[6] I. Wendianto Notonogoro, “Pengenalan Plat Nomor Indonesia menggunakan Convolutional Neural Network,” 2018.
[7] Mathworks, “Introducing Deep Learning with MATLAB,” Introd. Deep Learn. with MATLAB, p. 15.
[8] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” pp. 1097--1105, 2012.
[9] C. Szegedy et al., “Going deeper with convolutions,” arXiv1409.4842 [cs], pp. 1–9, 2014.
[10] K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” CoRR, vol. abs/1512.0, pp. 1–17, 2015.
[11] H.-C. Shin et al., “Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning.,” IEEE Trans. Med. Imaging, vol. 35, no. 5, pp. 1285–98, 2016.
[12] A. Veit, M. Wilber, and S. Belongie, “Residual Networks Behave Like Ensembles of Relatively Shallow Networks,” pp. 1–9, 2016.
[13] Y. Wu et al., “Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation,” pp. 1–23, 2016.
[14] O. Russakovsky et al., “ImageNet Large Scale Visual Recognition Challenge,” Int. J. Comput. Vis., vol. 115, no. 3, pp. 211–252, 2015.