TIMBER DEFECT DETECTION BASED ON SYSTEMATIC FEATURE ANALYSIS AND ONE CLASS CLASSIFIER
UMMI RABA’AH BINTI HASHIM
A thesis submitted in fulfilment of the requirements for the award of the degree of
Doctor of Philosophy (Computer Science)
Faculty of Computing Universiti Teknologi Malaysia
DEDICATION
ACKNOWLEDGEMENT
In the name of Allah, the Most Gracious, the Most Merciful. Praise be to Allah for guiding me on the right path and blessing me with the best in this life. It takes the effort and support of many to bring a research study to completion. I am indebted to the dozens of people who guided and supported me throughout this study. I would like to express my gratitude to the following special individuals:
1. My supervisor and co-supervisor, Assoc. Prof. Dr. Siti Zaiton binti Mohd Hashim and Assoc. Prof. Dr. Azah Kamilah Muda, for their wonderful guidance and continuous encouragement throughout my study.
2. Academicians of UTM, for their valuable teaching, comments, ideas and motivation for this research.
3. Industry experts from Hasro Malaysia, Teras Puncak and Elegant Success (Malaysian wood products manufacturers) for their co-operation, invaluable consultation and kind support.
4. Universiti Teknikal Malaysia Melaka (UTeM) and Ministry of Education Malaysia for their generous financial support.
5. My husband and children, for their patience and love.
6. My parents and brothers, for their blessing and care.
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
CHAPTER TITLE PAGE
DECLARATION ii
DEDICATION iii
ACKNOWLEDGEMENT iv
ABSTRACT v
ABSTRAK vi
TABLE OF CONTENTS vii
LIST OF TABLES xii
LIST OF FIGURES xiv
LIST OF ABBREVIATIONS xvii
LIST OF APPENDICES xx
TERMS AND DEFINITIONS xxi
1 INTRODUCTION 1
1.1 Overview 1
1.2 Research Background 2
1.3 Problem Statement and Research Aim 13
1.4 Research Objective 14
1.5 Research Scope 14
1.6 Significance of the Study 16
1.7 Research Methodology 17
1.8 Research Contribution 19
2 LITERATURE REVIEW 21
2.1 Introduction 21
2.2 Overview of Timber Process 26
2.3 Malaysian Timber Species 28
2.4 Timber Defects 31
2.5 Automated Vision Inspection (AVI) of Timber 33
2.5.1 Problem Background 33
2.5.2 AVI in Wood Industry 34
2.5.3 Sensors Used for AVI in Wood Industry 39
2.5.4 General Timber Defect Detection Approach 43
2.5.5 Feature Extraction on Defect Images 46
2.5.6 Defect Classification 50
2.5.7 Discussion 53
2.6 Statistical Texture Feature Based on Grey Level
Dependence Matrix (GLDM) 55
2.6.1 Problem Background 55
2.6.2 Orientation Independent GLDM 58
2.6.3 Statistical Features of GLDM 63
2.7 One Class Classification for Imbalanced Data 71
2.7.1 Introduction and Problem Background 71
2.7.2 Distance-based One Class Classifier (OCC) 73
2.7.3 Fast Minimum Covariance Determinant as Robust Estimator 77
2.8 Summary 81
3 RESEARCH METHODOLOGY 82
3.1 Introduction 82
3.2 Problem Situation and Solution Concept 82
3.3 Research Design 87
3.3.1 Research Framework 87
feature set representing timber defect. 90
3.3.2.3 Phase 3: Development of robust OCC with FMCD estimator for timber defect detection 91
3.3.3 Overall Research Plan 92
3.4 Evaluation Measurement 95
3.4.1 Multivariate Analysis of Variance (Manova) to
Evaluate Feature Quality 95
3.4.2 Precision, Recall and F Measure to Measure
Detection Performance 100
3.4.3 Over Detection and Under Detection Errors to
Assess Segmentation Quality 102
3.5 Summary 103
4 CONSTRUCTION OF TIMBER SURFACE DEFECT
IMAGE DATASET 104
4.1 Introduction 104
4.1 Timber Samples Collection 106
4.2 Image Acquisition Setup 106
4.3 Image Labelling and Processing 110
4.4 Findings 113
4.5 Summary 116
5 SIGNIFICANT FEATURE SET OF TIMBER SURFACE DEFECTS BASED ON STATISTICAL TEXTURE AND
SYSTEMATIC FEATURE ANALYSIS 117
5.1 Introduction 117
5.2 Overview of Approach 118
5.3.1 Extracting Statistical Features from GLDM 121
5.3.2 Exploring Displacement and Quantization Parameter of GLDM 127
5.4 Evaluation of Feature Quality 133
5.4.1 Exploratory Feature Analysis 133
5.4.1.1 Univariate Feature Range Analysis 134
5.4.1.2 Bivariate Matrix of Scatter Plot 136
5.4.1.3 Multivariate Intra-Class and Inter-Class
Distance between Clear Wood and Defects 137
5.4.2 Confirmatory Feature Analysis 139
5.4.2.1 Removing Linearly Dependent Features 141
5.4.2.2 Measuring Significant Difference between Defect Classes using Manova Statistics 143
5.4.2.3 Identifying Significant Features using Post-hoc Manova (Discriminant Analysis) 145
5.5 Performance Validation 149
5.5.1 Measuring Classification Performance across
Feature Sets and Classifiers 150
5.5.2 Measuring Classification Performance of Individual
Classes 153
5.5.3 Measuring Classification Accuracy across Timber
Species 156
5.6 Discussion 158
5.7 Summary 159
6 ROBUST MAHALANOBIAN CLASSIFIER WITH FMCD ESTIMATOR (MC-FMCD) FOR TIMBER DEFECT
DETECTION 160
6.1 Introduction 160
6.2 Overview of Approach 161
6.4.3 Detection Performance between Classic MD and
Robust MC-FMCD 174
6.4.4 Summary of Detection Performance across Timber
Species 178
6.5 Expert Validation on Test Images 180
6.6 Discussion 185
6.7 Summary 186
7 CONCLUSION AND FUTURE RESEARCH 188
7.1 Summary of Research Finding 188
7.2 Research Contribution 191
7.3 Future Work Recommendation 193
7.4 Concluding Remark 195
REFERENCES 196
LIST OF TABLES
TABLE NO. TITLE PAGE
2.1 List of Malaysian timber classification based on density
(MTIB, 2000) 29
2.2 Natural durability classification based on years (MTIB, 2000) 29
2.3 Characteristics of four types of timber species (MTIB, 2000) 30
2.4 List of common timber defect 32
2.5 Related works on automated inspection of wood products 36
2.6 Related studies on inspection of external wood defects 40
2.7 Images of directional matrices and rotation invariant matrix 61
3.1 Problem leading to solution 86
3.2 Overall research plan 92
3.3 Confusion matrix 102
4.1 List of data collection setting of past studies on timber
surface defect detection 109
4.2 List of classes with example of sub-images collected 114
4.3 Number of samples collection across species 116
5.1 Example of sub-image and the corresponding dependence matrix 123
5.2 List of statistical texture features extracted 124
5.3 Example of extracted features (one sample per class, species=Meranti, d=1, q=32) 125
5.7 List of features removed after correlation test 143
5.8 Box's test of equality of covariance matrices 144
5.9 Manova test 144
5.10 Pillai’s Trace value across multiple quantization levels and
displacements 145
5.11 Eigenvalues and canonical correlations 146
5.12 Raw and standardized discriminant function coefficients
(Root 1) 147
5.13 Correlation between features and canonical variable 148
5.14 List of remaining features after discriminant analysis 148
5.15 List of feature sets used for performance comparison 150
5.16 Confusion matrices for D7, D5 and D4 154
5.17 Samples mistakenly classified as clear wood (undetected
defect) 155
5.18 Confusion matrices for Merbau, KSK and Rubberwood 157
6.1 Experimental Meranti dataset for various defect ratios 163
6.2 Detection performance by defect ratio 167
6.3 Detection performance by defect types 170
6.4 Detection performance on test images: Rubberwood 181
6.5 Detection performance on test images: KSK 182
LIST OF FIGURES
FIGURE NO. TITLE PAGE
1.1 Motivation of the study 12
1.2 Overview of research phases 18
2.1 Taxonomy of literature review 23
2.2 Timber process 26
2.3 Log cutting pattern (Cavette, 2006; Tom & Jeff, 2010) 27
2.4 The components of an AVI system in wood industry 35
2.5 Reference pixel, X with its 8 neighbouring pixels
(Haralick et al., 1973) 59
2.6 Distribution of non-zero matrix element on the left, and
contour plot showing joint probability density function of
the spatial dependence matrix on the right. 62
2.7 Research solutions to the problem of classification of
imbalanced data (Sun et al., 2009) 73
3.1 Solution concept for timber defect detection 85
3.2 Research framework 88
3.3 Operational research framework 89
4.1 Image acquisition setup 108
4.2 The process of dataset construction 111
4.3 Sample of acquired images 111
4.4 Subdivision of original image into sub-images 113
5.3 Pictorial representation of the orientation independent GLDM 128
5.4 Normalized feature means against displacement and
quantization 131
5.5 Energy feature range analysis 134
5.6 Entropy feature range analysis 135
5.7 Contrast feature range analysis 135
5.8 Scatter plot matrix showing pairwise comparison of features 136
5.9 Intra-class distance between clear wood samples and inter-class distance between clear wood and defect samples 138
5.10 Procedures for confirmatory feature analysis 140
5.11 Classification accuracy of three proposed feature sets (D6,
D7 and D8) 151
5.12 Classification accuracy between the proposed feature set (D7)
and feature sets from previous studies 152
5.13 F scores for each class across datasets D4, D5 and D7 154
5.14 Classification accuracy across timber species 156
6.1 Flow of experiments for timber defect detection 161
6.2 Proposed MC-FMCD for robust timber defect detection 162
6.3 F score across defect ratio: (a) Meranti, (b) Rubberwood, (c)
KSK, (d) Merbau 168
6.4 OD Error and UD Error across defect ratio: (a) Meranti, (b)
Rubberwood, (c) KSK, (d) Merbau 169
6.5 F score by defect type: (a) Meranti, (b) Rubberwood, (c)
KSK, (d) Merbau 172
6.6 OD Error and UD Error by defect type: (a) Meranti, (b)
Rubberwood, (c) KSK, (d) Merbau 173
6.7 Detection performance for MC-FMCD and classic MD:
6.8 Detection performance for MC-FMCD and classic MD:
Rubberwood dataset 175
6.9 Detection performance for MC-FMCD and classic MD: KSK
dataset 176
6.10 Detection performance for MC-FMCD and classic MD:
Merbau dataset 177
6.11 Average detection performance by timber species 178
6.12 Average detection performance by defect type across timber
species (a) F score comparison between timber species by
defect type (b) Average F score by defect type 179
6.13 Average detection performance between MC-FMCD and
classic MD 180
LIST OF ABBREVIATIONS
ANN - Artificial Neural Network
AUTOC - Autocorrelation
AVI - Automated Vision Inspection
BR - Brown Stain
BS - Blue Stain
CAR - Causal Auto Regressive Model
CCD - Charge-Coupled Device
CL - Clear Wood
CONT - Contrast
COR - Correlation
CPROM - Cluster Prominence
CSHAD - Cluster Shade
CT - Computed Tomography
DENT - Difference entropy
DISS - Dissimilarity
DVAR - Difference variance
EN - Energy
ENT - Entropy
EPQ - Equal Probability Quantization
FMCD - Fast Minimum Covariance Determinant
FMMIS - Fuzzy Min-Max Neural Network for Image Segmentation
FN - False Negative
FP - False Positive
GA - Genetic Algorithm
GPR - Ground Penetrating Radar
HL - Hole
HOMO - Homogeneity
IDMN - Inverse difference moment normalized
IDN - Inverse difference normalized
IMC1 - Information measures of correlation 1
IMC2 - Information measures of correlation 2
KN - Knot
KNN - K-nearest Neighbour
KSK - Kembang Semangkuk
LBP - Local Binary Pattern
MANOVA - Multivariate Analysis of Variance
MAXPR - Maximum probability
MCD - Minimum Covariance Determinant
MC-FMCD - Mahalanobian Classifier based on Robust FMCD
MD - Mahalanobis Distance
MGR - Malaysian Grading Rule
MIDA - Malaysian Investment Development Authority
MLP - Multi-layer Perceptron
MSE - Mean Square Error
MTIB - Malaysian Timber Industry Board
MVE - Minimum Volume Ellipsoid
MVV - Minimum Vector Variance
NATIP - National Timber Industry Policy
OCC - One Class Classifier
OD - Over Detection
PC - Pocket
RBFN - Radial Basis Function Network
RGB - Red Green Blue
RT - Rot
SAVG - Sum Average
SSCP - Sum of Squares Cross Product
SVAR - Sum Variance
TN - True Negative
TP - True Positive
UD - Under Detection
LIST OF APPENDICES
APPENDIX TITLE PAGE
A Related studies on inspection of internal wood defects
Related studies on multi sensors approach to timber
defect detection 213
B Example of orientation independent GLDM and
normalized GLDM 216
C Plots of feature value against displacement and
quantization parameter 219
D Univariate feature range analysis 236
E Matrix of scatter plots comparing feature
distribution between classes 247
F Pairwise correlation between features and its
corresponding significance, p value 249
G SPSS Manova output 252
H Experimental dataset for various defect ratios 260
I Expert validation sheet 267
J UTM letter of permission for data collection 280
K Biography of industry experts 284
L Letter of dataset certification 287
M Photo album 291
TERMS AND DEFINITIONS
TERM DEFINITION
Wood - A hard fibrous material that makes up most of the substance of a tree.
Log - A part of the trunk that has been cut off from a felled tree.
Timber - Wood boards sawn from logs.
Primary wood industry - Businesses that process logs or other tree sections directly into timber, veneer, plywood, wood chips or other primary wood products.
Sawmill - A factory where logs are sawn into timber.
Secondary wood industry - Businesses that process primary wood products such as timber into secondary wood products such as furniture, doors, and parquet flooring.
Rough mill - The first production area or stage in a secondary wood product industry, where timber is moulded and cut into rough-sized components or parts. At this stage, undesirable characteristics or defects are removed.
Defect - Flaws or anomalies found on timber that affect its properties and limit its possible use.
Natural defect - Biological defects that occur during the growth of the tree from which the timber originates.
Mechanical defect - Defects that are caused by the handling or processing of timber, such as during drying, sawing and moulding.
Internal defect - Defects that are found inside the timber structure.