DEVELOPMENT OF A CHILD DETECTION SYSTEM WITH ARTIFICIAL INTELLIGENCE (AI) USING OBJECT DETECTION
METHOD
Lai Suk Na
Bachelor of Engineering with Honours (Mechanical and Manufacturing Engineering)
2018
FACULTY OF ENGINEERING
FYP REPORT SUBMISSION FORM
Name : ___________________________________________ Matric No. : _______________________
Title : _________________________________________________________________________________
_______________________________________________________________________________________
Supervisor : ___________________________________ Program: ________________________________
Please return this form to the Faculty of Engineering office at least TWO WEEKS before your hardbound report is due.
Students are not allowed to print/bind the final report prior to Supervisor’s Approval (Section B).
The Faculty reserves the right to reject your hardbound report should you fail to submit the completed form within the stipulated time.
A. REPORT SUBMISSION (To be completed by student) I wish to submit my FYP report for review and evaluation.
Signature: ___________________________ Date: ______________
\
B. SUPERVISOR’S APPROVAL (To be completed by supervisor)
The student has made necessary amendments and I hereby approve this thesis for binding and submission to the Faculty of Engineering, UNIMAS.
Signature: ___________________________ Date: ______________
Name: ______________________________________________________
Lai Suk Na
Development of a Child Detection System with Artificial Intelligence (AI) using Object Detection Method 47241
Ir Dr David Chua Sing Ngie Mechanical and Manufacturing Engineering
UNIVERSITI MALAYSIA SARAWAK
Grade:
DECLARATION OF ORIGINAL WORK
This declaration is made on the 22nd day of June 2018.
Student’s Declaration:
I, LAI SUK NA, 47241, FACULTY OF ENGINEERING hereby declare that the work entitled, DEVELOPMENT OF A CHILD DETECTION SYSTEM WITH ARTIFICIAL INTELLIGENCE (AI) USING OBJECT DETECTION METHOD is my original work. I have not copied from any other students’
work or from any other sources except where due reference or acknowledgement is made explicitly in the text, nor has any part been written for me by another person.
22nd June 2018
Date submitted Lai Suk Na (47241)
Supervisor’s Declaration:
I, IR DR DAVID CHUA SING NGIE hereby certifies that the work entitled, DEVELOPMENT OF A CHILD DETECTION SYSTEM WITH ARTIFICIAL INTELLIGENCE (AI) USING OBJECT DETECTION METHOD was prepared by the above named student, and was submitted to the FACULTY OF ENGINEERING as a partial fulfillment for the conferment of BACHELOR OF MECHANICAL AND MANUFACTURING ENGINEERING (Hons.), and the aforementioned work, to the best of my knowledge, is the said student’s work
Received for examination by: _ Date:
(Ir Dr David Chua Sing Ngie)
Please tick ()
Final Year Project Report Masters
PhD
22nd June 2018
I declare this Project/Thesis is classified as (Please tick (√)):
CONFIDENTIAL (Contains confidential information under the Official Secret Act 1972)*
RESTRICTED (Contains restricted information as specified by the organisation where research was done)*
OPEN ACCESS
Validation of Project/Thesis
I therefore duly affirmed with free consent and willingness declared that this said Project/Thesis shall be placed officially in the Centre for Academic Information Services with the abide interest and rights as follows:
This Project/Thesis is the sole legal property of Universiti Malaysia Sarawak (UNIMAS).
The Centre for Academic Information Services has the lawful right to make copies for the purpose of academic and research only and not for other purpose.
The Centre for Academic Information Services has the lawful right to digitise the content to for the Local Content Database.
The Centre for Academic Information Services has the lawful right to make copies of the Project/Thesis for academic exchange between Higher Learning Institute.
No dispute or any claim shall arise from the student itself neither third party on this Project/Thesis once it becomes sole property of UNIMAS.
This Project/Thesis or any material, data and information related to it shall not be distributed, published or disclosed to any party by the student except with UNIMAS permission.
Student’s signature Supervisor’s signature:
(22nd June 2018) (22nd June 2018)
Current Address:
NO 82 TAMAN LANDEH JALAN LANDEH 93250 KUCHING SARAWAK
Notes: * If the Project/Thesis is CONFIDENTIAL or RESTRICTED, please attach together as annexure a letter from the organisation with the period and reasons of confidentiality and
restriction.
[The instrument was duly prepared by The Centre for Academic Information Services]
√
APPROVAL SHEET
This project report which entitled “DEVELOPMENT OF A CHILD DETECTION SYSTEM WITH ARTIFICIAL INTELLIGENCE (AI) USING OBJECT DETECTION METHOD” was prepared and submitted by LAI SUK NA (47241) as a partial fulfilment of the requirement for degree of Bachelor of Engineering with Honours in Mechanical and Manufacturing Engineering is hereby read and approved by:
_____________________
Ir Dr David Chua Sing Ngie (Project Supervisor)
____________________
Date
DEVELOPMENT OF A CHILD DETECTION SYSTEM WITH ARTIFICIAL INTELLIGENCE (AI) USING OBJECT DETECTION
METHOD
LAI SUK NA
A dissertation submitted in partial fulfillment of the requirement for the degree of Bachelor of Engineering with Honours (Mechanical and Manufacturing Engineering)
Faculty of Engineering Universiti Malaysia Sarawak
2018
To my beloved family and friends.
i
ACKNOWLEDGEMENTS
First and foremost, I would like to express my deepest gratitude to my supervisor, Ir Dr David Chua Sing Ngie from Mechanical and Manufacturing Engineering Department for his guidance, advice and contribution of ideas throughout this project. I would also like to thank Google Inc for making object detection model and Tensorflow library open source which enable students and researchers to develop in the field of Artificial Intelligence and machine learning. I sincerely thank Google Images for allowing the usage of images for training purpose. Last but not least, I would like to express my heartiest thank to my family and friends for their support and corporation.
With all their accompany this thesis become reality.
ii
ABSTRACT
The issue of children dying due to vehicular heatstroke has raised the public attention. The failure of current vehicular occupant detection devices to identify correctly the occupant as a child had triggered the idea of developing a child detection system using Artificial Intelligence (AI) technology. The usage of Convolutional Neural Network (CNN) had been recognised as an effective way to perform image classification. However, this approach required a significant number of images as training data and substantial time for model training in order to achieve desired results in accuracy. Due to the limitation of abundant dataset, transfer learning was used to accomplish the task. Modern convolutional object detector, SSD Mobilenet v1 trained on Microsoft Common Objects in Context (MS COCO) dataset was used as a starting point of the training process. MS COCO dataset that consisted of a total of 328k images were divided into 91 different categories including dog, person, kite and so on. The trained model was then retrained to classify adults and children instead of persons. At the end of the training, a real-time child detection system was established. The system was able to give different responses to the detection of a child and adult. The responses comprised of visual and audio outputs. Upon detection, a bounding box was drawn on a child or an adult face as visual output. At the same time, the system would trigger the speaker to speak out the statement “child is detected” for successful child detection whereas adult detection would result in the statement of “adult is detected”. Theoretically, the detection system could achieve an overall precision of 0.969. However, the experimental results obtained was able to match up to a precision of 0.883 that resulted in a small error of 8.88%.
iii
ABSTRAK
Isu kanak-kanak maut dalam kenderaan kerana strok haba telah menimbulkan perhatian orang ramai. Kegagalan alat pengesan di pasaran untuk mengenal pasti kehadiran penumpang kenderaan adalah anak dengan tepat telah mencetuskan idea untuk membangunkan sistem pengesanan kanak-kanak menggunakan teknologi Kecerdasan Buatan (AI). Penggunaan rangkaian neural konvolusi (CNN) telah diakui sebagai cara yang berkesan untuk melakukan klasifikasi imej. Walau bagaimanapun, pendekatan ini memerlukan imej yang banyak sebagai data latihan dan masa yang panjang untuk latihan supaya dapat mencapai ketepatan yang dikehendaki. Oleh sebab batasan dataset yang banyak, pembelajaran pemindahan digunakan untuk menyelesaikan tugas. Pengesan objek konvolusi moden, SSD Mobilenet v1 yang dilatih dalam dataset Microsoft Common Objects Context (MS COCO) digunakan sebagai titik permulaan proses latihan.
Dataset MS COCO mengandungi sejumlah 328 ribu imej dibahagikan kepada 91 kategori yang berbeza termasuk anjing, orang, layang-layang dan sebagainya. Model ini dilatih untuk mengklasifikasikan dewasa dan kanak-kanak yang pada asalnya mengenali kanak- kanak dan orang dewasa sebagai orang. Pada akhir latihan, sistem pengesanan masa nyata ditubuhkan. Sistem memberi respons yang berbeza kepada kanak-kanak dan orang dewasa. Maklum balas terdiri daripada output visual dan audio. Secara teorinya, sistem pengesanan dapat mencapai ketepatan keseluruhan 0.969 sedangkan hasil eksperimen memberikan ketepatan 0.883, memberikan kesilapan sebanyak 8.88%.
iv
TABLE OF CONTENTS
Acknowledgements………..……...……… i
Abstract………...………...………. ii
Abstrak.………...… iii
Table of Contents………...…. iv
List of Tables………...……….. vi
List of Figures………....…………...……….. vii
List of Abbreviations………....………...……... ix
Chapter 1 INTRODUCTION 1.1 General Background………...……… 1
1.2 Problem Statement………...…………... 4
1.3 Objectives………...……… 5
1.4 Scope of Research………....……….. 5
Chapter 2 LITERATURE REVIEW 2.1 History of Artificial Intelligence (AI)……… 6
2.2 Machine Learning (ML)………...……... 7
2.2.1 Supervised Learning………....……….. 7
2.2.2 Unsupervised Learning………...………... 11
2.2.3 Reinforcement Learning……… 12
2.3 Deep Learning (DL)……….……... 14
2.3.1 Perceptron……….. 14
2.3.2 Deep Neural Network……… 16
2.3.2.1 Convolutional Neural Network………... 17
2.3.2.2 Recurrent Neural Network……….. 18
2.4 Pretrained Models………...……… 18
2.5 Related Works………...………... 19
Chapter 3 METHODOLOGY 3.1 Introduction………...…..………. 23
3.2 Research Methodology………...……….…...……… 23
3.3 Hardware……… 24
3.4 Python Programming and Python Integrated Development……...… 24
3.5 System Design Flow Chart……….…… 25
3.6 Flow Chart of Methodology………...…… 26
3.7 Object Detection using SSD Mobilenet v1………..…….. 27
3.8 Dataset Pre-processing………..……. 29
3.8.1 Collection of Dataset………...……... 29
3.8.2 Annotation………...……... 30
v
3.8.3 Binarization………...……. 31
3.9 Training………..…… 31
3.9.1 Compute Loss……….………... 31
3.10 Testing……….………. 32
3.10.1 Compute Precision and Evaluate Performance of Model Checkpoints………...………... 33
3.11 Export Retrained Model………...…… 34
3.12 Completion of Child Detection System……… 35
3.12.1 Types of Inputs………..…….. 35
3.12.2 Classifier………..…… 35
3.12.3 Triggering System………..…. 36
3.13 Experimental Evaluation of Child Detection System…………..… 36
3.13.1 Response Time………...….. 36
3.13.2 Maximum Distance of Detection………...….. 37
3.13.3 Precision………...…… 37
3.14 Project Management………...…….. 38
Chapter 4 RESULT AND DISCUSSION 4.1 Introduction………..…... 39
4.2 Object Detection on Jupyter Notebook……….. 39
4.3 Detection with SSD Mobilenet v1………...……... 40
4.4 Loss Graph………...………..……… 41
4.5 Model Evaluation………..………. 42
4.6 Experimental results ………...……...………...……. 46
4.6.1 Response Time………...…… 47
4.6.2 Maximum Distance of Detection………...…… 48
4.6.3 Precision………...….. 56
4.7 Discussion………..…… 61
4.8 Sources of Error………..… 61
Chapter 5 CONCLUSIONS AND RECOMMENDATIONS 5.1 Conclusions………...…………...……… 62
5.2 Recommendations………..…… 62
REFERENCES 63
APPENDIX A 68
APPENDIX B 72
APPENDIX C 73
APPENDIX D 75
APPENDIX E 78
APPENDIX F 82
APPENDIX G 84
APPENDIX H 85
vi
LIST OF TABLES
Table Page
Table 1.1 Comparison between the Advantages and Disadvantages of
Current Vehicular Occupant Detection Systems in the Market 4 Table 2.1 Difference between Machine Learning Tasks 13 Table 2.2 Speed and Accuracy of Pretrained Model 19
Table 3.1 Pretrained Model 28
Table 3.2 Distribution of Training and Testing Dataset 30
Table 3.3 Gantt Chart for FYP1 and FYP2 38
Table 4.1 Sample Ground Truth and Computed Data 42
Table 4.2 Result of Testing at Step 19608 46
Table 4.3 Response Time for 6 Candidates 47
Table 4.4 Child Detection with Varying Distances 49 Table 4.5 Adult Detection with Varying Distances 54 Table 4.6 Reproducibility Test of the Child Detection 58 Table 4.7 Reproducibility Test of the Adult Detection 58
Table 4.8 Terminology in Confusion Matrix 59
Table 4.9 Confusion Matrix of Child 59
Table 4.10 Confusion Matrix of Adult 59
Table 4.11 Theoretical and Experimental Precisions 60
vii
LIST OF FIGURES
Figure Page
Figure 1.1 Circumstances of Child Vehicular Heatstroke Death in
United States ( 1998 – 2016) 2
Figure 1.2 Sense-A-Life Vehicular Occupant Detection System 2 Figure 1.3 Hyundai Rear Occupant Alert System 3 Figure 2.1 Relationship between AI, ML and DL 7 Figure 2.2 Procedure for Building a Classification Model 9
Figure 2.3 Procedure in a Regression Task 10
Figure 2.4 Construction of Regression Line based on Samples 11 Figure 2.5 Characteristics of Machine Learning Models 13 Figure 2.6 The Processing Steps in a Perceptron 15
Figure 2.7 Architecture of ANN 16
Figure 2.8 Difference between Nueral Network and DNN 16 Figure 2.9 Hierarchy Feature Extraction in CNN 17
Figure 2.10 Architecture of RNN 18
Figure 3.1 Python IDLE Version 3.6 24
Figure 3.2 Logic Diagram of Child Detection System 25
Figure 3.3 Procedures in Methodology 26
Figure 3.4 Process Flow to Accomplish Object Detection Method 27
Figure 3.5(a) Sample Adult Images 30
Figure 3.5(b) Sample Child Images 30
Figure 3.6(a) Labelling Images 30
Figure 3.6(b) XML File 30
Figure 3.7 Command Window 31
Figure 3.8 Loss Values 32
Figure 3.9 Running Evaluation Command 33
Figure 3.10 The Computed Precision for Model Checkpoint at Step
19613 33
Figure 3.11 Child Detection System 35
viii
Figure 3.12 Experiment Setup Diagram 37
Figure 4.1 Results of Object Detection 39
Figure 4.2 Sample Image for Detection 40
Figure 4.3(a) Detection Outcome based on SSD_Mobilenet_v1 40 Figure 4.3(b) Detection Outcome based on Retrained model 40
Figure 4.4 Loss Graph during Training 41
Figure 4.5
Relationship between Predicted Bounding Box, 𝐵𝑝 and
Ground Truth Bounding Box, 𝐵𝑔𝑡 42
Figure 4.6 Performance of Model Checkpoints by Category (Adult) 44 Figure 4.7 Performance of Model Checkpoints by Category (Child) 44 Figure 4.8 Mean Average Precision (mAP) of Model Checkpoints 45
Figure 4.9 Setup of Experiment 47
Figure 4.10 Response Time for 6 Candidates 48
Figure 4.11 Results of Child Detection 57
Figure 4.12 Results of Adult Detection 57
ix
LIST OF ABBREVIATIONS
AI - Artifiial Intelligence ML - Machine Learning DL - Deep Learning
ANN - Artificial Neural Network DNN - Deep Neural Network
CNN - Convolutional Neural Network RNN - Recurrent Neural Network
IDLE - Integrated Development Learning Environment XML - Extensible Markup Language
HTML - Hyper Text Markup Language WAV - Waveform Audio
CSV - Comma Separated File
1
CHAPTER 1
INTRODUCTION
1.1 General Background
Heatstroke is defined as a situation whereby the body lost the ability to cool itself due to prolonged exposure to high temperatures (Mayo Clinic, 2017). The symptoms of heatstroke include high body temperature, normally achieved a temperature of 40oC or higher, nausea, vomiting, rapid breathing, racing heart rate, headache and so on. There are children dying from heatstroke each year after being left unattended in vehicles.
According to McLaren (2005), although under a relatively cool ambient temperature, the majority rise in temperature of a parked vehicle occurs at the first 15 to 30 minutes.
According to Kidsandcars (2017), the probability of children suffer from heatstroke in cars is much higher than that of adults. Core body temperature of children rises more rapidly under high temperature due to their greater body surface area to mass ratio as compared to adults. A child can get overheat at around 3 to 5 times faster than an adult.
Statistics show that there is a total of 700 child fatalities due to vehicular heatstroke since 1998 in the United States and 87% of them were aged under 3 years old. Figure 1.1 shows the circumstances of child vehicular heatstroke (Null, 2017).
2
Figure 1.1: Circumstances of Child Vehicular Heatstroke Death in the United States (1998 – 2016)
The issue of children dying of vehicular heatstroke has raised the attention of the public and lead to the invention of various types of devices to help remind drivers or caregivers. The typical product currently available in the market includes Sense-A-Life, a device that utilized pressure sensor to detect the presence of a child, and immediately alert driver that has left a child in the car through a mobile application and speakers. It is designed with simple installation and easy transfer between vehicles.
Figure 1.2: Sense-A-Life Vehicular Occupant Detection System (Adapted from Schlosser, 2016)
54%
28%
17% 1%
Circumstances
Forgotten Gained Access Left Intentionally Unknown
3
However, there are weaknesses in this invention. The usage of the pressure sensor cannot differentiate whether the force applied to it is a human or a load. This will cause the occurrence of false signal generation. Besides, it is designed to be installed in a baby car seat. It cannot detect the presence of a child who gained access to the car on their own.
Another invention is developed by Hyundai Motor named rear occupant alert system. The system implemented ultrasonic sensor to detect motion of child in the rear seats after driver leaves the vehicle and activate triggering system including horns sound, light flash and send text message to the driver. The system can work effectively even the child is not put in the baby seat (Newswire, 2017).
Figure 1.3: Hyundai Rear Occupant Alert System (Adapted from Muller, 2017) However, the system cannot generate the signal if the child is sleeping or not move.
In addition, the system will generate a signal to any motion regardless of whether the vehicular occupant is an adult or a child. This reduces the reliability of the system.
Current products in the market generally comprised of sensors to sense temperature and vehicular occupant detection as well as triggering system. Weaknesses of these products are the ability to detect correctly whether there is a vehicular occupant present in the car and whether he or she is a child or an adult. As such, a study should be done to develop a child detection system which can accurately detect the presence of a child in a car. Application of Artificial Intelligence (AI) technology in image recognition system is the reason for the idea of developing a child detection system to achieve the objectives of this research.
4
Table 1.1: Comparison between the Advantages and Disadvantages of Current Vehicular Occupant Detection Systems in the Market
Advantages Disadvantages
Sense-A-Life Ease to install Pressure sensor cannot differentiate between a child and a load
The region of detection is limited in the baby car seat only Hyundai Rear Occupant
Alert System
Detection is not limited to the baby car seat only
Cannot differentiate the occupant is an adult or a child
Fail to sense the occupant if there is no motion detected
1.2 Problem Statement
The problem encountered currently is the available sensors in the market cannot differentiate accurately the vehicle occupant is an adult or a child. The sensor can detect the presence of vehicular occupant and generate signal to alert car owner. However, the system cannot determine whether the occupant is a child or adult. This will increase the frequency of false signal generation.
The inaccurate sensor such as pressure sensor will activate triggering system to any load applied on it, generating the false signal to users, thus reducing the reliability of the detection system.
On the other hand, problem to be faced in this project is the qualities of input which will affect the accuracy of results. This problem may occur due to motion of children.
Other factor such as brightness, distance and angle from the camera to the target would affect the performance of the system as well.
5 1.3 Objectives
The objectives of this study are:
To provide a framework for the development of a child detection system with AI using object detection method.
To establish an image recognition system that can detect the presence of a child.
To develop a child detection system that is able to reduce the frequency of false signal generation.
1.4 Scope of research
This research focuses on developing an alternative way to detect vehicular occupant instead of using sensors available in the market. The system established will response to the presence of child only. This research will help to study the workability in using image recognition to differentiate a child against an adult. There are several challenges that may not be solved by using image recognition. Firstly, training an image recognition system requires a large amount of images as training data. Besides, training an image recognition model required the useage of Graphics Processing Unit (GPU) in order to speed up the process and obtain a more accurate retrained model.
6
CHAPTER 2
LITERATURE REVIEW
2.1 History of Artificial Intelligence
The field of Artificial Intelligence (AI) was coined by John McCarthy and two senior scientists: Claude Shannon and Nathan Rochester in 1956 at a conference, Dartmouth Conference, the first conference devoted to the subject (Buchanan, 2006).
They proposed that every aspect of learning or any other feature of intelligence can be precisely described, and a machine can be made to simulate it. According to Luger (2009), artificial intelligence (AI) is the science to enable the machine to accomplish the thing that requires intelligence. It promotes the use of a computer to do reasoning. AI can refer to anything from a computer programme playing a game to pattern recognition, image recognition as well as text and speech recognition by utilizing AI technology. The field of AI reached its major advance in the year of 1980, with the emergence of Machine Learning (ML), an approach to achieve AI (Dietterich & Michalski, 1983). Machine learning works on the principle of using algorithms to learn from examples. The machine is trained to learn from a large amount of data by itself, without explicitly programmed.
However, conventional machine learning techniques were limited in their ability to process natural data in their raw form (Luppescu & Romero, n.d.). The weakness had led to the blossom of Deep Learning (DL) in the 2000’s. DL is a set of methods that allow a machine to be fed with raw data and automatically discover the representations needed for detection and classification (Lecun, Bengio, & Hinton, 2015). DL architectures comprised of multiple processing layers to extract a useful representation of data. For example, in image processing, raw images are fed into the learning model. Initial layer extract edge detection, the second layer detects the motif, the third layer may assemble motifs into larger combinations that correspond to parts of familiar objects, and
7
subsequent layers would detect objects as combinations of these parts. In short, AI is any techniques that enable computers to replicate human intelligence. ML is the subset of AI involving the field of study that gives computers the ability to learn without explicitly programmed. Improvement of ML leads to DL, which allow a model to learn the representations of data with multiple levels of abstraction. Figure 2.1 shows the relationship between AI, ML, and DL.
Figure 2.1: Relationship between AI, ML, and DL
2.2 Machine Learning
According to Negnevitsky (2002), machine learning consists of adaptive mechanisms that enable the computer to learn from experiences or data exposed to it. The knowledge is improved by keeping making adjustment and correction on the error signal generated. The most famous machine learning mechanism include artificial neural networks and genetic algorithms. Machine learning is a field of study that focuses on computer systems that can learn from data. These systems are called models. Models can learn to perform a specific task by analysing lots of examples for a particular problem.
There are three modes to conduct machine learning, including supervised learning unsupervised learning and reinforcement learning.
2.2.1 Supervised Learning
A machine learning process whereby the model is to predict the provided output or target variables. In other words, all the target variables are labelled. Classification and regression are examples of the supervised learning problem, which the output variables are either categorical or numeric. Under supervised learning, machine learning can further be divided into classification task and regression task.
AI - Human intelligence exhibited by machines ML - An approach to achieve AI
DL - A technique for implementing ML