DRIVER BEHAVIOUR MONITORING AND ALERT SYSTEM BY USING MACHINE LEARNING
Nur Arina Hazwani Samsun Zaini i, Fauzun Abdullah Asuhaimi ii, Liyana Ramli ii & Khairul Nabilah Zainul Ariffin iii
i Student, Faculty of Engineering and Built Environment, Universiti Sains Islam Malaysia
ii Senior Lecturer, Faculty of Engineering and Built Environment, Universiti Sains Islam Malaysia
iii (Corresponding author). Senior Lecturer, Faculty of Engineering and Built Environment, Universiti Sains Islam Malaysia. [email protected]
Abstract
A person who experiences micro-sleep tends to lose concentration while driving, and this is one of the reasons for the number of road accidents. This system was designed to detect the symptom of micro-sleep, which is the drowsiness of the driver, by focusing on the driver’s eyes and mouth. The system uses a machine learning algorithm, the MATLAB simulator, and a camera.
The camera is pointed directly at the driver’s face, and when it captures features such as drooping eyelids and yawning, an alarm is set off automatically to alert the driver and indicate the driver’s condition.
Keywords: Driver behaviour, Machine Learning, Drowsiness
INTRODUCTION
Nowadays, machine learning is widely used for data analytics and decision making. Machine Learning (ML) is a subdivision of Artificial Intelligence (AI) that has the potential to learn autonomously, without complex training, based on the interpretation and analysis of a defined data collection. There are three types of machine learning: supervised learning, unsupervised learning, and reinforcement learning. Deep learning (DL) is a subdivision of machine learning.
The algorithm was inspired by how humans learn from experience, so it can improve its output by repeating the learning process. In this project, a supervised learning method, the Support Vector Machine (SVM) classifier with a linear kernel, was implemented to classify the behaviour of the driver as either drowsy or non-drowsy.
The Support Vector Machine is a learning technique used to solve regression and classification problems in various fields. It can provide accurate results (Apostolidis-afentoulis, 2015). When the data is linearly separable, that is, when it can be separated by a single line, the linear kernel is used. It is one of the most commonly used kernels and is often chosen when a data set has a large number of features. An SVM with a linear kernel is also easier to train than one with any other kernel (Prateek Bajaj, 2018). Figure 1 shows the subdivision of Artificial Intelligence (Kanth, 2018).
Figure 1: Subdivision of Artificial Intelligence (“Big data architectures - Azure Architecture Center | Microsoft Docs,” 2020).
Micro-sleep is a brief, sudden-onset episode of sleep that lasts up to 30 seconds. Its causes include sleep deprivation, mental fatigue, sleep apnea, hypoxia, narcolepsy, and hypersomnia, which are associated with excessive daytime sleepiness and automatic behaviour (Segen, 2006). Micro-sleep is one of the biggest factors leading to fatal road accidents. The latest microsleep-related fatal incident involved Dnars Skincare's founder, Faziani Rohban Ahmad, and her husband, Ahmad Shah Rizal Ibrahim, who was driving the car and is believed to have been drowsy while travelling from their house in Tumpat, Kelantan, to Pattani, Thailand. Datuk Dr. S. Ravih, a nephrology and internal medicine consultant at Putra Specialist Hospital in Melaka, said that microsleep occurs without warning and is one of the frequent causes of road accidents (Ahmad Erwan Othman, 2019). The Star Online reported that Malaysia has the third highest death rate from road traffic accidents in Asia and ASEAN, after Thailand and Vietnam (LUM, 2019). Drivers therefore need to stay highly aware of their surroundings and focus on their driving to reduce the risk of road accidents.
When drivers are alone in their car, they need to be monitored and alerted.
Artificial Intelligence: Any technique that enables computers to mimic human intelligence. It includes machine learning.
Machine Learning: A subset of AI that includes techniques that enable machines to improve at tasks with experience. It includes deep learning.
Deep Learning: A subset of machine learning based on neural networks that permit a machine to train itself to perform a task.
Furthermore, previous driver behaviour monitoring and alert systems take a long time to analyse the monitoring data, and their accuracy is not high enough for them to be considered reliable for monitoring driver behaviour. They also do not provide real-time monitoring, which would reduce the time taken to analyse the data. When the analysis takes too long, it is too late to prevent the micro-sleep from happening. For this reason, a driver behaviour monitoring and alert system simulation that provides real-time monitoring and visualization of data is needed.
Lastly, the ability of machine learning algorithms to identify patterns and trends can help ensure the accuracy of a detection system. This is because the algorithm gains experience with practice, so the performance and quality of the system continue to improve (Data Flair, 2018). Implementing machine learning, in this case the SVM classifier with a linear kernel, can therefore improve the accuracy and efficiency of driver behaviour detection.
METHODOLOGY
This part discusses the proposed methodology for a system that provides real-time drowsiness detection based on the user’s eyes and mouth. The system was simulated using the MATLAB simulator. Firstly, the camera acquires the video of the driver, and the video is divided into distinct frames.
Face detection and skin segmentation are then performed on these frames. After that, the frames undergo eye monitoring and yawning detection.
The results of the eye monitoring and yawning detection are fed to the SVM classifier with a linear kernel to determine whether the driver is in a drowsy or non-drowsy condition. Further details of each process are discussed in the following sections.
Figure 2 shows the overall workflow of this project in its development process.
Figure 2: The flowchart of the system.
[Flowchart: Start; camera acquires the video image of the driver; face detection using Viola-Jones algorithm; skin segmentation (converting the RGB image into the YCbCr domain); Sobel edge detection (split eyes to right and left) and mouth region segmentation using K-means; SVM with linear kernel; drowsy or non-drowsy decision with the corresponding alarm state; End. The detection stages have success and fail branches.]
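The final stage of the workflow can be illustrated with a minimal sketch of the decision rule of a linear-kernel SVM, sign(w·x + b). The two features (an eye-closure score and a mouth-opening score, each scaled to [0, 1]), the weights `W`, and the bias `B` are hypothetical values chosen for illustration only; in practice they would be learned from labelled frames, and the project itself was simulated in MATLAB rather than Python.

```python
# Hypothetical weights and bias of an already-trained linear SVM.
# They are illustrative assumptions, not values from the project.
W = (2.0, 2.0)
B = -2.0


def decision_value(features, w=W, b=B):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, features)) + b


def classify(features):
    """Map the sign of the decision value to a driver state."""
    return "drowsy" if decision_value(features) > 0 else "non-drowsy"


def hinge_loss(features, label):
    """Hinge loss max(0, 1 - y * f(x)) for a label y in {-1, +1}."""
    return max(0.0, 1.0 - label * decision_value(features))


# Features: (eye-closure score, mouth-opening score), both hypothetical.
print(classify((0.9, 0.8)))  # eyes mostly closed, mouth wide open
print(classify((0.1, 0.1)))  # eyes open, mouth closed
```

A hinge loss of zero means the sample lies on the correct side of the margin, which is the property that SVM training tries to achieve for as many samples as possible.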
FACE DETECTION AND SKIN SEGMENTATION
The purpose of this stage is to detect the face and mouth accurately. The first step is to capture the face with a front-facing camera.
Figure 3: Face detection and skin segmentation process.
As shown in Figure 3, the Viola-Jones algorithm was used to detect the face of the user or driver (Viola & Jones, 2004). After the face is successfully detected, the system proceeds to skin segmentation of the face. This was done by converting the RGB image into the YCbCr domain, which makes it possible to detect the skin region of the user’s face and remove the parts of the image that do not belong to it (Manu, 2017). RGB and YCbCr are both colour spaces. The RGB colour space represents a colour by its red, green, and blue components. In YCbCr, Y is the luma component, that is, the brightness or intensity of the light, to which the human eye is more sensitive. Cb and Cr are the blue-difference and red-difference chroma components (Dias, 2017). Skin colour remains clustered in the chrominance plane regardless of a person's skin tone (Manu, 2017). Figure 4 shows an illustration of face detection and skin segmentation.
Figure 4: Illustration of face detection and skin segmentation.
[Process flow: frames of image in RGB; face detection using Viola-Jones algorithm; convert image from RGB to YCbCr domain; detect the skin region of the user.]
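The colour-space conversion behind the skin segmentation step can be sketched as follows. This is only an illustration under stated assumptions: it uses the full-range BT.601 (JPEG) form of the RGB to YCbCr conversion, which differs in scaling from the conversion MATLAB applies, and the Cb and Cr skin ranges are commonly quoted values that are not taken from this project. The image is assumed to be a nested list of (R, G, B) tuples with values from 0 to 255.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JPEG) conversion for 8-bit RGB values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_skin(r, g, b, cb_range=(77, 127), cr_range=(133, 173)):
    """Classify one pixel as skin using chroma only (assumed thresholds)."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return (cb_range[0] <= cb <= cb_range[1]
            and cr_range[0] <= cr <= cr_range[1])


def skin_mask(image):
    """Return a boolean mask with True where a pixel looks like skin."""
    return [[is_skin(*pixel) for pixel in row] for row in image]


# A skin-toned pixel next to a pure blue pixel.
print(skin_mask([[(220, 170, 140), (0, 0, 255)]]))
```

Because the test ignores the luma value Y, it depends on chroma rather than brightness, which is the reason given above for segmenting skin in the YCbCr domain.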
EYE DETECTION
Figure 5 shows the eye detection process flow. The eye positions were also determined using the Viola-Jones algorithm. The eyes were then split into left eye and right eye using Sobel edge detection, and finally the pupil was identified. If the eyes are detected as open, the state is considered non-drowsy and the alarm stays off. If the eyes are closed for 3 seconds, the state is recognized as drowsy and the alarm turns on. The Sobel detector filters the image in both the horizontal and vertical directions with a small, separable, integer-valued kernel (Manu, 2017).
Figure 5: Eye detection process flow.
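The Sobel operator used in this stage can be sketched as below. The sketch applies the standard 3x3 horizontal and vertical Sobel kernels to a greyscale image stored as a nested list and returns the gradient magnitude; the image representation and the choice to leave border pixels at zero are assumptions made for brevity, not details taken from the project.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
GX = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
GY = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(img):
    """Gradient magnitude of a greyscale image (borders left at 0)."""
    height, width = len(img), len(img[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = img[y + dy - 1][x + dx - 1]
                    gx += GX[dy][dx] * pixel
                    gy += GY[dy][dx] * pixel
            out[y][x] = math.sqrt(gx * gx + gy * gy)
    return out


# A vertical step edge: dark on the left, bright on the right.
step = [[0, 0, 10, 10] for _ in range(4)]
for row in sobel_magnitude(step):
    print(row)
```

Strong responses mark intensity boundaries such as the edge between the eyelid and the eye, which is why an edge map helps to locate the exact boundaries of each eye.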
Next, the condition of the eyes was determined in each frame using the correlation coefficient template matching method. To detect the eyes precisely and obtain their exact boundaries, the Sobel edge detection method was also used. It scans from the left side to the right side to find the eyes of the user, so that each eye can be detected separately. After the eyes are detected, they are segmented from the image to generate an eye template of the user. This approach reduces the influence of light reflections. Figure 6 shows the eye template generation process (Manu, 2017).
Figure 6: Eye template generation process (Manu, 2017).
[Process flow: eye detection using Viola-Jones algorithm; Sobel edge detector; eyes split to right and left; identification of the pupil.]
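Correlation coefficient template matching can be illustrated with the short sketch below. It computes the Pearson correlation coefficient between an image patch and a template of the same size, and then labels the eye by whichever template correlates more strongly. The use of separate open-eye and closed-eye templates, and the function names, are assumptions made for illustration; the source does not specify how the templates are compared.

```python
import math


def correlation_coefficient(patch, template):
    """Pearson correlation between two equally sized greyscale patches."""
    a = [value for row in patch for value in row]
    b = [value for row in template for value in row]
    if len(a) != len(b):
        raise ValueError("patch and template must have the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [value - mean_a for value in a]
    db = [value - mean_b for value in b]
    numerator = sum(x * y for x, y in zip(da, db))
    denominator = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    # A flat patch has no variation, so the correlation is undefined.
    return numerator / denominator if denominator else 0.0


def eye_state(patch, open_template, closed_template):
    """Pick the state whose (hypothetical) template matches best."""
    r_open = correlation_coefficient(patch, open_template)
    r_closed = correlation_coefficient(patch, closed_template)
    return "open" if r_open >= r_closed else "closed"


open_t = [[10, 200], [200, 10]]
closed_t = [[200, 200], [10, 10]]
print(eye_state([[30, 220], [220, 30]], open_t, closed_t))
```

Subtracting the mean before correlating makes the score insensitive to a uniform change in brightness, which is consistent with the reduced influence of lighting noted above.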
YAWNING DETECTION
Yawning tends to be related to sleepiness or drowsiness. Nearly anything that induces drowsiness, ranging from sleep to some drugs, can contribute to yawning (“Yawning: Symptoms, Signs, Causes & Treatment,” n.d.). Figure 7 below shows the yawning detection process flow.
Figure 7: Yawning detection process flow.
The mouth area was also identified using the Viola-Jones algorithm. The mouth region is then segmented using K-means clustering and tracked using correlation coefficient template matching. If the mouth is closed, it is taken as a sign of the non-drowsy state.
If the mouth is open, as in a yawn, it is recognized as the drowsy condition (Manu, 2017).
K-means clustering is a partitioning method. The K-means function partitions the data into k mutually exclusive clusters and returns the index of the cluster to which each observation is assigned. K-means treats each data point as an object with a position in space. It seeks a partition in which the objects inside each cluster are as near as possible to each other and as far as possible from the objects in other clusters (“k-Means Clustering - MATLAB & Simulink,” n.d.). The objective of K-means clustering here is to minimize the distance between the pixels and the centre of their class. The function to be minimized is shown in equations (1) and (2).
Equation (1)
Equation (2)
In equations (1) and (2), the terms denote a pixel, the centre of class j, and the pixels belonging to class j. Image pixels are classified according to their brightness intensity. In this project, the templates were generated with K = 2 (Manu, 2017).
1) MOUTH DETECTION USING VIOLA-JONES ALGORITHM
2) MOUTH REGION SEGMENTATION USING K-MEANS CLUSTERING
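The segmentation of pixels by brightness with K = 2 can be sketched as follows. This is a minimal one-dimensional K-means on intensity values that minimizes the standard within-cluster sum of squared distances; the deterministic initialisation (centres spread between the minimum and maximum intensity) is an assumption made to keep the example reproducible and is not the initialisation used by the MATLAB function. The intensity values are invented for illustration.

```python
def nearest(value, centres):
    """Index of the centre closest to the given intensity value."""
    return min(range(len(centres)), key=lambda j: (value - centres[j]) ** 2)


def kmeans_1d(values, k=2, max_iter=100):
    """Cluster intensities into k >= 2 groups; returns (labels, centres)."""
    low, high = min(values), max(values)
    # Assumed initialisation: centres spread evenly over the value range.
    centres = [low + (high - low) * j / (k - 1) for j in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        labels = [nearest(v, centres) for v in values]
        new_centres = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # Keep the old centre if a cluster ends up empty.
            new_centres.append(sum(members) / len(members) if members else centres[j])
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres


def within_cluster_ss(values, labels, centres):
    """Sum of squared distances between each value and its centre."""
    return sum((v - centres[lab]) ** 2 for v, lab in zip(values, labels))


# Illustrative intensities: dark pixels (open mouth) versus bright pixels.
pixels = [10, 12, 11, 200, 210, 205]
labels, centres = kmeans_1d(pixels)
print(labels, centres, within_cluster_ss(pixels, labels, centres))
```

With K = 2 the pixels split into a dark group and a bright group, so the size of the dark group gives a rough indication of how far the mouth is open.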
RESULTS AND DISCUSSION
The final results of this project and their discussion are as follows:
DROWSY AND NON-DROWSY DETECTION
Figure 8 below shows the result for the non-drowsy state. When both eyes or one of the eyes of the driver are open and the mouth of the driver is closed, the condition is considered non-fatigue (non-drowsy), and the alarm is off. The first box displays the real-time face detection of the user and the state of the user in green text. The second box displays the real-time location of the face and mouth area combined with the face segmentation.
The third and fourth boxes display the right eye and left eye of the user respectively, together with the state of each eye, either closed or open. Lastly, the fifth box displays the mouth area and shows the real-time process of detecting yawning using the K-means clustering function. These panels were arranged using the subplot function in MATLAB R2015a.
Figure 8: Result for non-drowsy state.
The result for the detection of the drowsy state is shown in Figure 9. The explanation for each subplot is the same as for the non-drowsy state result. When the eyes of the driver are closed and the mouth of the driver is open, the condition is considered fatigue (drowsy). If the user remains in the drowsy state for more than 3 seconds, the alarm turns on and alerts the driver with a beep sound. The beep stops once the user is classified as non-drowsy. For the drowsy state, the text “fatigue” is displayed in red.
Figure 9: Result for drowsy state.
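The alarm rule described above, where the beep starts only after the drowsy state has lasted more than 3 seconds and stops as soon as the state is non-drowsy, can be sketched as a small timer. The class name, the per-frame `update` interface, and the use of timestamps in seconds are assumptions for illustration rather than the structure of the MATLAB simulation.

```python
class DrowsinessAlarm:
    """Turns on once the drowsy state has lasted longer than hold_seconds."""

    def __init__(self, hold_seconds=3.0):
        self.hold_seconds = hold_seconds
        self._drowsy_since = None  # time at which the current drowsy run began

    def update(self, is_drowsy, timestamp):
        """Process one frame; return True if the alarm should sound."""
        if not is_drowsy:
            # Any non-drowsy frame resets the timer and silences the alarm.
            self._drowsy_since = None
            return False
        if self._drowsy_since is None:
            self._drowsy_since = timestamp
        return timestamp - self._drowsy_since > self.hold_seconds


alarm = DrowsinessAlarm()
for t, drowsy in [(0.0, True), (2.0, True), (4.0, True), (5.0, False)]:
    print(t, alarm.update(drowsy, t))
```

Requiring a sustained drowsy state avoids triggering the alarm on a normal blink or a single misclassified frame.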
RESULTS FOR EACH PROCESS IN THE DESIGNED SYSTEM.
Table 1: Results for each step in the drowsiness detection process.
STEP | RESULT | PROCESS
1 | Figure 10: Face detection. | Face detection using Viola-Jones algorithm.
2 | Figure 11: Skin segmentation. | Skin segmentation by changing the RGB images to the YCbCr domain.
3 | Figure 12: Eyes and mouth detection. | Eyes and mouth detection using Viola-Jones algorithm.
4 | Figure 13: Sobel edge detection. | Sobel edge detection to split the eyes into right and left eye.
5 | Figure 14: Mouth region segmentation. | Mouth region segmentation using K-means clustering.
In this project, a binary Support Vector Machine classifier with a linear kernel was used to classify the non-drowsy state and the drowsy state. The first step in detecting the drowsiness of the user, face detection, was successful, as shown in Figure 10. The face region is marked with a red rectangle.
Next, the designed system successfully detected the eye and mouth regions using the Viola-Jones algorithm, as shown in Figure 12. The eye and mouth regions are marked with yellow rectangles. Figure 11 in Table 1 shows the result of the skin segmentation. The figure shows that the segmentation process removed the non-face parts of the image and detected the whole face area of the user.
As mentioned in the previous section, the Sobel edge method was used to detect the eye area, and the same method was then used to split the eyes into right and left eye, as shown in Figure 13. The result of the mouth region detection and segmentation is shown in Figure 14. The system was simulated under four conditions: dim lighting, bright lighting, long distance (60 cm), and short distance (30 cm). Table 2 below shows the test results under these conditions.
The accuracy was calculated as the percentage of tests without a false identification: accuracy (%) = (number of tests - number of false identifications) / number of tests x 100.
Table 2: Test results for different conditions with their accuracy.
CONDITION | NO. OF TESTS | FALSE IDENTIFICATIONS | ACCURACY
DIM LIGHT AREA | 100 | 40 | 60.0 %
BRIGHT AREA | 100 | 20 | 80.0 %
LONG DISTANCE (60 cm) | 100 | 70 | 30.0 %
SHORT DISTANCE (30 cm) | 100 | 30 | 70.0 %
For the simulations in the dim light area and the bright area, the webcam was placed 30 cm away from the user. From Table 2, the highest accuracy, 80.0%, was obtained by running the simulation in a bright area with the webcam placed 30 cm away from the user. The simulation in the dim light area gave an accuracy of 60.0%, and the long-distance simulation, in which the webcam was placed 60 cm away from the user, gave an accuracy of 30.0%.
The short-distance simulation, in which the webcam was placed 30 cm away from the user, gave an accuracy of 70.0%. Figure 15 shows a bar chart of the percentage accuracy of the system under the different conditions in which the simulations were carried out.
Figure 15: Bar Chart of Percentage Accuracy versus Condition.
[Bar chart: x-axis, condition (dim light area, bright area, long distance, short distance); y-axis, percentage accuracy (%), scaled from 0 to 90.]
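The accuracy values reported for the four test conditions can be reproduced with the short calculation below. The counts are those listed in the test results table; the formula, correct identifications divided by the number of tests, is inferred from those counts and is stated here as an assumption.

```python
# (number of tests, number of false identifications) per condition.
RESULTS = {
    "dim light area": (100, 40),
    "bright area": (100, 20),
    "long distance (60 cm)": (100, 70),
    "short distance (30 cm)": (100, 30),
}


def accuracy_percent(tests, false_identifications):
    """Percentage of tests that were identified correctly."""
    return 100.0 * (tests - false_identifications) / tests


accuracies = {name: accuracy_percent(*counts) for name, counts in RESULTS.items()}
average = sum(accuracies.values()) / len(accuracies)

for name, value in accuracies.items():
    print(f"{name}: {value:.1f} %")
print(f"average: {average:.1f} %")
```

The average over the four conditions comes to 60.0 %, and the best single condition is the bright area at 80.0 %.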
CONCLUSION
From this paper, we can conclude that implementing a machine learning approach in the driver behaviour monitoring and alert (DBMA) system produces a system that is beneficial to society, and especially to drivers themselves. This project will be useful in preventing and reducing the number of road accidents caused by micro-sleep. It is also convenient for the user, as it provides real-time monitoring of driver behaviour throughout the journey by displaying the real-time drowsiness detection of the driver. The average accuracy of the driver behaviour monitoring and alert system is 60.0% and its highest accuracy is 80.0%. To increase the accuracy of the system, it should be used in a bright area with a distance of 30 cm between the webcam and the user.
Several improvements can be made in the future. The first is to include a function that calculates and displays the number of the user’s blinks over time. From that, we can determine the blinking frequency of the user and the time at which the user blinks the most. Next, we can improve the system by developing a data dashboard that displays the real-time data collected from the system. A data dashboard is an information management tool that visually tracks, analyses, and displays key performance indicators (KPIs), statistics, or key data points to monitor the safety of a company, organization, or particular process. Lastly, we can use a higher-resolution webcam to acquire better video or still images. A clearer image can increase the accuracy of the drowsiness detection and of the classification of the user’s behaviour.
REFERENCES
Ahmad Erwan Othman. (2019). BERNAMA.com - A second of microsleep can spell disaster. Retrieved from http://www.bernama.com/en/news.php?id=1730534
Apostolidis-afentoulis, V., & Lioufi, K.-I. (2015). SVM Classification with Linear and RBF kernels. ResearchGate, (July), 0–7. https://doi.org/10.13140/RG.2.1.3351.4083
Big data architectures - Azure Architecture Center | Microsoft Docs. (2020). Retrieved September 30, 2020, from https://docs.microsoft.com/en-us/azure/architecture/data-guide/big-data/ai-overview
Data Flair. (2018). Advantages and Disadvantages of Machine Learning Language - DataFlair. Retrieved August 16, 2020, from Machine Learning Tutorials website: https://data-flair.training/blogs/advantages-and-disadvantages-of-machine-learning/
Dias, D. (2017). What is YCbCr ? (Color Spaces) - Break the Loop - Medium. Retrieved August 16, 2020, from Medium website: https://medium.com/breaktheloop/what-is-ycbcr-964fde85eeb3
k-Means Clustering - MATLAB & Simulink. (n.d.). Retrieved August 23, 2020, from https://www.mathworks.com/help/stats/k-means-clustering.html
Kanth, U. (2018). The Difference Between Artificial Intelligence, Machine Learning, and Deep Learning. Retrieved August 16, 2020, from Medium website: https://datacatchup.com/artificial-intelligence-machine-learning-and-deep-learning/
LUM, D. M. (2019). We have the third highest death rate from road accidents | The Star Online. Retrieved from https://www.thestar.com.my/lifestyle/health/2019/05/14/we-have-the-third-highest-death-rate-from-road-accidents
Manu, B. N. (2017). Facial features monitoring for real time drowsiness detection. Proceedings of the 2016 12th International Conference on Innovations in Information Technology, IIT 2016, (March). https://doi.org/10.1109/INNOVATIONS.2016.7880030
Prateek Bajaj. (2018, June 20). Creating linear kernel SVM in Python - GeeksforGeeks. Retrieved September 30, 2020, from https://www.geeksforgeeks.org/creating-linear-kernel-svm-in-python/
Segen, J. (2006). Concise Dictionary of Modern Medicine. In The McGraw-Hill Companies, Inc. https://doi.org/1850703213
Viola, P., & Jones, M. (2004). Robust Real-Time Face Detection. International Journal of Computer Vision, 57(2), 137–154.
Yawning: Symptoms, Signs, Causes & Treatment. (n.d.). Retrieved August 17, 2020, from https://www.medicinenet.com/yawning/symptoms.htm