International Journal of Technology Management and Information System (IJTMIS) eISSN: 2710-6268 [Vol. 4 No. 4 December 2022]
Journal website: http://myjms.mohe.gov.my/index.php/ijtmis
A PROTOTYPE OF TRAFFIC LIGHT COLOUR DETECTION USING CONVOLUTIONAL NEURAL NETWORK (CNN) ALGORITHM
Gloria Jennis Tan1*, Ahmad Farid Tahfizudin Mahadi2, Tan Chi Wee3, Ngo Kea Leng4, Zeti Darleena Eri5 and Ung Ling Ling6
1 2 5 6 Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA Cawangan Terengganu,
Kuala Terengganu, MALAYSIA
3 Faculty of Computing and Information Technology, Tunku Abdul Rahman University College, Kuala Lumpur, MALAYSIA
4 Academy of Language Studies, Universiti Teknologi MARA, Cawangan Terengganu, Kuala Terengganu, MALAYSIA
5 Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA Cawangan Sabah, Kota Kinabalu, MALAYSIA
*Corresponding author: [email protected]
Article Information:
Article history:
Received date: 11 November 2022
Revised date: 16 December 2022
Accepted date: 3 December 2022
Published date: 15 December 2022
To cite this document:
Tan, G. J., Mahadi, A. F. T., Tan, C. W., Ngo, K. L., Eri, Z. D., & Ung, L. L. (2022). A prototype of traffic light colour detection using Convolutional Neural Network (CNN) algorithm. International Journal of Technology Management and Information System, 4(4), 1-14.
Abstract: Autonomous driving cars have become a trend in the vehicle industry, and numerous driver assistance systems (DAS) have been introduced to support these automated cars. Among DAS functions, traffic light colour detection plays a significant role, and it can also help colourblind individuals recognize the traffic light colour, bringing considerable benefit to the community. This project proposed a method to detect traffic light colour using the Convolutional Neural Network (CNN) algorithm. The dataset images were self-collected around the Kuala Terengganu and Kuala Berang areas. The dataset underwent a labelling process that defined the coordinates and bounding boxes of each traffic light in the images. The researchers then trained a detection model on the dataset using the CNN algorithm. Next, the training and testing performance were evaluated. The dataset was split into two parts: 80% for the training dataset and 20% for the testing dataset. The detection model was evaluated using the confusion matrix technique, which yields accuracy, precision, recall, F1-score, and mean average precision (mAP). In this project, the highest accuracy of the algorithm's model is 93.47%. The classification of the traffic light colour was done by counting the total number of red, yellow, or green pixels in the bounding box of the detection.
Keywords: Driver Assistance Systems (DAS), Traffic Light Colour Detection, Assistive Technology, Colourblind, Vehicle Industry.
1. Introduction
The automotive industry is expanding on a large scale due to developments in driver assistance system technology. The development of driver assistance systems needs traffic light colour detection because, when the system can identify the traffic light colour, it can react accordingly, such as braking automatically when the traffic light is red. As stated in a Stanford Law School study, at least 90% of all road traffic accidents are caused entirely or partially by human error (Smith, 2013). This shows that human error leads to an increased number of road accidents. Driver assistance systems can therefore help reduce the number of road accidents because they can eliminate human errors that might happen during driving. Most of these systems rely on algorithms because an algorithm gives the computer a specific set of instructions that allows it to do exactly what the developer needs. Algorithms are widely used to perform computation, data processing, image processing, and other tasks, and each is designed to solve a specific problem. In this project, the Convolutional Neural Network (CNN) algorithm is used to classify the traffic light colour.
The Convolutional Neural Network (CNN) is a supervised deep learning algorithm that can be used for image recognition, object detection, and recommender systems. CNN is commonly used for image processing because it can detect and categorise specific characteristics in pictures (Gurucharan, 2020).
The next problem concerns people who lack colour vision (Al-Nabulsi et al., 2017). Colourblind people are unable to distinguish between various colours. There are a few types of colour vision deficiency, such as Protanopia (red-green deficiency), Tritanopia (blue-yellow deficiency), and Monochromacy, in which everything is seen only in shades of grey. Individuals with these conditions face problems when identifying the colours of a traffic light. Thus, this project could help them recognize the traffic light colours correctly.
The next problem arises from the automotive industry's growing use of technology to develop driver assistance systems, such as self-driving cars or cars that respond to external conditions, for example by braking automatically when the driver does not notice that the traffic light is red (Kulkarn et al., 2018).
Automotive companies need to implement a traffic light colour detection system in their vehicles so that the driver assistance system can detect traffic light colours. Driver assistance systems that respond to internal and external safety conditions are meant to eliminate human errors when driving different types of vehicles. They also employ modern technology to guide drivers while driving and can improve driver performance and safety. Hence, this project can advance automotive technology that requires traffic light colour detection and may reduce the human errors that lead to road accidents.
This project focuses on traffic light colour detection using the Convolutional Neural Network (CNN) algorithm. The main concern is that a traffic light colour detection system plays an important role in recognizing the traffic light colour, both for the automotive industry and for colourblind people.
2. Literature Review
Traffic light colour detection is useful in analysing current traffic conditions on the road, which could assist road users in navigating safely. Developing traffic light colour detection could help colourblind people recognize traffic light colours correctly (Al-Nabulsi et al., 2017b). In addition, given the advanced technology in the automotive industry, it also supports the development of driver assistance systems (Sheikh et al., 2017). For example, a car could engage the auto brake system when the traffic light turns red, which greatly helps to increase driver safety. The CNN algorithm is used to detect the traffic light colours in this project. Figure 1 shows an example of traffic light colour detection.
Figure 1: Example of Traffic Light Detection
2.1 Similar Works
The first similar work (Yeh et al., 2021), titled Traffic Light and Arrow Signal Recognition Based on a Unified Network, uses the CNN-based You Only Look Once (YOLO) algorithm to recognize traffic lights. The problem addressed by this study is that detection accuracy is poor due to several disturbing elements in outdoor scenes, such as imperfect light shapes, dark lighting conditions, and occlusion. The study reported a mean average precision (mAP) score of 97% for traffic light detection.
The second similar work, by Ouyang et al. (2020), titled Deep CNN-Based Real-Time Traffic Light Detector for Self-Driving Vehicles, combines a Convolutional Neural Network (CNN) classifier model with a heuristic Region of Interest (ROI) to detect and identify all possible traffic lights for an autonomous vehicle platform. The problem addressed by this study is that camera position, changes in external conditions, and object distance have a major impact on traditional computer vision-based traffic light detectors. The study reported a detection accuracy in the range of 95%-98% at distances of 0-10 metres; the accuracy decreases as the distance increases.
The third similar work, by Gupta and Choudhary (2019), titled A Framework for Traffic Light Detection and Recognition using Deep Learning and Grassmann Manifolds, uses a Faster R-CNN object detection algorithm to detect and recognize traffic lights with a real-time, camera-based framework. The problem addressed by this study is that, because of the tiny size and colours (red, yellow, and green) of traffic lights, they can look identical to the surroundings, streetlights, and other environmental elements, which makes the autonomous detection and recognition of traffic lights difficult. The study reported a highest detection accuracy of 98.8% and a classification accuracy of 97.8%.
The fourth similar work, by Ghennioui et al. (2019), titled Real-Time Traffic Light Detection and Classification using Deep Learning, uses the Faster Region-based Convolutional Network (Faster R-CNN), Region-based Fully Convolutional Network (R-FCN), and Single Shot MultiBox Detector (SSD) algorithms to improve traffic light detection and classification accuracy while reducing detection and recognition time. The study encountered various challenges that made traffic light detection and recognition (TLDR) a difficult task: the very small size of traffic lights, viewpoint distances, environmental conditions, and confusion with other elements in the external environment such as streetlights and house exterior lights. The study reported that Faster R-CNN gave the best mean average precision (mAP) score, 76.3% on the test set, compared to the other models, but its test time was longer than that of R-FCN.
Lastly, the fifth similar work, by Kulkarn et al. (2018), titled Traffic Light Detection and Recognition for Self-Driving Cars Using Deep Learning, uses a Faster Region-based Convolutional Network (Faster R-CNN) to detect and recognize traffic lights with a deep neural network-based model. The problem addressed by this study is that autonomous vehicles face several difficulties, such as detecting traffic lights, signs, lane lines, people, cyclists, and other road users, which might confuse the system and lower its accuracy. The study reported that the highest accuracy of arrow detection and colour recognition was up to 99%.
3. Implementation
The development phase is the most important part of this system because it executes all the processing of the input in order to generate output with high accuracy. This project uses the Python programming language to design and implement the techniques involved, namely the feature extraction and classification tasks of the Convolutional Neural Network (CNN). The training data and input data are processed in this phase.
3.1 Proposed System
The system architecture of the Traffic Light Colour Detection System consists of the user interface, image pre-processing, and feature extraction and detection by Convolutional Neural Networks (CNN) as shown in Figure 2.
Figure 2: System Architecture
Based on Figure 2, the raw dataset is first cleaned in the pre-processing step so that only high-quality images are kept, since blurred images would lower the accuracy of the model. The images then undergo a labelling process to create a bounding box and record the coordinates of each traffic light in the images. Next, the cleaned and labelled dataset is passed to the training phase, where it is used to train the Convolutional Neural Network model; the training and testing datasets are loaded into this CNN model. The model then performs feature extraction, in which the convolutional layers extract features from the traffic light images, and the detection layer predicts the coordinates and creates the bounding boxes of the traffic lights. Finally, the trained CNN model is saved for the detection task on images supplied by the user. When the user chooses an input image in the user interface, the image goes through a detection process using the saved CNN model to detect the traffic lights, and the system classifies each traffic light as a red, yellow, or green light. At the end of the classification task, the user receives the detection result, and the system displays in the user interface whether the detected traffic light is red, yellow, or green. An example of a bounding-box annotation produced by the labelling step is sketched below.
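The paper does not state the annotation tool or label format; the following is a minimal sketch, assuming a YOLO-style text annotation in which each labelled traffic light is stored as a class index followed by normalized centre coordinates and box size. The file names, image size, and class index are hypothetical.

```python
# Minimal sketch of converting a labelled bounding box (pixel coordinates)
# into a YOLO-style annotation line. The format and the example values are
# assumptions; the paper only states that coordinates and bounding boxes
# were defined for each traffic light during labelling.

def to_yolo_line(class_id, box, img_w, img_h):
    """box = (x_min, y_min, x_max, y_max) in pixels."""
    x_min, y_min, x_max, y_max = box
    x_centre = (x_min + x_max) / 2 / img_w
    y_centre = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_centre:.6f} {y_centre:.6f} {width:.6f} {height:.6f}"

# Example: one traffic light labelled in a hypothetical 1280x720 image.
print(to_yolo_line(0, (612, 105, 668, 240), 1280, 720))
```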
3.2 Proposed Approaches
The CNN algorithm is the technique proposed for this project; it is one of the most useful detection algorithms and is often applied in image processing projects. It is a supervised deep learning algorithm that can be applied to image recognition, object detection, recommender systems, and data analysis. This section discusses the definition and overview of the CNN algorithm, feature extraction, classification, the advantages and disadvantages of the CNN algorithm, and the implementation of the CNN algorithm for various problems.
Firstly, the process begins with the input image from the user, which is a traffic light image. The input image then undergoes the pre-processing phase, in which it is resized to decrease the total number of pixels and reduce the size of the image, as in the sketch below.
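The paper does not specify the resizing library or target dimensions; the following is a minimal sketch, assuming OpenCV and an arbitrary 416x416 target size as a hypothetical example.

```python
# Minimal pre-processing sketch: resize an input image to reduce its pixel count.
# The library (OpenCV) and the 416x416 target size are assumptions, not values
# stated in the paper.
import cv2

def preprocess(image_path, target_size=(416, 416)):
    image = cv2.imread(image_path)            # load the image as a BGR array
    if image is None:
        raise FileNotFoundError(image_path)
    resized = cv2.resize(image, target_size)  # shrink to the target resolution
    return resized

# Hypothetical usage with one of the dataset image names.
small = preprocess("79.jpg")
```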
The system then loads the CNN model produced by training and testing. For this project, 100 images are used for training and testing: 80% of the images for training and the remaining 20% for testing. The CNN algorithm is used to train the CNN model for traffic light detection. Next, the user's input image moves to the feature extraction phase, where the convolutional layers extract features from the image. The detection layer then calculates the coordinates of the traffic light and creates a bounding box at its predicted position, so the system recognizes the position of the traffic light. A minimal architectural sketch is given below.
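The paper does not specify the network architecture or framework, so the following Keras sketch only illustrates the idea: convolutional layers for feature extraction followed by a small head that regresses bounding-box coordinates. The layer sizes, input resolution, and output encoding are assumptions, not the authors' model.

```python
# Illustrative sketch only: convolutional feature extraction followed by a
# bounding-box regression head. All layer sizes and the output encoding are
# assumptions; the paper does not describe its exact architecture.
from tensorflow.keras import layers, models

def build_detector(input_shape=(416, 416, 3)):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        # Feature extraction: stacked convolution + pooling layers.
        layers.Conv2D(16, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        # Detection head: predict one box (x, y, w, h), normalized to [0, 1].
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(4, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

model = build_detector()
model.summary()
```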
The result of the traffic light detection is then used for traffic light colour classification. To classify the colour, the system counts the total number of red, yellow, and green pixels inside the detected bounding box. For example, if the total number of green pixels in the traffic light's bounding box is more than 1000, the traffic light is classified as green. Lastly, the result of the traffic light colour detection and classification is displayed in the user interface. A pixel-counting sketch is shown below.
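A minimal sketch of this pixel-counting rule follows, assuming the colours are isolated with HSV thresholds in OpenCV; the HSV ranges and the use of the 1000-pixel cut-off for every colour are assumptions based on the description above.

```python
# Sketch of colour classification by counting coloured pixels inside the
# detected bounding box. HSV ranges and the per-colour threshold are
# illustrative assumptions; only the "count pixels in the box" idea and the
# 1000-pixel example come from the paper.
import cv2
import numpy as np

HSV_RANGES = {  # assumed ranges, to be tuned for real footage
    "red":    [((0, 100, 100), (10, 255, 255)), ((170, 100, 100), (180, 255, 255))],
    "yellow": [((20, 100, 100), (35, 255, 255))],
    "green":  [((45, 100, 100), (90, 255, 255))],
}

def classify_colour(image_bgr, box, threshold=1000):
    """box = (x, y, w, h) of the detected traffic light."""
    x, y, w, h = box
    roi = cv2.cvtColor(image_bgr[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
    counts = {}
    for colour, ranges in HSV_RANGES.items():
        mask = np.zeros(roi.shape[:2], dtype=np.uint8)
        for lo, hi in ranges:
            mask |= cv2.inRange(roi, np.array(lo), np.array(hi))
        counts[colour] = int(cv2.countNonZero(mask))  # coloured pixels in the box
    best = max(counts, key=counts.get)
    return best if counts[best] > threshold else "unknown"
```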
3.3 Dataset
In developing this system, the dataset used is primary data because all the images were self-collected. The traffic light images were collected using a smartphone around the Kuala Terengganu and Kuala Berang areas. The dataset contains 120 images of traffic lights: 100 images are used for the training and testing phases to generate the weights and evaluate the performance of the Convolutional Neural Network model, and the remaining 20 images are used as input images for the detector to detect the traffic light colour. Table 1 illustrates samples of images in the training and testing dataset.
3.4 Design Phase
The Traffic Light Colour Detection System was developed using the Convolutional Neural Network model, and Tkinter was used to create the prototype interface. Tkinter is a graphical user interface (GUI) library for Python; combining Python with Tkinter helps to create a quick and simple GUI for a program. Figure 4 shows the user interface for the Traffic Light Colour Detection System using CNN. The interface has six buttons: Browse Image, Run Detection, Display Result, Sound, About, and Exit. The prototype interface also has two figures, which are used to display the input image and the output image. The accuracy and the result of the detection are shown in the text box at the bottom left of the interface.
Table 1: Sample Images in Training and Testing Dataset

Dataset Name (Training)   Images (Training)   Dataset Name (Testing)   Images (Testing)
79.jpg                    [sample image]      3.jpg                    [sample image]
27.jpg                    [sample image]      51.jpg                   [sample image]
11.jpg                    [sample image]      64.jpg                   [sample image]
Figure 4: Main Page of Prototype Interface
Figure 5 below shows the prototype interface after the user has completed the detection of the traffic light colour. The user clicks the "Run Detection" button to let the program detect the traffic light colour in the input image. The result of the detection is displayed after the user clicks the "Display Result" button: the output image from the detection is shown on the right side of the prototype interface, while the result and its accuracy are shown in the text box at the bottom left. In the figure below, the result of the detection for the input image is "Stop" because the traffic light is red, and the accuracy of the detection is 99%. If the user wants to use the sound feature, clicking the "Sound" button plays the spoken result of the detection. Lastly, to quit the program, the user clicks the "Exit" button. A minimal interface sketch follows Figure 5.
Figure 5: Prototype Interface with Result and Sound Feature
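A minimal Tkinter sketch of this kind of interface is given below. The button labels follow the description above; the widget layout and the callback bodies are assumptions, not the authors' code, and the detection step is only a placeholder.

```python
# Minimal Tkinter sketch of the prototype interface described above.
# Layout and callbacks are illustrative assumptions; only the button names
# come from the paper.
import tkinter as tk
from tkinter import filedialog

def browse_image():
    path = filedialog.askopenfilename(title="Choose a traffic light image")
    if path:
        state["image_path"] = path
        result_box.insert(tk.END, f"Loaded: {path}\n")

def run_detection():
    # Placeholder for the saved CNN model + pixel-counting classifier.
    state["result"] = "Stop (red light)"   # hypothetical output

def display_result():
    result_box.insert(tk.END, f"Result: {state.get('result', 'no detection yet')}\n")

state = {}
root = tk.Tk()
root.title("Traffic Light Colour Detection System")

for label, command in [
    ("Browse Image", browse_image),
    ("Run Detection", run_detection),
    ("Display Result", display_result),
    ("Sound", lambda: None),   # text-to-speech of the result would go here
    ("About", lambda: None),
    ("Exit", root.destroy),
]:
    tk.Button(root, text=label, command=command).pack(fill="x")

result_box = tk.Text(root, height=6, width=50)  # accuracy and result shown here
result_box.pack()
root.mainloop()
```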
3.5 Evaluation of Results
The methods used to evaluate the performance of the algorithm's model in this project are accuracy, precision, recall, F1-score, and mean average precision. A confusion matrix is used to visualize the performance of the model.
The dataset is split into two parts: the training dataset and the testing dataset. The train-test split is a method for training the model and measuring the algorithm's performance. The process involves splitting a dataset into two subgroups: the first subgroup, the training dataset, is used to construct the model, and the second subgroup, the testing dataset, is used as a validation set to validate the model's performance. In this project, 80% of the dataset is used for training and contains 80 images, while the remaining 20% is used for the testing dataset and contains 20 images. A simple split sketch is shown below.
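A minimal sketch of the 80/20 split is shown here, assuming scikit-learn's train_test_split over the list of image file names; the file-name pattern and the random seed are hypothetical.

```python
# Minimal 80/20 train-test split sketch over the 100 training/testing images.
# The file-name pattern and random seed are assumptions.
from sklearn.model_selection import train_test_split

image_names = [f"{i}.jpg" for i in range(1, 101)]       # 100 labelled images
train_files, test_files = train_test_split(
    image_names, test_size=0.20, random_state=42)        # 80 train / 20 test

print(len(train_files), len(test_files))                  # 80 20
```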
4. Results and Discussion
Table 2 and Table 3 below illustrate the confusion matrices of the training model's performance at 1000 and 2000 iterations. True Positive (TP) represents the number of positive samples correctly predicted as positive by the classifier. True Negative (TN) represents the number of predictions where the classifier correctly identified the negative class as negative. False Positive (FP) is the number of predictions where the classifier falsely predicted the negative class as positive. Lastly, False Negative (FN) is the number of predictions where the classifier wrongly predicted the positive class as negative.
Table 2: Confusion Matrix for 1000 Iterations (n = 46)

                      Actual Positive   Actual Negative
Predicted Positive    TP = 26           FP = 2
Predicted Negative    FN = 1            TN = 17

Precision: 93%   Recall: 96%   F1-score: 95%   Accuracy: 93.47%   mAP: 99.16%
Table 3: Confusion Matrix for 2000 Iterations (n = 37)

                      Actual Positive   Actual Negative
Predicted Positive    TP = 26           FP = 3
Predicted Negative    FN = 1            TN = 7

Precision: 90%   Recall: 96%   F1-score: 93%   Accuracy: 89.19%   mAP: 99.49%
Using the accuracy formula in Equation 1, the accuracy of the algorithm's model is 93.47% for the evaluation at 1000 iterations, and 89.19% at 2000 iterations. This means not all images are correctly detected by the classifier, because the training dataset contains only 80 images; more images are needed to increase the accuracy of the algorithm.
Equation 1: Accuracy Formula

Accuracy = (TP + TN) / (TP + TN + FP + FN)
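As a check, the following short sketch recomputes the Table 2 metrics from its confusion matrix counts; it is only a worked example of the standard formulas, not code from the paper.

```python
# Recompute the metrics of Table 2 (1000 iterations) from its confusion matrix.
tp, fp, fn, tn = 26, 2, 1, 17

accuracy  = (tp + tn) / (tp + tn + fp + fn)                  # 43/46 -> 93.47%
precision = tp / (tp + fp)                                   # 26/28 -> ~93%
recall    = tp / (tp + fn)                                   # 26/27 -> ~96%
f1_score  = 2 * precision * recall / (precision + recall)    # ~0.95

print(f"accuracy={accuracy:.4f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1_score:.2f}")
```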
Table 4 below shows the performance of the training. The model was trained for 2000 iterations with the learning rate set to 0.001. A high learning rate enables the model to learn more quickly, whereas a lower learning rate makes training take much longer but may allow the model to converge to a better solution. The initial learning rate at 0 iterations is 0.000, and from 1000 iterations until 2000 iterations the learning rate is maintained at 0.001. The precision from 1000 to 2000 iterations is consistent, staying between 83% and 96%. The highest recall, 1.00, is reached at 1900 iterations, which means the model classified all positive samples correctly at that point. The F1-score is used to analyse binary classification methods that label samples as either "positive" or "negative"; the highest F1-score achieved during training is 0.96. Lastly, the mean average precision (mAP) is calculated by comparing the detected box to the ground-truth bounding box. The highest mAP obtained during training is 100.0%, which means the model detected the traffic lights accurately; the higher the mAP score, the more accurate the model's detection.
Table 5 below shows the performance of the testing. All four weights generated from the training, namely the 1000 weight, the 2000 weight, the best weight, and the final weight, are used to evaluate the model's performance; the final weight and the 2000 weight are the same since the number of training iterations is set to 2000. Comparing these four weights, the highest precision is 92%, obtained with the 2000 weight and the final weight. The highest recall, 0.59, is also obtained with the 2000 weight and the final weight, which shows that some of the samples are not classified correctly. For the F1-score, the highest value, 0.75, is obtained with the best weight. Lastly, the highest mean average precision is 74.28%, which indicates that the model is good enough to detect the traffic light.
Table 4: Performance of Training Dataset

Iteration   Initial Learning Rate   Precision   Recall   F1-score   Mean Average Precision
0 0.000 - - - -
1000 0.001 0.93 0.96 0.95 99.16%
1100 0.001 0.83 0.89 0.86 87.28%
1200 0.001 0.96 0.96 0.96 99.49%
1300 0.001 0.93 0.96 0.95 95.50%
1400 0.001 0.89 0.93 0.91 98.68%
1500 0.001 0.96 0.96 0.96 99.61%
1600 0.001 0.96 0.93 0.94 95.77%
1700 0.001 0.93 0.96 0.95 94.51%
1800 0.001 0.96 0.96 0.96 99.74%
1900 0.001 0.93 1.00 0.96 100.0%
2000 0.001 0.90 0.96 0.93 99.49%
Table 5: Performance of Testing Dataset
Weight (model) Precision Recall F1-score Mean Average Precision
1000.weight 0.85 0.54 0.66 63.23%
2000.weight 0.92 0.59 0.72 71.80%
best weight 0.87 0.65 0.75 74.28%
Final.weight 0.92 0.59 0.72 71.80%
The evaluation of the Convolutional Neural Network algorithm for traffic light detection includes the precision, recall, F1-score, and mean average precision of the model. From the training and testing of the CNN model, we can see that the model can detect the traffic light accurately, and the results of the evaluation determine whether the system prototype can correctly detect the traffic light colour. For the weight used in the system prototype, the best weight was chosen because it has the highest mean average precision (mAP), 74.28%. The mAP calculates a score by comparing the detected box to the ground-truth bounding box, so the model's detection is more accurate when the mAP score is higher; a sketch of this box comparison is given below.
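The box comparison behind mAP is usually based on the Intersection over Union (IoU) between a detected box and the ground-truth box; the paper does not detail its mAP computation, so the following is only a minimal IoU sketch under that assumption, with hypothetical box values.

```python
# Minimal Intersection-over-Union (IoU) sketch: the overlap measure commonly
# used when comparing a detected box with the ground-truth box for mAP.
# Boxes are (x_min, y_min, x_max, y_max); the example values are hypothetical.

def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle.
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    # Union = area A + area B - intersection.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((612, 105, 668, 240), (608, 110, 664, 238)))  # high overlap -> close to 1
```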
5. Conclusion
This project aims to develop a system that detects the traffic light colour from images using image processing techniques. The Convolutional Neural Network (CNN) algorithm was chosen because it achieves high accuracy and detects the traffic light colour in the images correctly. The system was developed to help colourblind individuals identify the traffic light colour. Because colourblind individuals are unable to differentiate colours, they struggle to recognise the traffic light colour; for example, they might keep driving when the traffic light is red, or stop at the wrong time, because they are confused about the colour, which could lead to more road accidents. The project also contributes to the automotive industry in the development of autonomous driving cars and driver assistance systems.
References
Smith, B. W. (2013, December 18). Human error as a cause of vehicle crashes. Center for Internet and Society. http://cyberlaw.stanford.edu/blog/2013/12/human-error-cause-vehicle-crashes
Gurucharan, M. (2020). Basic CNN architecture: Explaining 5 layers of convolutional neural network. upGrad Blog. https://www.upgrad.com/blog/basic-cnn-architecture/
Al-Nabulsi, J., Mesleh, A., & Yunis, A. (2017a). Traffic light detection for colorblind individuals. 2017 IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT), 1-6. https://doi.org/10.1109/AEECT.2017.8257737
Al-Nabulsi, J., Mesleh, A., & Yunis, A. (2017b). Traffic light detection for colorblind individuals. 2017 IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT), 1-6. https://doi.org/10.1109/AEECT.2017.8257737
Kulkarn, R., Dhavalikar, S., & Bangar, S. (2018). Traffic light detection and recognition for self driving cars using deep learning. IEEE Computer Society. https://doi.org/10.1109/TPAMI.2015.2437384
Sheikh, M. A. A., Kole, A., & Maity, T. (2017). Traffic sign detection and classification using colour feature and neural network. 2016 International Conference on Intelligent Control, Power and Instrumentation (ICICPI), 307-311. https://doi.org/10.1109/ICICPI.2016.7859723
Gupta, A., & Choudhary, A. (2019). A framework for traffic light detection and recognition using deep learning and Grassmann manifolds.
Ghennioui, H., el Kamili, M., & Berrada, I. (2019). Real-time traffic light detection and classification using deep learning.
Ouyang, Z., Niu, J., Liu, Y., & Guizani, M. (2020). Deep CNN-based real-time traffic light detector for self-driving vehicles. IEEE Transactions on Mobile Computing, 19(2), 300-313. https://doi.org/10.1109/TMC.2019.2892451
Yeh, T. W., Lin, H. Y., & Chang, C. C. (2021). Traffic light and arrow signal recognition based on a unified network. Applied Sciences (Switzerland), 11(17). https://doi.org/10.3390/app11178066