Academic year: 2023

Membagikan "View of AN ANALYTICAL RESEARCH BASED ON SPEED TRAFFIC SIGN DETECTION UTILIZING LIGHTWEIGHT NETWORK"

Copied!
5
0
0

Teks penuh

(1)

AN ANALYTICAL RESEARCH BASED ON SPEED TRAFFIC SIGN DETECTION UTILIZING LIGHTWEIGHT NETWORK

Amitesh Kumar Jha

Asst. Prof., CSIT Department, G.G.V. Bilaspur, C. G., India

Abstract - Vision-based traffic sign detection plays a critical role in intelligent transportation systems. Numerous deep learning-based methods for traffic sign detection have recently been proposed and have demonstrated superior performance to conventional methods. However, the performance of deep learning-based methods on small traffic signs is still limited, owing to the small size of traffic signs in traffic scene images and to challenging driving conditions. Additionally, current state-of-the-art approaches to traffic sign detection still suffer from slow inference. This paper proposes a deep learning-based strategy to boost the accuracy of small traffic sign detection in driving environments. First, a lightweight and efficient architecture is adopted as the base network to address the issue of inference speed.

1. INTRODUCTION

Intelligent transportation systems, such as advanced driver assistance systems and automated driving systems, rely heavily on vision-based traffic sign recognition. A traffic sign recognition system typically comprises two stages: traffic sign detection and traffic sign recognition.

While traffic sign recognition classifies each traffic sign into a corresponding class, traffic sign detection uses images captured by a camera to precisely locate traffic sign regions. A recognition stage takes as inputs traffic signs that have been detected by a detection stage. As a result, the system's overall accuracy is significantly impacted by the accuracy of traffic sign detection. Traffic sign detection has been the subject of numerous proposals. To detect traffic signs in an image, traditional methods typically rely on hand-crafted features like color, texture, edge, and other low-level features. Traditional traffic sign detection methods performed poorly in the driving environment because of the variety of traffic sign appearances, obstruction by other objects, and effects of lighting conditions.

Many deep learning-based approaches for traffic sign detection have recently been proposed and have demonstrated superior performance in comparison to conventional methods. In these approaches, traffic sign candidates are first generated, and classifiers are then used to separate traffic signs from the background. Although deep learning-based techniques perform well in many driving environments, their performance remains limited under the following challenging conditions:

(i) In traffic scene images, traffic signs are quite small, making small traffic sign detection much more difficult than large traffic sign detection. Most recent traffic sign detection techniques focus on large traffic signs; as a result, they perform poorly when detecting small traffic signs.

(ii) In driving environments, automobiles are unlikely to be outfitted with high-end hardware components, so the inference speed of the traffic sign detection method is a major concern. Most recent traffic sign detection techniques are, however, implemented on high-end systems. It is therefore necessary to develop a framework that can detect traffic signs quickly in driving environments.
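The scale problem in (i) can be made concrete with simple arithmetic: after the repeated downsampling of a CNN backbone, a small sign occupies only a handful of feature-map cells. The image and sign sizes below are illustrative assumptions, not measurements from this paper.

```python
# Illustrative arithmetic: footprint of a traffic sign on CNN feature maps.
# The 32-pixel sign size and the strides below are assumptions chosen for
# illustration, not values taken from the paper.

def feature_footprint(sign_px: int, stride: int) -> float:
    """Side length, in feature-map cells, that a sign of sign_px pixels
    occupies after the image is downsampled by `stride`."""
    return sign_px / stride

# A 32x32-pixel sign shrinks rapidly as the network gets deeper:
for stride in (4, 8, 16, 32):
    cells = feature_footprint(32, stride)
    print(f"stride {stride:2d}: sign covers {cells:g} x {cells:g} cells")
```

At a stride of 32 the sign covers a single cell, which is why detectors that rely only on the deepest feature map struggle with small signs.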


2. RELATED WORK

2.1. Traffic Sign Detection

Vision-based traffic sign detection methods fall into two categories: traditional methods and deep learning-based methods. To detect traffic signs, traditional methods typically rely on hand-crafted characteristics such as color and shape.

Bahlmann et al. [2] suggested classifying traffic signs with Bayesian generative modeling and a set of Haar wavelet features obtained from AdaBoost training. Salti et al. [3] suggested using interest region extraction instead of sliding-window detection.

Additionally, the SVM classifier is utilized for classification, with the histogram of oriented gradients computed in the regions of interest serving as the input feature. In [4], the authors detected traffic signs by employing an RGB-based color thresholding method and a circle detection algorithm. A support vector machine classification framework employs a collection of features, such as histograms of oriented gradients, local binary patterns, and Gabor features, for traffic sign recognition. Timofte et al. [5] suggested combining 2D and 3D methods to enhance traffic sign detection and recognition outcomes. A method for localizing traffic sign candidates was suggested in [6]: in an iterative optimization strategy, color and shape priors are used to precisely classify traffic signs as foreground objects. In [7], the authors suggested a method with three stages of operation: preprocessing, detection, and recognition. The proposed system demonstrated that combining a support vector machine classifier with RGB color segmentation and shape matching produces promising results. In many challenging driving conditions, however, hand-crafted features such as histograms of oriented gradients and typical color or geometric shape cues frequently fail. As a result, traditional approaches to detecting traffic signs performed poorly.
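The RGB colour-thresholding idea behind such traditional detectors can be sketched in a few lines of NumPy. The dominance ratio and minimum-red threshold below are illustrative assumptions, not values taken from any of the cited papers.

```python
import numpy as np

# Minimal sketch of RGB colour thresholding for red-bordered signs: flag
# pixels whose red channel clearly dominates the other two. The thresholds
# (dominance=1.5, min_red=80) are assumptions chosen for illustration.

def red_mask(img: np.ndarray, dominance: float = 1.5, min_red: int = 80) -> np.ndarray:
    """Boolean mask of pixels likely belonging to a red sign border.

    img: H x W x 3 uint8 RGB image.
    """
    r = img[..., 0].astype(np.int32)
    g = img[..., 1].astype(np.int32)
    b = img[..., 2].astype(np.int32)
    return (r >= min_red) & (r > dominance * g) & (r > dominance * b)

# Toy 2x2 image: a bright red pixel, a grey pixel, a dark red pixel,
# and a green pixel. Only the two reddish pixels are flagged.
img = np.array([[[200, 40, 30], [120, 120, 120]],
                [[90, 20, 10], [30, 200, 40]]], dtype=np.uint8)
print(red_mask(img))
```

A real pipeline would follow this with connected-component extraction and a shape check (e.g. circle detection), which is exactly where lighting changes and occlusion cause the failures described above.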

Many approaches to traffic sign detection based on deep convolutional neural networks (CNNs) have recently been proposed and have demonstrated performance superior to that of more conventional approaches. Wu et al. [10] suggested using support vector machines to first convert the original image into a grayscale image; a convolutional neural network with fixed and learnable layers was then utilized to detect and recognize traffic signs. In [11], the authors proposed a novel framework with two deep learning components: fully convolutional network-guided traffic sign proposals and a deep CNN for object classification. Zhu et al. [9] proposed a fully convolutional network that simultaneously performs detection and classification. Liu et al. [8] proposed the multi-scale region-based convolutional neural network (MR-CNN), which constructs a fused feature map by concatenating the features of the shallow layers with those of the deeper convolution layers through a multi-scale deconvolution operation.

In [12], a detector for traffic signs in driving environments based on Faster R-CNN and the MobileNet structure was designed and implemented; additionally, color and shape information was used to improve the localization of small traffic signs. Yang et al. [13] created a novel detection module built on a color probability model and a color histogram of oriented gradients for traffic sign proposal extraction and classification; the detected signs are then further categorized into their subclasses within each superclass using a convolutional neural network. Li et al. [14] proposed a three-stage model: channel-wise coarse feature extraction is first adopted to generate coarse feature maps, channel-wise hierarchical feature refinement is then used to refine hierarchical features, and hierarchical feature map fusion is applied at the final stage to fuse the hierarchical feature maps and produce the final traffic sign saliency map. Guided by the color feature of traffic signs, the authors of [18] introduced an attention network into Faster R-CNN to locate potential regions of interest and roughly classify them into three categories. The final region proposals are then generated by a fine region proposal network using a set of anchors at each feature map location.

3. PROPOSED FRAMEWORK

The proposed method's overall structure is depicted in Figure 1. Convolution feature maps are first generated by the ESPNetv2-based base network. A deconvolution module combines the output feature map at layer 2 with the output feature map at layer 3 to produce an enhanced feature map, which improves the proposed framework's ability to detect small traffic signs. The enhanced convolution feature map and the highest-level convolution feature map are then used to generate proposals via two improved region proposal networks. In each improved region proposal network, a 1×1 convolution layer reduces the number of parameters in the subsequent convolutional layers, and a 3×3 dilated convolution expands the receptive field, enhancing the detection accuracy and inference speed of the proposal generation stage. In the detection network, a region-of-interest pooling layer converts proposals into fixed-size feature maps, and fully connected layers classify the proposals and regress their bounding boxes. The proposed method is described in detail in the following sections.
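The effect of the two components of the improved region proposal network can be illustrated with simple parameter and receptive-field arithmetic. The channel widths below are assumptions for illustration, not the framework's actual configuration.

```python
# Back-of-the-envelope arithmetic for the improved RPN head. The channel
# widths (256 in, 64 after reduction) are illustrative assumptions only.

def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a k x k convolution layer (biases ignored)."""
    return c_in * c_out * k * k

def dilated_receptive_field(k: int, d: int) -> int:
    """Effective kernel extent of a k x k convolution with dilation d."""
    return d * (k - 1) + 1

c_in, c_mid = 256, 64  # assumed channel widths

# (a) A 1x1 reduction before the 3x3 conv cuts parameters sharply:
direct = conv_params(c_in, c_in, 3)                                   # plain 3x3
reduced = conv_params(c_in, c_mid, 1) + conv_params(c_mid, c_mid, 3)  # 1x1 then 3x3
print(f"3x3 direct: {direct:,} weights; 1x1 + 3x3: {reduced:,} weights")

# (b) Dilation widens the receptive field at no extra weight cost:
print("3x3, dilation 1 ->", dilated_receptive_field(3, 1), "pixels wide")
print("3x3, dilation 2 ->", dilated_receptive_field(3, 2), "pixels wide")
```

Under these assumed widths the reduced head uses roughly a tenth of the weights, while a dilation of 2 lets the same nine weights see a 5×5 neighbourhood, which is the trade-off the improved RPN exploits.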

3.1. The Base Network

The accuracy of traffic sign detection is the primary focus of most deep learning-based methods. In the driving environment, however, inference speed is a major concern in addition to detection accuracy. Moreover, automobiles are unlikely to be equipped with graphics cards as powerful as those used in research environments. Consequently, the construction of a

Figure 1 The overall framework of the proposed approach.


Table 1 Performance comparison of different efficient networks on the ImageNet validation set.

Network # parameters FLOPs Top-1 accuracy

MobileNetv1 3.20M 375M 65.5

MobileNetv2 4.50M 400M 83.5

ShuffleNetv1 6.40M 396M 75.3

ESPNetv2 6.56M 288M 77.3

faster network for traffic sign detection in driving environments is necessary. Liu et al. showed that the base convolution layers consume about 80% of the forward time; employing a faster base network therefore significantly improves the framework's inference speed. The ESPNetv2 network [16] is a fast and effective network that learns representations from a large effective receptive field by employing depth-wise dilated separable convolutions rather than depth-wise separable convolutions. Table 1 presents a performance comparison between ESPNetv2 and modern efficient networks on the ImageNet 1000-way classification dataset. As shown in Table 1, ESPNetv2 achieves the best performance under the smallest computational budget (284 million FLOPs). In addition, ESPNetv2 is more accurate than ShuffleNetv1 and MobileNetv2 at a comparable number of parameters. For quick and effective traffic sign detection, this paper therefore adopts the ESPNetv2 architecture as its base network.
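The efficiency of ESPNetv2-style networks comes largely from factorizing convolutions. A rough multiply-accumulate (MAC) count shows why a depthwise separable convolution is much cheaper than a standard one; the layer dimensions below are illustrative assumptions, not layers from ESPNetv2 itself.

```python
# MAC counts for one convolution layer, standard vs. depthwise separable.
# The feature-map size and channel counts are illustrative assumptions.

def standard_conv_macs(h: int, w: int, c_in: int, c_out: int, k: int) -> int:
    """Multiply-accumulates of a standard k x k convolution."""
    return h * w * c_in * c_out * k * k

def depthwise_separable_macs(h: int, w: int, c_in: int, c_out: int, k: int) -> int:
    """MACs of a depthwise k x k conv followed by a 1x1 pointwise conv."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1x1 conv mixes channels
    return depthwise + pointwise

h = w = 56
c_in = c_out = 128
std = standard_conv_macs(h, w, c_in, c_out, 3)
sep = depthwise_separable_macs(h, w, c_in, c_out, 3)
print(f"standard: {std:,} MACs; separable: {sep:,} MACs; ratio {std / sep:.1f}x")
```

Dilating the depthwise filter, as ESPNetv2 does, enlarges the receptive field without adding any MACs to this count, which is how the network keeps its FLOPs budget low.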

4. CONCLUSIONS

A deep learning-based framework for fast and effective traffic sign detection is proposed in this paper. The ESPNetv2 network is used as the base network in the proposed framework to speed up inference. An enhanced feature map with greater representation capacity is created using a deconvolution module. An improved region proposal network is also designed to improve the performance of the proposal generation stage, incorporating a 1×1 convolution layer to reduce the number of parameters in subsequent convolutional layers and a 3×3 dilated convolution to expand the receptive field. The GTSDB dataset and the TT-100K dataset, two widely used datasets, are used in the experiments to evaluate the effectiveness of each enhanced module and of the framework as a whole. The experimental results on these datasets demonstrate that the proposed framework meets the real-time requirements of an advanced driver assistance system and improves traffic sign detection performance in challenging driving conditions.

REFERENCES

1. C. Liu, S. Li, F. Chang, and Y. Wang, “Machine vision based traffic sign detection methods: review, analyses and perspectives,” IEEE Access, vol. 7, pp. 86578–86596, 2019.

2. C. Bahlmann, Y. Zhu, V. Ramesh, M. Pellkofer, and T. Koehler, “A system for traffic sign detection, tracking, and recognition using color, shape, and motion information,” in IEEE Intelligent Vehicles Symposium, pp. 255–260, Las Vegas, NV, USA, 2005.

3. S. Salti, A. Petrelli, F. Tombari, N. Fioraio, and L. Di Stefano, “Traffic sign detection via interest region extraction,” Pattern Recognition, vol. 48, no. 4, pp. 1039–1049, 2015.

4. S. K. Berkaya, H. Gunduz, O. Ozsen, C. Akinlar, and S. Gunal, “On circular traffic sign detection and recognition,” Expert Systems with Applications, vol. 48, pp. 67–75, 2016.

5. R. Timofte, K. Zimmermann, and L. V. Gool, “Multi-view traffic sign detection, recognition, and 3D localisation,” in 2009 Workshop on Applications of Computer Vision (WACV), pp. 1–8, Snowbird, UT, USA, December 2009.

6. Z. Zhu, J. Lu, R. R. Martin, and S. Hu, “An optimization approach for localization refinement of candidate traffic signs,” IEEE Transactions on Intelligent Transportation Systems, vol. 18, no. 11, pp. 3006–3016, 2017.

7. S. B. Wali, M. A. Hannan, A. Hussain, and S. A. Samad, “An automatic traffic sign detection and recognition system based on colour segmentation, shape matching, and SVM,” Mathematical Problems in Engineering, vol. 2015, Article ID 250461, 11 pages, 2015.

8. Z. Liu, J. Du, F. Tian, and J. Wen, “MR-CNN: a multi-scale region-based convolutional neural network for small traffic sign recognition,” IEEE Access, vol. 7, pp. 57120–57128, 2019.

9. Z. Zhu, D. Liang, S. Zhang, X. Huang, B. Li, and S. Hu, “Traffic-sign detection and classification in the wild,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2110–2118, Las Vegas, NV, June 2016.

10. Y. Wu, Y. Liu, J. Li, H. Liu, and X. Hu, “Traffic sign detection based on convolutional neural networks,” in The 2013 International Joint Conference on Neural Networks (IJCNN), pp. 1–7, Dallas, TX, 2013.

11. Y. Zhu, C. Zhang, D. Zhou, X. Wang, X. Bai, and W. Liu, “Traffic sign detection and recognition using fully convolutional network guided proposals,” Neurocomputing, vol. 214, pp. 758–766, 2016.

12. J. Li and Z. Wang, “Real-time traffic sign recognition based on efficient CNNs in the wild,” IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 3, pp. 975–984, 2019.

13. Y. Yang, H. Luo, H. Xu, and F. Wu, “Towards real-time traffic sign detection and classification,” IEEE Transactions on Intelligent Transportation Systems, vol. 17, no. 7, pp. 2022–2031, 2016.

14. C. Li, Z. Chen, Q. M. J. Wu, and C. Liu, “Deep saliency with channel-wise hierarchical feature responses for traffic sign detection,” IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 7, pp. 2497–2509, 2019.

15. T. Yang, X. Long, A. K. Sangaiah, Z. Zheng, and C. Tong, “Deep detection network for real-life traffic sign in vehicular networks,” Computer Networks, vol. 136, pp. 95–104, 2018.

16. S. Mehta, M. Rastegari, L. Shapiro, and H. Hajishirzi, “ESPNetv2: a light-weight, power efficient, and general purpose convolutional neural network,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9190–9200, Long Beach, CA, USA, June 2019.

17. S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: towards real-time object detection with region proposal networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, 2017.

18. J. Zhang, Z. Xie, J. Sun, X. Zou, and J. Wang, “A cascaded R-CNN with multiscale attention and imbalanced samples for traffic sign detection,” IEEE Access, vol. 8, pp. 29742–29754, 2020.

19. J. Zhang, W. Wang, C. Lu, J. Wang, and A. K. Sangaiah, “Lightweight deep network for traffic sign classification,” Annals of Telecommunications, 2019.

20. H. Zhang, K. Wang, Y. Tian, C. Gou, and F.-Y. Wang, “MFR-CNN: incorporating multi-scale features and global information for traffic object detection,” IEEE Transactions on Vehicular Technology, vol. 67, no. 9, pp. 8019–8030, 2018.
