
Automated Dhaka City Vehicle Detection for Traffic Flow Analysis Using Deep Learning


Academic year: 2023


This thesis, titled "Automated Dhaka City Vehicle Detection for Traffic Flow Analysis Using Deep Learning," submitted by Minhajul Islam, ID, Department of Software Engineering, Daffodil International University, has been accepted as satisfactory for the partial fulfillment of the requirements for the degree of B.Sc.

Firstly, I express my gratitude to Almighty Allah, who has given me the opportunity to complete this dissertation. I also express my sincere gratitude to the entire Software Engineering Department of Daffodil International University for providing me with a good education and knowledge.

The knowledge I gained from the classes in our bachelor's degree in software engineering was essential for this thesis. The width and height of the anchor (prior) box are denoted Pw and Ph. If an object falls in a grid cell whose top-left corner is at (Cx, Cy), the network predicts the coordinates (tx, ty, tw, th) for that grid cell.

The width and height of the predicted bounding boxes (bw, bh) are obtained by applying an exponential function to the predicted offsets and scaling the anchor dimensions, as in the YOLOv5s backbone architecture with all of its components. During training, the image data is fed into the YOLOv5s model after data augmentation and resizing. There are many ways to keep traffic jams from spreading, and one of the most effective is to detect vehicles.

The present study summarizes state-of-the-art vehicle detection techniques, classified into motion-based and appearance-based approaches, starting with frame differencing and background subtraction and continuing with feature extraction and more complex models in a comparative analysis.

INTRODUCTION

  • Background
  • Research Questions
  • Research Objectives
  • Thesis Organization

These versions use state-of-the-art object detection methods with increased accuracy and acceptance. How can good accuracy be achieved for vehicle image detection tasks when the dataset is highly annotated? How can high accuracy be achieved for vehicle image segmentation tasks with limited computing resources?

My research goal was to develop a model that can detect vehicles in a set of PoribohonBD vehicle images and, if a vehicle is present, indicate the location of the detected vehicle. Finally, the vehicle is detected, and both the type of vehicle and the accuracy with which it is detected are reported.

LITERATURE REVIEW

Data Set & Data Processing

Accordingly, the dataset images are divided into three groups, namely i) train, ii) test, and iii) validation. Of the 9,058 images in the PoribohonBD dataset, 70% are used for training, 20% for testing, and 10% for validation.
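A minimal sketch of such a 70/20/10 split, assuming the PoribohonBD images have been extracted into a single `images/` directory; the directory layout and random seed are assumptions for illustration, not taken from the thesis:

```python
import random, shutil
from pathlib import Path

random.seed(42)                                   # assumed seed for reproducibility
images = sorted(Path("images").glob("*.jpg"))     # assumed source directory
random.shuffle(images)

n = len(images)
splits = {
    "train": images[: int(0.7 * n)],              # 70% for training
    "test":  images[int(0.7 * n): int(0.9 * n)],  # 20% for testing
    "val":   images[int(0.9 * n):],               # 10% for validation
}

for name, files in splits.items():
    out = Path(name)
    out.mkdir(exist_ok=True)
    for f in files:
        shutil.copy(f, out / f.name)              # copy each image into its split folder
```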

METHODOLOGY

The detection principle of YOLOv5s

If an object falls in a grid cell whose top-left corner is at (cx, cy), and the prior (anchor) box has height and width (ph, pw), the corresponding projection is shown in Fig. 5. The width and height of the predicted bounding boxes (bw, bh) are obtained by applying an exponential function to the predicted offsets. If a prior bounding box overlaps a ground-truth box more than the other prior boxes do, its confidence score should be 1.
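A common formulation of this decoding step (the YOLOv3-style equations that the exponential description above refers to; σ denotes the sigmoid function):

b_x = \sigma(t_x) + c_x
b_y = \sigma(t_y) + c_y
b_w = p_w \, e^{t_w}
b_h = p_h \, e^{t_h}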

At this stage there may be bounding boxes that overlap the ground truth but are not the best box. After the bounding boxes are predicted, each box predicts classes using multi-label classification, and Non-Maximum Suppression (NMS) is applied to remove redundant predictions and keep the best match in the final detection.
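A minimal sketch of greedy NMS over boxes in (x1, y1, x2, y2) format; the IoU threshold of 0.45 is an assumed default, not a value reported in the thesis:

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, iou_thr=0.45):
    """Keep the highest-scoring box, drop boxes that overlap it too much, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thr]
    return keep
```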

Network Architecture of YOLOv5s

  • Backbone
  • Head

Residual units (Res units), built from CBL as their basic component, are the primary element used to deepen the network architecture; they realize the direct superimposition of tensors to add layers. In the Neck of YOLOv5s, the CSP2 structure designed following CSPNet is used to strengthen the network's feature-fusion capability [35].
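For illustration, a minimal PyTorch sketch of a CBL block (Conv + BatchNorm + activation) and a residual unit built from two of them; the channel counts, kernel sizes, and LeakyReLU slope are assumptions for the example, not parameters reported in the thesis:

```python
import torch.nn as nn

class CBL(nn.Module):
    """Conv2d + BatchNorm + LeakyReLU: the basic building block referred to above."""
    def __init__(self, in_ch, out_ch, k=3, s=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, s, padding=k // 2, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.LeakyReLU(0.1, inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class ResUnit(nn.Module):
    """Two stacked CBL blocks with a skip connection, used to deepen the backbone."""
    def __init__(self, ch):
        super().__init__()
        self.block = nn.Sequential(CBL(ch, ch, k=1), CBL(ch, ch, k=3))

    def forward(self, x):
        return x + self.block(x)   # direct superimposition (addition) of tensors
```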

The Head is the final component of the YOLOv5s network design and is also known as the predictor. Using the features produced by the Neck, the Head estimates classes and bounding boxes at three scales relative to the input image size (large, medium, and small objects). To be treated as a small object, the target rectangular area should be less than 32 × 32 pixels.

In the YOLOv5s network architecture, the bounding box regression loss function is based on Intersection over Union (IoU). Here, Bgt represents the ground-truth box and B represents the predicted bounding box. In this study, B ∩ Bgt denotes the intersection of B and Bgt, and B ∪ Bgt denotes their union, which is clearly visible in Fig. 9.

In non-overlapping cases, the penalty term still moves the predicted bounding box toward the target box. GIoU alleviates the vanishing-gradient problem of plain IoU in non-overlapping cases, although it still has several limitations [37]. In the YOLOv5s model, the predicted bounding box information is obtained from the anchor boxes.
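The standard definitions underlying this discussion, with C the smallest box enclosing both B and Bgt (a standard formulation, not reproduced verbatim from the thesis):

IoU = \frac{|B \cap B^{gt}|}{|B \cup B^{gt}|}
GIoU = IoU - \frac{|C \setminus (B \cup B^{gt})|}{|C|}
\mathcal{L}_{GIoU} = 1 - GIoU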

Then, in each training epoch, the loss between the predicted bounding boxes and the ground truth is calculated. At inference time, the YOLOv5s model first produces a set of candidate bounding boxes, and the final detection results are obtained by Non-Maximum Suppression (NMS), or an alternative to it, which removes irrelevant detections and keeps the best match.

Fig. 9. IoU regression errors and GIoU losses are highlighted. (a) Bgt is the ground truth and B is the predicted bounding box. (b) B ∩ Bgt.

MODEL TRAINING

Training Setting

RESULT AND DISCUSSION

Model evaluation metrics

  • Precision and recall rates (see the definitions after this list)
  • Frames per second and inference time
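
For reference, the standard definitions of these metrics in terms of true positives (TP), false positives (FP), and false negatives (FN), with inference time measured per frame (standard formulations, not values taken from the thesis):

Precision = \frac{TP}{TP + FP}
Recall = \frac{TP}{TP + FN}
FPS = \frac{1}{\text{inference time per frame}}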

Training results and analysis

The training of the YOLOv5s model is shown in Fig. 12, and the different backgrounds are considered as a variable in this study. The PR curve shows that the model achieves an overall score of 0.794 across all classes for vehicle detection. The loss curves of the training model show the box loss, objectness loss, and class loss.

The loss functions penalize mispredictions of box coordinates, objectness, and class so that the model learns to specify the correct ones. The box_loss of the training model reflects how accurately the bounding box regression localizes the vehicle.
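The total training loss combines these three terms; a typical weighted form (the gain coefficients λ are illustrative, not the exact hyperparameters used in this thesis):

\mathcal{L} = \lambda_{box}\,\mathcal{L}_{box} + \lambda_{obj}\,\mathcal{L}_{obj} + \lambda_{cls}\,\mathcal{L}_{cls}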

Detecting result and discussion

According to the results on the image detection test set, YOLOv5s performs better on large objects. Using YOLOv5s, higher performance is obtained, the inference time is short, and the FPS is very high.
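A minimal sketch of how inference time and FPS can be measured for a detector; `model` and `frames` are assumed placeholders (a loaded detector callable and a list of test images), not objects defined in the thesis:

```python
import time

def measure_speed(model, frames):
    """Return (average inference time in seconds per frame, frames per second)."""
    start = time.perf_counter()
    for frame in frames:
        model(frame)                      # one forward pass per frame
    elapsed = time.perf_counter() - start
    avg_time = elapsed / len(frames)      # seconds per frame
    return avg_time, 1.0 / avg_time       # inference time, FPS
```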

CONCLUSION

Song, "Vision-based vehicle detection and counting system using deep learning in highway scenes.", European Transport Research Review, p. Lefèvre, "Segment-before-detect: Vehicle detection and classification through semantic segmentation of aerial images," Remote Sens , vol. Wei, “Missing data reconstruction in a remote sensing image with a uniform spatio-temporal-spectral deep convolutional neural network,” IEEE Trans.

Darrell, "Fully convolutional networks for semantic segmentation," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 8-10 June, p. Nielsen, "The regularized iteratively reweighted mad method for change detection in multi- and hyperspectral data," IEEE Transactions on Image processing, no. Li, "Detection of hidden cracks from ground-penetrating radar images based on a deep learning algorithm.", Construction and Building Materials, p.

Lin, "Focal loss for dense object detection.," In Proceedings of the IEEE international conference on computer vision, pp. Duan, " Centernet: Keypoint triplets for object detection.," In Proceedings of the IEEE/CVF International Conference on Computer Vision. , pp Chun D., "A study for selecting the best one-stage detector for autonomous driving," Proceedings of the International Technical Conference on Circuits / Systems, Computers and Communications (ITC-CSCC); JeJu, Korea, pp.

Zhang, “Real-time detection method for small traffic signs based on yolov3,” IEEE Access, p. (ISITIA), p.
